amber.modeler

Rationale

The amber.modeler module provides classes and interfaces for converting an architecture (usually a list of tokens/strings) into a model. For simple, sequential models, the out-of-the-box tf.keras.Sequential is sufficient. However, additional classes are needed for advanced architectures, such as converting an ENAS super-net into sub-nets.

At a high level, we first need an analog of tf.keras.Sequential that returns a model object when called; in AMBER, this is amber.modeler.ModelBuilder and its subclasses. To wrap around different implementations of neural networks (e.g., a sequential Keras model vs. an ENAS sub-net implemented in TensorFlow), ModelBuilder takes amber.architect.ModelSpace as the unifying reference for model architectures, so that different implementation frameworks (TensorFlow, Keras, PyTorch) look the same to the search algorithms in amber.architect, easing their burden.
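
For instance, the intended flow is to define a model space, instantiate a builder against it, and then call the builder with a list of architecture tokens to obtain a compiled model. Below is a minimal sketch; the Operation arguments, the add_layer calls, and the builder call convention are illustrative assumptions, not verbatim API:

    from amber.architect import ModelSpace, Operation
    from amber.modeler import KerasModelBuilder

    # a two-layer model space; each layer holds candidate operations
    model_space = ModelSpace()
    model_space.add_layer(0, [Operation('conv1d', filters=16, kernel_size=4),
                              Operation('conv1d', filters=32, kernel_size=8)])
    model_space.add_layer(1, [Operation('dense', units=16),
                              Operation('dense', units=32)])

    builder = KerasModelBuilder(
        inputs_op=Operation('input', shape=(100, 4)),
        output_op=Operation('dense', units=1, activation='sigmoid'),
        model_compile_dict={'optimizer': 'adam', 'loss': 'binary_crossentropy'},
        model_space=model_space,
    )
    model = builder([0, 1])  # one token per layer indexes into the model space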

Moving one level further, we need an analog of tf.keras.Model to facilitate training and evaluation as class methods. This is implemented in amber.modeler.child.

Under the hood of the child models, the corresponding tensor operations and computation graphs are constructed in the amber.modeler.dag module. Currently, AMBER builds the ENAS sub-graphs, as well as branching and multi-input/output Keras models. Next steps include the construction of PyTorch computation graphs.

Model Builders

enasModeler

class DAGModelBuilder(inputs_op, output_op, model_space, model_compile_dict, num_layers=None, with_skip_connection=True, with_input_blocks=True, dag_func=None, *args, **kwargs)[source]

Bases: amber.modeler.enasModeler.ModelBuilder

class EnasAnnModelBuilder(session=None, controller=None, dag_func='EnasAnnDAG', l1_reg=0.0, l2_reg=0.0, with_output_blocks=False, use_node_dag=True, feature_model=None, dag_kwargs=None, *args, **kwargs)[source]

Bases: amber.modeler.enasModeler.DAGModelBuilder

This class builds a feed-forward neural network (FFNN).

It uses the TensorFlow low-level API to define one large graph, in which each child network architecture is a sub-graph of this big DAG.

Parameters
  • session (tf.Session) – tensorflow session for building enas DAG

  • controller (amber.architect.MultiIOController) – controller instance

  • dag_func (str) – string name for DAG to use

  • l1_reg (float) – regularizer strength for L1

  • l2_reg (float) – regularizer strength for L2

  • with_output_blocks (bool) – if True, add another architecture representation vector to connect intermediate layers to output blocks.

  • use_node_dag (bool) – if True, use another amber.modeler.InputBlockDAG to represent the computation graph

  • feature_model (tf.keras.Model, or None) – If specified, use the provided upstream model for pre-transformations of inputs, instead of taking the raw input features.

  • dag_kwargs (dict, or None) – keyword arguments passed to the DAG constructor

set_controller(controller)[source]
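
Example: a minimal sketch of building ENAS sub-nets for FFNNs. It assumes a controller, a model space, and input/output Operations have already been set up with amber.architect, and that the builder is called with a sampled architecture sequence:

    import tensorflow as tf
    from amber.modeler import EnasAnnModelBuilder

    session = tf.Session()  # AMBER uses the TF1-style session API here
    builder = EnasAnnModelBuilder(
        session=session,
        controller=controller,  # an amber.architect.MultiIOController
        inputs_op=[input_op],
        output_op=[output_op],
        model_space=model_space,
        model_compile_dict={'optimizer': 'adam', 'loss': 'mse'},
        l1_reg=1e-4,
    )
    # each call materializes the sub-net for one sampled architecture
    child_model = builder(arc_seq)
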
class EnasCnnModelBuilder(session=None, controller=None, dag_func='EnasConv1DDAG', l1_reg=0.0, l2_reg=0.0, batch_size=None, dag_kwargs=None, *args, **kwargs)[source]

Bases: amber.modeler.enasModeler.DAGModelBuilder

set_controller(controller)[source]
class ModelBuilder(inputs, outputs, *args, **kwargs)[source]

Bases: object

Scaffold of Model Builder

kerasModeler

class KerasBranchModelBuilder(inputs_op, output_op, model_compile_dict, model_space=None, with_bn=False, **kwargs)[source]

Bases: amber.modeler.enasModeler.ModelBuilder

class KerasModelBuilder(inputs_op, output_op, model_compile_dict, model_space=None, gpus=None, **kwargs)[source]

Bases: amber.modeler.enasModeler.ModelBuilder

class KerasMultiIOModelBuilder(inputs_op, output_op, model_compile_dict, model_space, with_input_blocks, with_output_blocks, dropout_rate=0.2, wsf=1, **kwargs)[source]

Bases: amber.modeler.enasModeler.ModelBuilder

Note:

Still not working if num_outputs=0
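
Example: a sketch with two input blocks and one output block; the shapes, Operation arguments, and call convention are hypothetical:

    from amber.modeler import KerasMultiIOModelBuilder

    builder = KerasMultiIOModelBuilder(
        inputs_op=[Operation('input', shape=(50,), name='X1'),
                   Operation('input', shape=(50,), name='X2')],
        output_op=[Operation('dense', units=1, activation='sigmoid')],
        model_compile_dict={'optimizer': 'adam', 'loss': 'binary_crossentropy'},
        model_space=model_space,
        with_input_blocks=True,
        with_output_blocks=True,
    )
    model = builder(arc_seq)  # arc_seq encodes ops, input blocks, and skips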

class KerasResidualCnnBuilder(inputs_op, output_op, fc_units, flatten_mode, model_compile_dict, model_space, dropout_rate=0.2, wsf=1, add_conv1_under_pool=True, verbose=1, **kwargs)[source]

Bases: amber.modeler.enasModeler.ModelBuilder

Class for converting a sequence of architecture tokens into a Keras model

Parameters
  • inputs_op (amber.architect.modelSpace.Operation)

  • output_op (amber.architect.modelSpace.Operation)

  • fc_units (int) – number of units in the fully-connected layer

  • flatten_mode ({‘GAP’, ‘Flatten’}) – the flatten mode to convert conv layers to fully-connected layers.

  • model_compile_dict (dict)

  • model_space (amber.architect.modelSpace.ModelSpace)

  • dropout_rate (float) – dropout rate; must satisfy 0 < dropout_rate < 1

  • wsf (int) – width scale factor

static factorized_reduction_layer(inp, out_filter, name, reduction_factor=4)[source]
static get_out_filters(model_space)[source]
static res_layer(layer, width_scale_factor, inputs, l2_reg=5e-07, name='layer', add_conv1_under_pool=True)[source]
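
Example: a sketch of building a residual CNN from tokens sampled by an amber.architect controller; the Operation arguments are hypothetical:

    from amber.modeler import KerasResidualCnnBuilder

    builder = KerasResidualCnnBuilder(
        inputs_op=Operation('input', shape=(1000, 4)),
        output_op=Operation('dense', units=1, activation='sigmoid'),
        fc_units=32,
        flatten_mode='GAP',
        model_compile_dict={'optimizer': 'adam', 'loss': 'binary_crossentropy'},
        model_space=model_space,
        wsf=2,  # double the width of every conv layer
    )
    model = builder(arc_seq)  # a compiled residual-CNN Keras model
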
build_multi_gpu_sequential_model(model_states, input_state, output_state, model_compile_dict, gpus=4, **kwargs)[source]
build_multi_gpu_sequential_model_from_string(model_states_str, input_state, output_state, state_space, model_compile_dict)[source]

build a sequential model from a string of states

build_sequential_model(model_states, input_state, output_state, model_compile_dict, **kwargs)[source]
Parameters
  • model_states – a list of operations sampled from the operator space

  • input_state – specifies the input tensor

  • output_state – specifies the output tensor, e.g. Dense(1, activation=’sigmoid’)

  • model_compile_dict – a dict of loss, optimizer and metrics

Return type

Keras.Model
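
Example: a sketch, assuming x_train and y_train are numpy arrays and Operation is imported from amber.architect:

    from amber.modeler.kerasModeler import build_sequential_model

    model = build_sequential_model(
        model_states=[Operation('dense', units=32, activation='relu'),
                      Operation('dense', units=16, activation='relu')],
        input_state=Operation('input', shape=(100,)),
        output_state=Operation('dense', units=1, activation='sigmoid'),
        model_compile_dict={'loss': 'binary_crossentropy', 'optimizer': 'adam'},
    )
    model.fit(x_train, y_train, epochs=10)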

build_sequential_model_from_string(model_states_str, input_state, output_state, state_space, model_compile_dict)[source]

build a sequential model from a string of states

Child Models: Training Interface

Child model classes wrap the Keras Model API to allow more complex manipulations of child networks

class DenseAddOutputChild(nodes=None, block_loss_mapping=None, *args, **kwargs)[source]

Bases: amber.modeler.child.GeneralChild

compile(*args, **kwargs)[source]
evaluate(x, y, final_only=True, *args, **kwargs)[source]
fit(x, y, *args, **kwargs)[source]
predict(x, final_only=True, *args, **kwargs)[source]
class EnasAnnModel(inputs, outputs, arc_seq, dag, session, dropouts=None, name='EnasModel')[source]

Bases: object

compile(optimizer, loss=None, metrics=None, loss_weights=None)[source]
evaluate(*args, **kwargs)[source]
evaluate_ph(x, y, batch_size=None, verbose=0)[source]
evaluate_pipe(x, y, batch_size=None, verbose=0)[source]
fit(x, y, batch_size=None, nsteps=None, epochs=1, verbose=1, callbacks=None, validation_data=None)[source]
fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True)[source]
fit_ph(x, y, batch_size=None, nsteps=None, epochs=1, verbose=1, callbacks=None, validation_data=None)[source]
fit_pipe(x, y, batch_size=None, nsteps=None, epochs=1, verbose=1, callbacks=None, validation_data=None)[source]
get_weights(**kwargs)[source]
load_weights(filepath, **kwargs)[source]
predict(*args, **kwargs)[source]
predict_ph(x, batch_size=None)[source]
predict_pipe(x, batch_size=None, verbose=0)[source]
save(*args, **kwargs)[source]

Todo

save model architectures

save_weights(filepath, **kwargs)[source]
set_weights(weights, **kwargs)[source]
train_on_batch(x, y)[source]
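
In practice, EnasAnnModel instances are returned by EnasAnnModelBuilder rather than constructed by hand; once obtained, they mirror the Keras Model API. A sketch, assuming x and y are numpy arrays:

    child = builder(arc_seq)  # an EnasAnnModel from EnasAnnModelBuilder
    child.compile(optimizer='adam', loss='mse')
    child.fit(x, y, batch_size=32, epochs=5)
    predictions = child.predict(x)
    child.save_weights('child_weights.h5')
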
class EnasCnnModel(inputs, outputs, labels, arc_seq, dag, session, dropouts=None, use_pipe=None, name='EnasModel', **kwargs)[source]

Bases: object

Todo

  • re-write weights save/load

  • use the input/output/label tensors provided by EnasConv1dDAG; this should unify the fit method when using placeholder and Tensor pipelines - probably still need two separate methods though

compile(optimizer=None, loss=None, metrics=None, loss_weights=None)[source]
evaluate(x, y, batch_size=None, verbose=0)[source]
fit(x, y, batch_size=None, nsteps=None, epochs=1, verbose=1, callbacks=None, validation_data=None)[source]
fit_generator()[source]
get_weights(**kwargs)[source]
load_weights(filepath, **kwargs)[source]
predict(x, batch_size=None, verbose=0)[source]
save(*args, **kwargs)[source]

Todo

save model architectures

save_weights(filepath, **kwargs)[source]
set_weights(weights, **kwargs)[source]
train_on_batch(x=None, y=None)[source]
class GeneralChild(*args, **kwargs)[source]

Bases: tensorflow.python.keras.engine.training.Model

DAG: Computation Graph for Child Models

Represents a neural network computation graph as a directed acyclic graph (DAG) built from a list of architecture selections

class ComputationNode(operation, node_name, merge_op=<class 'tensorflow.python.keras.layers.merge.Concatenate'>)[source]

Bases: object

Computation Node is an analog of tf.keras.layers.Layer for making branching and multiple-input/output feed-forward neural network (FFNN) models, represented by a directed acyclic graph (DAG) in AMBER.

The reason we need ComputationNode is that an amber.architect.Operation focuses on token-level computations but does not represent connectivity patterns well enough. When building DAG-represented FFNNs, we need more fine-grained control over graph connectivity and validity.

This is a helper that provides building blocks for amber.modeler.DAG to use, and is not intended to be used by itself.

See also amber.modeler.dag.DAG.

Parameters
  • operation (amber.architect.Operation) – defines the operation in current layer

  • node_name (str) – name of the node

  • merge_op (tf.keras.layers.merge, optional) – operation for merging multiple inputs

build()[source]

Build the Keras layer, with merge operations if applicable.

When building a node, all of its parents must already be built.
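
A sketch of the build-order constraint; it assumes ComputationNode tracks connectivity through parent/child lists, which is an implementation detail not guaranteed by this doc:

    from amber.modeler.dag import ComputationNode

    inp = ComputationNode(Operation('input', shape=(10,)), node_name='input')
    hid = ComputationNode(Operation('dense', units=8), node_name='hidden1')
    # wire the graph: hid consumes inp's output
    inp.child.append(hid)
    hid.parent.append(inp)
    inp.build()  # parents first
    hid.build()  # valid only because its parent is already built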

class DAG(arc_seq, model_space, input_node, output_node, with_skip_connection=True, with_input_blocks=True, *args, **kwargs)[source]

Bases: object

Construct a feed-forward neural network (FFNN) represented by a directed acyclic graph (DAG).

While a simple, linear, sequential neural network model is also a DAG, here we aim to build more flexible, generalizable branching models. In other words, the primary use is to construct a block-sparse FFNN that encodes a specific inductive bias for a specific question, although one may also use it for conv nets or other architectures with stronger inductive biases.

Note that we re-use the skip-connection search algorithms designed for building residual connections, but use them instead to build inter-layer connections without the “stem” connections of a ResNet. That is, instead of being summed with the output of the current layer, the residual connections are concatenated as inputs to the layer. By construction, it is possible for a node to have no input; such nodes are removed in _remove_disconnected_nodes().

Parameters
  • arc_seq (list, or numpy.array) – a list of integers, each is a token for neural network architecture specific to a model space

  • model_space (amber.architect.ModelSpace) – model space to sample model architectures from. Necessary for mapping token integers to operations.

  • input_node (amber.modeler.ComputationNode, or list) – a list of input layers/nodes; in case of a single input node, use a single element list

  • output_node (amber.modeler.ComputationNode, or list) – output node configuration

  • with_skip_connection (bool) – if False, disable inter-layers connections (i.e. skip-layer connections). Default is True.

  • with_input_blocks (bool) – if False, disable connecting partial inputs to intermediate layers. Default is True.

Returns

model – a constructed model using keras Model API

Return type

tf.keras.models.Model
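
A sketch of direct usage; normally DAG is driven by a model builder, and the _build_dag() entry point shown here is an assumption about the internal API:

    from amber.modeler.dag import DAG

    dag = DAG(arc_seq=arc_seq,
              model_space=model_space,
              input_node=[input_node],    # ComputationNode(s) for the inputs
              output_node=[output_node],  # ComputationNode(s) for the outputs
              with_skip_connection=True,
              with_input_blocks=True)
    model = dag._build_dag()  # assumed entry point; returns a Keras model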

class EnasAnnDAG(model_space, input_node, output_node, model_compile_dict, session, l1_reg=0.0, l2_reg=0.0, with_skip_connection=True, with_input_blocks=True, with_output_blocks=False, controller=None, feature_model=None, feature_model_trainable=None, child_train_op_kwargs=None, name='EnasDAG')[source]

Bases: object

EnasAnnDAG is a DAG model builder that uses the weight-sharing method for child models.

This class handles feed-forward neural networks (FFNNs). Weights are shared among all Ws across different hidden-unit sizes; that is, a larger hidden size always includes the smaller ones.

Parameters
  • model_space (amber.architect.ModelSpace) – model space to search architectures from

  • input_node (amber.architect.Operation, or list) – one or more input layers, each is a block of input features

  • output_node (amber.architect.Operation, or list) – one or more output layers, each is a block of output labels

  • model_compile_dict (dict) – compile dict for child models

  • session (tf.Session) – tensorflow session that hosts the computation graph; should use the same session as controller for sampling architectures

  • with_skip_connection (bool) – if False, disable inter-layer connections. Default is True.

  • with_input_blocks (bool) – if False, disable connecting input layers to hidden layers. Default is True.

  • with_output_blocks (bool) – if True, add another architecture representation vector to connect intermediate layers to output blocks.

  • controller (amber.architect.MultiIOController, or None) – connect a controller to enable architecture sampling; if None, can only train fixed architecture manually provided

  • feature_model (tf.keras.Model, or None) – If specified, use the provided upstream model for pre-transformations of inputs, instead of taking the raw input features.

  • feature_model_trainable (bool, or None) – Boolean for whether to pass gradients to the feature model.

  • child_train_op_kwargs (dict, or None) – Keyword arguments passed to model.fit().

  • name (str) – a string name for this instance

connect_controller(controller)[source]
set_controller(controller)[source]
class EnasConv1DwDataDescrption(data_description, *args, **kwargs)[source]

Bases: amber.modeler.dag.EnasConv1dDAG

This is a modeler specified for convolutional networks with data-description features

class EnasConv1dDAG(model_space, input_node, output_node, model_compile_dict, session, with_skip_connection=True, batch_size=128, keep_prob=0.9, l1_reg=0.0, l2_reg=0.0, reduction_factor=4, controller=None, child_train_op_kwargs=None, stem_config=None, data_format='NWC', train_fixed_arc=False, fixed_arc=None, name='EnasDAG', **kwargs)[source]

Bases: object

set_controller(controller)[source]
class InputBlockAuxLossDAG(*args, **kwargs)[source]

Bases: amber.modeler.dag.InputBlockDAG

Add intermediate outputs whenever two input blocks first meet and merge.

Compared to InputBlockDAG, the difference is best illustrated by an example:

|Input_A  Input_B   Input_C  Input_D      |
|-------  -------   -------  -------      |
|    |      |           |      |          |
|   Hidden_AB          Hidden_CD          |
|    /       |         |      \           |
|   /        Hidden_ABCD       \          |
| add_out1      |    \        add_out2    |
|           Hidden_2  add_out3            |
|               |                         |
|             Output                      |

In amber.modeler.dag.InputBlockDAG, add_out3 will NOT be added, since only the layers immediately connected to input blocks (i.e. Hidden_AB and Hidden_CD) will have outputs added.

Returns

model – a subclass of keras Model API with multiple intermediate outputs predicting the same label

Return type

amber.modeler.child.DenseAddOutputChild

class InputBlockDAG(add_output=True, *args, **kwargs)[source]

Bases: amber.modeler.dag.DAG

Add an intermediate output to each level of the network’s hidden layers. Based on DAG.

Compared to DAG, the difference is best illustrated by an example:

|Input_A  Input_B   Input_C  Input_D      |
|-------  -------   -------  -------      |
|    |      |           |      |          |
|   Hidden_AB          Hidden_CD          |
|    /       |         |      \           |
|   /        Hidden_ABCD       \          |
| add_out1      |             add_out2    |
|            Hidden_2                     |
|               |                         |
|             Output                      |

In amber.modeler.dag.DAG, add_out1 and add_out2 will NOT be added. The losses for add_out1 and add_out2 are the same as for Output, but with a lower weight of 0.1.

See also

amber.modeler.dag.DAG

the base class.

amber.modeler.dag.InputBlockAuxLossDAG

adds more auxiliary outputs whenever two inputs meet.

Returns

model – a subclass of keras Model API with multiple intermediate outputs predicting the same label

Return type

amber.modeler.child.DenseAddOutputChild

get_dag(arg)[source]

Getter method for retrieving a DAG class from a string identifier

DAG refers to the underlying tensor computation graph of a child model. Whenever possible, we prefer to use the Keras Model API to get the job done. For ENAS, the parameter-sharing scheme is implemented in TensorFlow.

Parameters

arg (str or callable) – return the DAG constructor corresponding to the identifier; if arg is callable, it is assumed to already be a DAG constructor and is returned unchanged

Returns

A DAG constructor

Return type

callable
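
Example, following the docstring above directly:

    from amber.modeler.dag import get_dag

    dag_cls = get_dag('EnasConv1DDAG')  # string identifier -> constructor
    same_cls = get_dag(dag_cls)         # a callable passes through unchanged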

get_layer(x, state, with_bn=False)[source]

Getter method for a Keras layer, including native Keras implementation and custom layers that are not included in Keras.

Parameters
  • x (tf.keras.layers or None) – The input Keras layer

  • state (amber.architect.Operation) – The target layer to be built

  • with_bn (bool, optional) – If True, add a batch normalization layer before the activation

Returns

x – The built target layer connected to input x

Return type

tf.keras.layers
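
Example: a sketch of stacking layers; the Operation arguments are hypothetical:

    from amber.modeler.dag import get_layer

    x = get_layer(None, Operation('input', shape=(100, 4)))
    x = get_layer(x, Operation('conv1d', filters=16, kernel_size=8),
                  with_bn=True)  # conv -> batch-norm -> activation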

Architecture Decoder

Classes for breaking down an architecture sequence into a more structured format for later use

class MultiIOArchitecture(num_layers, num_inputs, num_outputs)[source]

Bases: object

decode(arc_seq)[source]
class ResConvNetArchitecture(model_space)[source]

Bases: object

decode(arc_seq)[source]

Decode a sequence of architecture tokens into operations and residual connections

encode(operations, res_con)[source]

Encode operations and residual connections to a sequence of architecture tokens

This is the inverse function of decode

Parameters
  • operations (list) – A list of integers for categorically-encoded operations

  • res_con (list) – A list of lists, where each entry is a binary-encoded residual connection
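
Example of the encode/decode round trip; the import path is assumed, and model_space is a residual-conv model space as used by KerasResidualCnnBuilder:

    from amber.modeler.architectureDecoder import ResConvNetArchitecture

    arch = ResConvNetArchitecture(model_space=model_space)
    operations, res_con = arch.decode(arc_seq)
    arc_seq_2 = arch.encode(operations, res_con)  # should reproduce arc_seq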