
Generalized Neural Network Module Expression Generator
Source: R/torch-code-gen.R (nn_module_generator.Rd)

nn_module_generator() is a generalized function that generates neural network
module expressions for various architectures. It provides a flexible framework for creating
custom neural network modules by parameterizing layer types, construction arguments, and
forward pass behavior.
While designed primarily for {torch} modules, it can work with custom layer implementations
from the current environment, including user-defined layers like RBF networks, custom
attention mechanisms, or other novel architectures.
This function serves as the foundation for specialized generators like ffnn_generator()
and rnn_generator(), but can be used directly to create custom architectures.
Usage

nn_module_generator(
  nn_name = "nnModule",
  nn_layer = NULL,
  out_nn_layer = NULL,
  nn_layer_args = list(),
  layer_arg_fn = NULL,
  forward_extract = NULL,
  before_output_transform = NULL,
  after_output_transform = NULL,
  last_layer_args = list(),
  hd_neurons,
  no_x,
  no_y,
  activations = NULL,
  output_activation = NULL,
  bias = TRUE,
  eval = FALSE,
  .env = parent.frame(),
  ...
)

Arguments
- nn_name
Character string specifying the name of the generated neural network module class. Default is "nnModule".
- nn_layer
The type of neural network layer to use. Can be specified as:
  - NULL (default): uses nn_linear() from {torch}
  - Character string: e.g., "nn_linear", "nn_gru", "nn_lstm", "some_custom_layer"
  - Named function: a function object that constructs the layer
  - Anonymous function: e.g., \() nn_linear() or function() nn_linear()
The layer constructor is first searched for in the current environment, then in parent environments, and finally falls back to the {torch} namespace. This allows you to use custom layer implementations alongside standard torch layers.
- out_nn_layer
Default is NULL. If supplied, this constructor is used for the final (output) layer instead of nn_layer. Can be specified as:
  - Character string: e.g., "nn_linear", "nn_gru", "nn_lstm", "some_custom_layer"
  - Named function: a function object that constructs the layer
  - Formula interface: e.g., ~torch::nn_linear, ~some_custom_layer
Internally, it is resolved in almost the same way as the nn_layer parameter.
- nn_layer_args
Named list of additional arguments passed to the layer constructor specified by nn_layer. These arguments are applied to all layers. For layer-specific arguments, use layer_arg_fn. Default is an empty list.
- layer_arg_fn
Optional function or formula that generates layer-specific construction arguments. Can be specified as:
  - Formula: ~ list(input_size = .in, hidden_size = .out), where .in, .out, .i, and .is_output are available
  - Function: function(i, in_dim, out_dim, is_output), with the same quantities as arguments
The formula/function should return a named list of arguments to pass to the layer constructor. Available variables in the formula context:
  - .i or i: integer, the layer index (1-based)
  - .in or in_dim: integer, input dimension for this layer
  - .out or out_dim: integer, output dimension for this layer
  - .is_output or is_output: logical, whether this is the final output layer
If NULL, defaults to FFNN-style arguments: list(in_dim, out_dim, bias = bias).
- forward_extract
Optional formula or function that processes layer outputs in the forward pass. Useful for layers that return complex structures (e.g., RNNs return list(output, hidden)). Can be specified as:
  - Formula: ~ .[[1]] or ~ .$output, where . represents the layer output
  - Function: function(expr) that accepts and returns a language object
Common patterns:
  - Extract first element: ~ .[[1]]
  - Extract named element: ~ .$output
  - Extract with method: ~ .$get_output()
If NULL, layer outputs are used directly.
- before_output_transform
Optional formula or function that transforms the input before the output layer. It is applied after the last hidden layer (and its activation) but before the output layer. Can be specified as:
  - Formula: ~ .[, .$size(2), ], where . represents the current tensor
  - Function: function(expr) that accepts and returns a language object
Common patterns:
  - Extract last timestep: ~ .[, .$size(2), ]
  - Flatten: ~ .$flatten(start_dim = 1)
  - Global pooling: ~ .$mean(dim = 2)
  - Extract token: ~ .[, 1, ]
If NULL, no transformation is applied.
- after_output_transform
Optional formula or function that transforms the output after the output layer. It is applied after self$out(x) (the final layer) but before the result is returned. Can be specified as:
  - Formula: ~ .$mean(dim = 2), where . represents the output tensor
  - Function: function(expr) that accepts and returns a language object
Common patterns:
  - Global average pooling: ~ .$mean(dim = 2)
  - Squeeze dimensions: ~ .$squeeze()
  - Reshape output: ~ .$view(c(-1, 10))
  - Extract specific outputs: ~ .[, , 1:5]
If NULL, no transformation is applied.
- last_layer_args
Optional named list or formula specifying additional arguments for the output layer only. These arguments are appended to the output layer constructor after the arguments from layer_arg_fn. Can be specified as:
  - Formula: ~ list(kernel_size = 2L, bias = FALSE)
  - Named list: list(kernel_size = 2L, bias = FALSE)
This is useful when you need to override or add specific parameters to the final layer without affecting hidden layers. For example, in CNNs you might want a different kernel size for the output layer, or in RNNs you might want to disable bias in the final linear projection. Arguments in last_layer_args will override any conflicting arguments from layer_arg_fn when .is_output = TRUE. Default is an empty list.
- hd_neurons
Integer vector specifying the number of neurons (hidden units) in each hidden layer. The length determines the number of hidden layers in the network. Must contain at least one element.
- no_x
Integer specifying the number of input features (input dimension).
- no_y
Integer specifying the number of output features (output dimension).
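For example (a sketch, assuming the package is attached; the sizes are illustrative), these three arguments fully determine the layer dimensions:

```r
# Two hidden layers (64 and 32 units), 10 inputs, 1 output:
# the generated network has shape 10 -> 64 -> 32 -> 1.
mod_expr <- nn_module_generator(
  hd_neurons = c(64L, 32L),  # length 2 => two hidden layers
  no_x = 10L,                # input dimension
  no_y = 1L                  # output dimension
)
```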
- activations
Activation function specifications for the hidden layers. Can be:
  - NULL: no activation functions applied
  - Character vector: e.g., c("relu", "sigmoid", "tanh")
  - activation_spec object: created using act_funs(), which allows specifying custom arguments. See examples.
If a single activation is provided, it will be replicated across all hidden layers. Otherwise, the length should match the number of hidden layers.
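As a sketch of the character-vector form (the recycling behavior follows the description above):

```r
# One activation per hidden layer (length matches hd_neurons):
mod_expr <- nn_module_generator(
  hd_neurons = c(32L, 16L),
  no_x = 4L, no_y = 1L,
  activations = c("relu", "tanh")  # relu after layer 1, tanh after layer 2
)

# A single activation is recycled across all hidden layers:
mod_expr2 <- nn_module_generator(
  hd_neurons = c(32L, 16L),
  no_x = 4L, no_y = 1L,
  activations = "relu"
)
```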
- output_activation
Optional activation function for the output layer. Same format as activations, but only a single activation should be specified. Common choices include "softmax" for classification or "sigmoid" for binary outcomes. Default is NULL (no output activation).
- bias
Logical indicating whether to include bias terms in the layers. Default is TRUE. Note that this is passed to layer_arg_fn if provided, so custom layer argument functions should handle this parameter appropriately.
- eval
Logical indicating whether to evaluate the generated expression immediately. If TRUE, returns an nn_module class constructor that can be called directly (e.g., model()). If FALSE (the default), returns the unevaluated language expression, which can be inspected or evaluated later with eval().
- .env
The environment in which the generated expression is evaluated. Default is parent.frame().
- ...
Additional arguments passed to layer constructors or reserved for future extensions.
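Putting several of these arguments together, a GRU-based generator might look like the following. This is a sketch, not a definitive recipe: it assumes the layer_arg_fn formula may branch on .is_output, and the nn_gru/nn_linear argument names follow the {torch} convention.

```r
library(torch)

rnn_expr <- nn_module_generator(
  nn_name = "GRUNet",
  nn_layer = "nn_gru",          # hidden layers are GRUs
  out_nn_layer = "nn_linear",   # final projection is a linear layer
  layer_arg_fn = ~ if (.is_output) {
    list(.in, .out)             # nn_linear(in_features, out_features)
  } else {
    list(input_size = .in, hidden_size = .out, batch_first = TRUE)
  },
  forward_extract = ~ .[[1]],   # nn_gru returns list(output, hidden)
  before_output_transform = ~ .[, .$size(2), ],  # keep the last timestep
  hd_neurons = c(16L),
  no_x = 3L,
  no_y = 1L
)

# Inspect the generated nn_module definition before evaluating it:
print(rnn_expr)
```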
Value
If eval = FALSE (default): A language object (unevaluated expression) representing
a torch::nn_module definition. This expression can be evaluated with eval() to
create the module class, which can then be instantiated with eval(result)() to
create a model instance.
If eval = TRUE: An instantiated nn_module class constructor that can be called
directly to create model instances (e.g., result()).
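As a usage sketch of the two return modes (assuming {torch} is installed):

```r
library(torch)

# eval = FALSE (default): get the expression, evaluate later
expr <- nn_module_generator(hd_neurons = c(8L), no_x = 4L, no_y = 1L,
                            activations = "relu")
Net <- eval(expr)   # module class constructor
model <- Net()      # model instance
# equivalently: model <- eval(expr)()

# eval = TRUE: get the constructor directly
Net2 <- nn_module_generator(hd_neurons = c(8L), no_x = 4L, no_y = 1L,
                            activations = "relu", eval = TRUE)
model2 <- Net2()

y <- model(torch_randn(5, 4))  # forward pass on a 5 x 4 input batch
```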