
Rationale

The biggest strength of kindling when modelling neural networks is its versatility: it inherits the flexibility of torch while staying human friendly, including the ability to apply custom optimizer functions, loss functions, and per-layer activation functions. Learn more: https://kindling.joshuamarie.com/articles/special-cases.

With act_funs(), you are not limited to the activation functions available in torch’s namespace. Use new_act_fn() to wrap any compatible function into a validated custom activation. Note, however, that this feature is only available in version 0.3.0 and above.

Function to use

To do this, use new_act_fn(). It takes a user-supplied function, validates it against a small dummy tensor at definition time (a dry-run probe), and wraps it in a call-time type guard. This means errors surface early, before your model ever starts training.

The function you supply must:

  • Accept at least one argument (the input tensor).
  • Return a torch_tensor.
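
To make the contract concrete, here is a minimal sketch that wraps a softsign-style function, x / (1 + |x|). The formula is just an illustration, and calling the wrapper directly on a tensor assumes the object returned by new_act_fn() behaves like an ordinary R function:

# A valid wrapper: takes a tensor and returns a tensor, so the
# definition-time probe passes
softsign = new_act_fn(\(x) x / (1 + torch::torch_abs(x)))

# Calling it by hand (illustrative only); the call-time guard
# type-checks the returned value
softsign(torch::torch_randn(3))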

Basic Usage

Currently, nnf_tanh doesn’t exist in the torch namespace, so tanh is not a valid argument. With new_act_fn(), you can wrap torch::torch_tanh() to make it usable.

Here’s a basic example that wraps torch::torch_tanh() as a custom activation:

hyper_tan = new_act_fn(\(x) torch::torch_tanh(x))

You can also pass it directly into act_funs(), just like any built-in activation:

act_funs(relu, elu, new_act_fn(\(x) torch::torch_tanh(x)))
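
Because new_act_fn() accepts any R function, closures work as well, so parametric activations come naturally. Below is a hypothetical sketch of a leaky-ReLU-style activation with a configurable slope; torch_maximum() is a standard torch function, while the leaky() helper and the slope value are purely illustrative:

# Hypothetical helper: builds a parametric activation from a closure;
# the slope is captured by the closure and applied on every call
leaky = function(slope) {
    new_act_fn(\(x) torch::torch_maximum(x, slope * x))
}

act_funs(relu, elu, leaky(0.1))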

Using Custom Activations in a Model

Naturally, modelling functions like ffnn() accept act_funs() through the activations argument. Again, you can wrap a custom activation in new_act_fn() and pass it along inside act_funs().

Here’s a basic example:

model = ffnn(
    Sepal.Length ~ .,
    data = iris[, 1:4],
    hidden_neurons = c(64, 32, 16),
    activations = act_funs(
        relu,
        silu,
        new_act_fn(\(x) torch::torch_tanh(x))
    ),
    epochs = 50
)
model
## 
## ======================= Feedforward Neural Networks (MLP) ======================
## 
## 
## -- FFNN Model Summary ----------------------------------------------------------
## -------------------------------------------------------------------
##   NN Model Type           :         FFNN    n_predictors :      3
##   Number of Epochs        :           50    n_response   :      1
##   Hidden Layer Units      :   64, 32, 16    reg.         :   None
##   Number of Hidden Layers :            3    Device       :    cpu
##   Pred. Type              :   regression                 :       
## -------------------------------------------------------------------
## 
## 
## 
## -- Activation function ---------------------------------------------------------
## -------------------------------------------------
##   1st Layer {64}    :                      relu
##   2nd Layer {32}    :                      silu
##   3rd Layer {16}    :                  <custom>
##   Output Activation :   No act function applied
## -------------------------------------------------

Each element of act_funs() corresponds to one hidden layer, in order. Here, the first hidden layer uses ReLU, the second uses SiLU (Swish), and the third uses Tanh.

You can also use a single custom activation recycled across all layers:

ffnn(
    Sepal.Length ~ .,
    data = iris[, 1:4],
    hidden_neurons = c(64, 32),
    activations = act_funs(new_act_fn(\(x) torch::torch_tanh(x))),
    epochs = 50
)
## 
## ======================= Feedforward Neural Networks (MLP) ======================
## 
## 
## -- FFNN Model Summary ----------------------------------------------------------
## -------------------------------------------------------------------
##   NN Model Type           :         FFNN    n_predictors :      3
##   Number of Epochs        :           50    n_response   :      1
##   Hidden Layer Units      :       64, 32    reg.         :   None
##   Number of Hidden Layers :            2    Device       :    cpu
##   Pred. Type              :   regression                 :       
## -------------------------------------------------------------------
## 
## 
## 
## -- Activation function ---------------------------------------------------------
## -------------------------------------------------
##   1st Layer {64}    :                  <custom>
##   2nd Layer {32}    :                  <custom>
##   Output Activation :   No act function applied
## -------------------------------------------------

Skipping the Dry-Run Probe

By default, new_act_fn() runs a quick dry-run with a small dummy tensor to validate your function before training. You can disable this with probe = FALSE, though this is generally not recommended: problems in your function will then only surface at call time, once training begins.

my_act = new_act_fn(\(x) torch::torch_tanh(x), probe = FALSE)
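
With the probe disabled, problems in the wrapped function are no longer caught up front. As a hypothetical illustration, the definition below is accepted even though the function returns a plain R value rather than a tensor; the call-time guard would still reject it on the first forward pass:

# Hypothetical: returns a plain R array/vector, not a torch_tensor.
# With probe = FALSE this line raises no error; the call-time type guard
# would still catch the problem once the activation is first used.
bad_act = new_act_fn(\(x) torch::as_array(x), probe = FALSE)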

Naming Your Custom Activation

You can provide a human-readable name via .name, which is used in print output and diagnostics:

my_act = new_act_fn(\(x) torch::torch_tanh(x), .name = "my_tanh")

Here’s a simple application:

ffnn(
    Sepal.Length ~ .,
    data = iris[, 1:4],
    hidden_neurons = c(64, 32),
    activations = act_funs(
        relu, 
        new_act_fn(\(x) torch::torch_tanh(x), .name = "hyper_tanh")
    ),
    epochs = 50
)
## 
## ======================= Feedforward Neural Networks (MLP) ======================
## 
## 
## -- FFNN Model Summary ----------------------------------------------------------
## -------------------------------------------------------------------
##   NN Model Type           :         FFNN    n_predictors :      3
##   Number of Epochs        :           50    n_response   :      1
##   Hidden Layer Units      :       64, 32    reg.         :   None
##   Number of Hidden Layers :            2    Device       :    cpu
##   Pred. Type              :   regression                 :       
## -------------------------------------------------------------------
## 
## 
## 
## -- Activation function ---------------------------------------------------------
## -------------------------------------------------
##   1st Layer {64}    :                      relu
##   2nd Layer {32}    :                hyper_tanh
##   Output Activation :   No act function applied
## -------------------------------------------------

Error Handling

new_act_fn() is designed to fail loudly and early. Common errors include:

  1. Function returns a non-tensor (a hypothetical example of such a wrapper is shown after this list). This will error at definition time:

    ## Error in `.assert_tensor_output()`:
    ## ! Dry-run must be a <torch_tensor>.
    ##  Got <numeric>.
    ##  Ensure your function returns the result of a torch operation.
  2. Function accepts no arguments. This will error immediately:

    new_act_fn(function() torch::torch_zeros(2))
    ## Error in `new_act_fn()`:
    ## ! `fn` must accept at least one argument (the input tensor).
    ##  Use a lambda like `\(x) torch::torch_tanh(x)`.
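
For reference, here is a hypothetical wrapper that would trigger the first error above: converting the result to a plain numeric vector means the dry-run probe no longer sees a torch_tensor, so it fails at definition time.

new_act_fn(\(x) as.numeric(torch::as_array(torch::torch_tanh(x))))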

These checks ensure your model’s architecture is valid before any data ever flows through it.

Summary

Feature                        Details
Wraps any R function           Must accept a tensor, return a tensor
Dry-run probe                  Validates at definition time (probe = TRUE by default)
Call-time guard                Type-checks output on every forward pass
Compatible with act_funs()     Use alongside built-in activations freely
Closures supported             Parametric activations work naturally