Activation functions. Sparking Neurons to Life: The Unsung Heroes of AI

Unlocking hidden patterns, mastering complex tasks, and even fueling creativity — artificial intelligence is transforming our world with its relentless power. But behind these feats lies a subtle yet crucial ingredient: activation functions.

These unassuming mathematical gatekeepers decide which neurons fire, and which remain silent, shaping the very flow of information within neural networks. Today, we’ll dive into their realm, exploring common types, TensorFlow code examples, novel possibilities, and strategies for optimization.

TensorFlow Code Example for MNIST Classification

We will use the code below to compare the accuracy of different activation functions on the MNIST handwritten-digit classification dataset, running on a CPU.

import tensorflow as tf
# Load and preprocess MNIST dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# Define model with a chosen activation function
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'), # Replace 'relu' with desired function
  tf.keras.layers.Dense(10, activation='softmax')
])
# Compile and train model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
# Evaluate model on test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)        
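
As a rough sketch of how the comparison below could be reproduced, the loop here swaps each activation into the same model and times training and inference with Python's time module. The build_model helper and the timing approach are illustrative assumptions on my part, and the energy and CO2 figures reported below would require a separate measurement tool that is not shown here.

import time
import tensorflow as tf

# Load and normalize MNIST once, then reuse it for every activation tested
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_model(activation):
    # Same architecture as above, with the hidden-layer activation swapped in
    return tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation=activation),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

for activation in ['sigmoid', 'tanh', 'relu']:  # Leaky ReLU needs its own layer, see below
    model = build_model(activation)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    start = time.perf_counter()
    model.fit(x_train, y_train, epochs=5, verbose=0)
    train_time = time.perf_counter() - start
    start = time.perf_counter()
    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
    infer_time = time.perf_counter() - start
    print(f'{activation}: accuracy={test_acc:.4f}, train={train_time:.2f}s, inference={infer_time:.2f}s')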

Ready to ignite your neural networks? Let’s dive in!

Common Activation Functions: Your Neural Toolkit

Sigmoid: This classic S-shaped function excels in binary classification tasks, offering smooth transitions between 0 and 1.

  • TensorFlow code: tf.keras.activations.sigmoid
  • Training time: 11.8179 seconds
  • Inference time: 0.4733 seconds
  • Accuracy: 96.82%
  • Energy: 0.000351456782 kWh
  • CO2eq: 0.136723400984 g
  • This is equivalent to 0.001271845591 km traveled by car.
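
For intuition, here is a tiny sketch applying the sigmoid to a few arbitrary sample values to show the squashing into (0, 1):

import tensorflow as tf

# Sigmoid squashes any real input into the open interval (0, 1): sigmoid(x) = 1 / (1 + exp(-x))
x = tf.constant([-5.0, -1.0, 0.0, 1.0, 5.0])
print(tf.keras.activations.sigmoid(x).numpy())
# ~[0.0067 0.2689 0.5 0.7311 0.9933] -- an input of 0 maps to exactly 0.5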

Tanh: Similar to Sigmoid, but centered at zero, making it well-suited for tasks where output ranges around zero are meaningful.

  • TensorFlow code: tf.keras.activations.tanh
  • Training time: 11.2932 seconds
  • Inference time: 0.3907 seconds
  • Accuracy: 97.15%
  • Energy: 0.000364852708 kWh
  • CO2eq: 0.141934672160 g
  • This is equivalent to 0.001320322532 km traveled by car.
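
A similar sketch, again on arbitrary sample values, illustrating the zero-centred (-1, 1) output range:

import tensorflow as tf

# Tanh is zero-centred: outputs lie in (-1, 1) and tanh(-x) == -tanh(x)
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tf.keras.activations.tanh(x).numpy())
# ~[-0.9640 -0.4621 0.0 0.4621 0.9640]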

ReLU (Rectified Linear Unit): The popular choice for deep learning, ReLU efficiently addresses vanishing gradients and offers faster training.

  • TensorFlow code: tf.keras.activations.relu
  • Training time: 17.1945 seconds
  • Inference time: 0.6643 seconds
  • Accuracy: 97.00%
  • Energy: 0.000350733862 kWh
  • CO2eq: 0.136442171310 g
  • This is equivalent to 0.001269229501 km traveled by car.
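
A quick sketch of the clipping behaviour on arbitrary sample values; note the flat zero output (and zero gradient) for negative inputs, which is what Leaky ReLU below tries to address:

import tensorflow as tf

# ReLU clips negatives to zero: relu(x) = max(0, x)
x = tf.constant([-3.0, -0.5, 0.0, 0.5, 3.0])
print(tf.keras.activations.relu(x).numpy())
# [0.  0.  0.  0.5 3. ] -- the gradient is 0 for every negative input,
# which is how neurons can become permanently "dead"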

Leaky ReLU: A variant of ReLU that allows small, non-zero gradients for negative inputs, potentially mitigating dead neurons.

  • TensorFlow code: tf.keras.layers.LeakyReLU
  • Training time: 17.6417 seconds
  • Inference time: 0.5864 seconds
  • Accuracy: 97.05%
  • Energy: 0.000496290336 kWh
  • CO2eq: 0.193066419750 g
  • This is equivalent to 0.001795966695 km traveled by car.
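
Since Keras exposes Leaky ReLU as a layer rather than a string-named activation, it is wired into the model slightly differently. The sketch below shows the element-wise behaviour (the default negative slope of 0.3 is assumed here) and how it would slot into the MNIST model above:

import tensorflow as tf

# Leaky ReLU scales negative inputs by a small slope instead of zeroing them
leaky = tf.keras.layers.LeakyReLU()  # default negative slope is 0.3
x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])
print(leaky(x).numpy())
# [-0.6 -0.3  0.   1.   2. ]

# In the MNIST model above, it replaces the string activation as a separate layer:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128),       # no built-in activation here
    tf.keras.layers.LeakyReLU(),      # applied as its own layer
    tf.keras.layers.Dense(10, activation='softmax')
])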


Continue reading on Medium:

Activation functions. Sparking Neurons to Life: The Unsung Heroes of AI | by Sherif Awad - Head of Digital Strategy @Holcim MEA | Jan, 2024 | Medium
