Activation Functions: Sparking Neurons to Life, the Unsung Heroes of AI
Unlocking hidden patterns, mastering complex tasks, and even fueling creativity — artificial intelligence is transforming our world with its relentless power. But behind these feats lies a subtle yet crucial ingredient: activation functions.
These unassuming mathematical gatekeepers decide which neurons fire, and which remain silent, shaping the very flow of information within neural networks. Today, we’ll dive into their realm, exploring common types, TensorFlow code examples, novel possibilities, and strategies for optimization.
TensorFlow Code Example for MNIST Classification
We will use the code below to compare the accuracy of different activation functions on the MNIST handwritten-digit classification task, running on CPU.
import tensorflow as tf
# Load and preprocess MNIST dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
# Define model with a chosen activation function
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'), # Replace 'relu' with desired function
tf.keras.layers.Dense(10, activation='softmax')
])
# Compile and train model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
# Evaluate model on test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)
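To actually compare activation functions, you can wrap the same model in a short loop and retrain it once per function. The sketch below keeps the layer sizes and epoch count from the snippet above; the list of activation names is just an illustrative selection and can be extended.

import tensorflow as tf

# Load and normalize MNIST once, as in the snippet above
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Activation functions to compare (an illustrative selection)
activations = ['sigmoid', 'tanh', 'relu']

for act in activations:
    # Rebuild the same architecture with a different hidden-layer activation
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation=act),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=5, verbose=0)
    _, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f'{act}: test accuracy = {test_acc:.4f}')

Each run prints one line, so you can see at a glance which activation reaches the highest test accuracy after five epochs on your machine.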
Ready to ignite your neural networks? Let’s dive in!
Common Activation Functions: Your Neural Toolkit
Sigmoid: This classic S-shaped function excels in binary classification tasks, offering smooth transitions between 0 and 1.
Tanh: Similar to Sigmoid, but centered at zero, making it well-suited for tasks where output ranges around zero are meaningful.
ReLU (Rectified Linear Unit): The popular choice for deep learning, ReLU efficiently addresses vanishing gradients and offers faster training.
Leaky ReLU: A variant of ReLU that allows small, non-zero gradients for negative inputs, potentially mitigating dead neurons.
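To make these shapes concrete, here is a minimal sketch that evaluates each of the four functions above on the same sample inputs using TensorFlow's built-in tf.nn implementations (the sample values are arbitrary, and the Leaky ReLU slope uses tf.nn.leaky_relu's default alpha of 0.2):

import tensorflow as tf

# A few sample pre-activations, including negative values
x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

print('sigmoid:   ', tf.nn.sigmoid(x).numpy())                 # squashes into (0, 1)
print('tanh:      ', tf.nn.tanh(x).numpy())                    # squashes into (-1, 1), zero-centered
print('relu:      ', tf.nn.relu(x).numpy())                    # zeroes out negatives
print('leaky_relu:', tf.nn.leaky_relu(x, alpha=0.2).numpy())   # small slope for negatives

Note how Sigmoid and Tanh saturate for large-magnitude inputs, while ReLU zeroes out negatives entirely and Leaky ReLU preserves a small negative slope.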