3 practical examples of tricking Neural Networks using GA and FGSM. How can object classification be easily fooled?

Profil Software

Python Development Company. JS Software House. Data Science, Data Engineering, Artificial Intelligence, Machine Learning

Published: April 20, 2022

Hi! I’m Przemysław from an AI Software Development Company located in northern Poland, where I work as a Python developer. My interest in AI was sparked while studying reinforcement learning and computer vision. I have a strong inner need to see how things work under the hood, so I wanted to check if I could mess with some well-known object classification models such as CNNs (Convolutional Neural Networks). They are just a bunch of numbers and mathematical operations, so let’s see if we can play with that!

Image classification

Image classification is the computer vision task of assigning a label to an image based on its visual content. It should not be confused with similar tasks such as localization, object detection or segmentation. The image below shows the difference:

[Image: no alternative text available]

Experiments’ description

For this article I’ve chosen two algorithms to go through. The first is a genetic algorithm used for a One Pixel Attack which, as its name suggests, changes only a single pixel value to fool the classification model. The second is FGSM (Fast Gradient Sign Method), which adds a small amount of noise to an image. The noise is practically invisible to humans but can change the model’s prediction.

One Pixel Attack

When I was searching the net to find ways to fool DNN (Deep Neural Network) models, I ran across the very interesting concept of the One Pixel Attack, and I knew I needed to check it out. My intuition was telling me that changing only one pixel in the original image wouldn’t be enough to break all those concepts of filters and convolutional layers used in neural networks that do a great job when it comes to object classification.

The only information used to manipulate the input image was the classification output (the probability for each label). To avoid a brute force search, I used a GA (Genetic Algorithm). The idea was simple:

1. Get the true label for a given image.
2. Draw a base population of changed pixels (encoded as xyrgb), where x and y are the position of a pixel and r, g and b are its color components.
3. Do GA magic (crossing, mutation, selection), taking the population diversity into account.
4. Stop when the probability of the true label drops below 20% or after a certain number of steps without appreciable improvement.
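
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code used in the experiment: the function names, the population size, the mutation rate, the elitism step and the toy `toy_predict_proba` model are all assumptions made so the example runs on its own. With a real network, `predict_proba` would wrap the model's prediction call.

```python
import numpy as np

def apply_pixel(image, gene):
    """Return a copy of `image` with the pixel at (x, y) set to (r, g, b)."""
    x, y, r, g, b = gene
    out = image.copy()
    out[y, x] = (r, g, b)
    return out

def one_pixel_attack(image, true_label, predict_proba, pop_size=50,
                     max_steps=100, threshold=0.2, patience=20,
                     mutation_rate=0.3, seed=0):
    """Search for an xyrgb gene that minimises the true-label probability."""
    rng = np.random.default_rng(seed)
    height, width, _ = image.shape
    upper = np.array([width, height, 256, 256, 256])  # exclusive upper bounds

    def fitness(gene):  # lower is better for the attacker
        return predict_proba(apply_pixel(image, gene))[true_label]

    population = rng.integers(0, upper, size=(pop_size, 5))
    best_gene, best_score, stale = None, np.inf, 0
    for _ in range(max_steps):
        scores = np.array([fitness(g) for g in population])
        order = np.argsort(scores)
        if scores[order[0]] < best_score - 1e-6:
            best_gene, best_score = population[order[0]].copy(), scores[order[0]]
            stale = 0
        else:
            stale += 1
        # stop below the threshold or after too many steps without progress
        if best_score < threshold or stale >= patience:
            break
        parents = population[order[:pop_size // 2]]  # selection
        # crossing: every component comes from one of two random parents
        a = parents[rng.integers(0, len(parents), size=pop_size)]
        b = parents[rng.integers(0, len(parents), size=pop_size)]
        children = np.where(rng.random((pop_size, 5)) < 0.5, a, b)
        # mutation: random resets keep the population diverse
        fresh = rng.integers(0, upper, size=(pop_size, 5))
        children = np.where(rng.random((pop_size, 5)) < mutation_rate, fresh, children)
        children[0] = best_gene  # elitism: never lose the best candidate
        population = children
    return best_gene, float(best_score)

def toy_predict_proba(img):
    """Hypothetical two-class model: class 0 gets less likely as red gets brighter."""
    logit = 2.0 - 4.0 * img[..., 0].max() / 255.0
    p0 = 1.0 / (1.0 + np.exp(-logit))
    return np.array([p0, 1.0 - p0])

image = np.zeros((8, 8, 3), dtype=np.uint8)
baseline = toy_predict_proba(image)[0]
gene, score = one_pixel_attack(image, true_label=0, predict_proba=toy_predict_proba)
print("baseline:", baseline, "best xyrgb:", gene, "true-label probability:", score)
```

The search only ever calls `predict_proba`, which matches the setting described above: nothing but the output probabilities is needed.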

For the experiments I used a model based on the VGG16 architecture for the cifar10 dataset with pretrained weights (https://github.com/geifmany/cifar-vgg). I did this to rule out the impact of a potentially badly trained model. The sample code below is a kick-start for training your own models on that dataset:

# cifar10 dataset preparation

from keras.datasets import cifar10
from keras.utils import to_categorical

cifar_10_categories = {
    0: 'airplane',
    1: 'automobile',
    2: 'bird',
    3: 'cat',
    4: 'deer',
    5: 'dog',
    6: 'frog',
    7: 'horse',
    8: 'ship',
    9: 'truck',
}

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# training and evaluation goes here

...

The attack worked surprisingly well: for almost 20% of the images, changing only one pixel was enough to cause misclassification.

FGSM

Another method I found is FGSM (Fast Gradient Sign Method), which is extremely simple in concept but very effective. Without getting too deep into the technical details, the method computes, for a given image, the gradient of the classification loss with respect to the input pixels. This gradient tells us in which direction each pixel should change to increase the loss for the true label, that is, to make the correct prediction less likely.

For an untargeted approach the next step is just to add the sign of the gradient (-1, 0 or 1 for each pixel component) to the image to push it away from the correct prediction. Some studies also use a parameter called epsilon, which is a multiplier for the sign value, but in this experiment we considered images that are represented by integer RGB values. This step can be repeated a few times to get satisfying results.

Another approach is a targeted attack, which differs in how the gradient is calculated. For this type of attack the loss is computed against the target label (not the true label). The sign of the resulting gradient is then subtracted from the image to move the classification closer to the target. Easy, isn’t it? I’ve pasted some sample code below to make it easier to understand.
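
In symbols, the two update rules can be written as below. The notation is only shorthand introduced here for illustration: $x$ is the image, $J$ is the classification loss of a model with parameters $\theta$, $y_{\text{true}}$ and $y_{\text{target}}$ are the true and target labels, and $\epsilon$ is the multiplier mentioned above.

```latex
\begin{align}
x_{\text{adv}} &= x + \epsilon \cdot \operatorname{sign}\bigl(\nabla_x J(\theta, x, y_{\text{true}})\bigr) && \text{(untargeted)} \\
x_{\text{adv}} &= x - \epsilon \cdot \operatorname{sign}\bigl(\nabla_x J(\theta, x, y_{\text{target}})\bigr) && \text{(targeted)}
\end{align}
```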


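# sample code that calculates the gradients and updates an image
# (uses the TF1-style Keras backend; model, image, epsilon, base_class,
# target_class and num_classes are assumed to be defined in the omitted parts)

import keras.backend as K
from keras import losses

sess = K.get_session()

...

# one-hot label the loss is computed against: the target class for a
# targeted attack, otherwise the true (base) class
target = K.one_hot(target_class if target_class is not None else base_class,
                   num_classes)

def get_image_update_function(target_class):
    def targeted_update(img, delta):
        # move towards the target class: step down the loss
        return img - epsilon * delta

    def untargeted_update(img, delta):
        # move away from the true class: step up the loss
        return img + epsilon * delta

    if target_class is not None:
        return targeted_update
    return untargeted_update

update_fun = get_image_update_function(target_class)

# calculate delta - difference noise
loss = losses.categorical_crossentropy(target, model.output)
grads = K.gradients(loss, model.input)
delta = K.sign(grads[0])
delta = sess.run(delta, feed_dict={model.input: image})

# update image
image = update_fun(image, delta)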
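
For readers without that Keras setup, the same update step can be tried on a toy model in plain NumPy. This is only a sketch under stated assumptions: the "network" is a hypothetical softmax-linear classifier (`W`, `b`), chosen because its input gradient has the closed form `W.T @ (p - one_hot)`, and `fgsm_step` and `loss_gradient` are names made up for this example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_gradient(x, label, W, b):
    """Gradient of cross-entropy(one_hot(label), softmax(W @ x + b)) w.r.t. x."""
    p = softmax(W @ x + b)
    p[label] -= 1.0          # p - one_hot(label)
    return W.T @ p

def fgsm_step(x, W, b, true_label, target_label=None, epsilon=1.0):
    """One FGSM step: untargeted if target_label is None, otherwise targeted."""
    if target_label is None:
        # add the sign of the true-label loss gradient (increase that loss)
        x_adv = x + epsilon * np.sign(loss_gradient(x, true_label, W, b))
    else:
        # subtract the sign of the target-label loss gradient (decrease that loss)
        x_adv = x - epsilon * np.sign(loss_gradient(x, target_label, W, b))
    return np.clip(x_adv, 0, 255)  # stay in the valid pixel range

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(3, 12))   # 3 classes, 12 "pixel" values
b = np.zeros(3)
x = rng.uniform(50, 200, size=12)
true_label = int(np.argmax(softmax(W @ x + b)))

x_untargeted = fgsm_step(x, W, b, true_label)
x_targeted = fgsm_step(x, W, b, true_label, target_label=(true_label + 1) % 3)
print("true-label probability before:", softmax(W @ x + b)[true_label])
print("true-label probability after: ", softmax(W @ x_untargeted + b)[true_label])
```

Every pixel value moves by at most `epsilon`, which is why the perturbation stays hard to see while the prediction still shifts.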

The model used in this experiment is resnet18 with imagenet weights. The sample code that loads it (using image-classifiers==0.2.2) is pasted below:

# loading resnet pretrained models (224x224px, 1000 classes)

from classification_models import Classifiers

ResNet18, preprocess_input = Classifiers.get('resnet18')
resnet_dim = (224, 224)
model = ResNet18(input_shape=(*resnet_dim, 3), weights='imagenet', classes=1000)

The image below shows an original image, the adversarial example generated using FGSM, and the generated noise after 2 steps of the algorithm:

Black-box FGSM

The previous method was an easy case where we have full information about the attacked model, but what if that is not available? Here is a study that estimates the gradient by using a large number of queries to the target model. I tried to fool the target model using my own model, which had a different architecture but performed a similar task. The modified images were prepared using my model (it took 7 steps to bring the true label prediction below 1%) and then checked against the target model (the vgg16 cifar10 model used in previous steps). Results from this experiment are shown below:

These results look promising but we have to take into account that these are relatively simple tasks (classifying 32x32 pixel images), and the difficulty of fooling other models will probably grow with the complexity of the structures that are used.
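
The procedure described in this section (craft the image on your own model, then check it on the target) can be sketched as follows. This is a toy illustration, not the experiment itself: `LinearSoftmax` is a hypothetical stand-in for both networks, and the target's weights are deliberately made similar to the surrogate's, which is an assumption that real models do not have to satisfy.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class LinearSoftmax:
    """Tiny stand-in classifier: p = softmax(W @ x + b)."""
    def __init__(self, W, b):
        self.W, self.b = W, b

    def predict(self, x):
        return softmax(self.W @ x + self.b)

    def loss_grad(self, x, label):
        # gradient of the cross-entropy loss for `label` with respect to x
        p = self.predict(x)
        p[label] -= 1.0
        return self.W.T @ p

def craft_on_surrogate(x, true_label, surrogate, epsilon=1.0,
                       stop_below=0.01, max_steps=50):
    """Repeat untargeted FGSM steps until the surrogate's true-label
    probability falls below `stop_below` or `max_steps` is reached."""
    x_adv = x.astype(float).copy()
    steps = 0
    while surrogate.predict(x_adv)[true_label] >= stop_below and steps < max_steps:
        step = epsilon * np.sign(surrogate.loss_grad(x_adv, true_label))
        x_adv = np.clip(x_adv + step, 0, 255)
        steps += 1
    return x_adv, steps

rng = np.random.default_rng(1)
W_s = rng.normal(scale=0.05, size=(3, 16))
surrogate = LinearSoftmax(W_s, np.zeros(3))
# assumed target: similar to the surrogate but not identical, gradients never used
target = LinearSoftmax(W_s + rng.normal(scale=0.02, size=(3, 16)), np.zeros(3))

x = rng.uniform(100, 150, size=16)
label = int(np.argmax(surrogate.predict(x)))
x_adv, steps = craft_on_surrogate(x, label, surrogate)

print("steps used:", steps)
print("surrogate true-label probability:", surrogate.predict(x)[label],
      "->", surrogate.predict(x_adv)[label])
print("target true-label probability:   ", target.predict(x)[label],
      "->", target.predict(x_adv)[label])
```

Only the surrogate's gradients are used. Whether the drop carries over to the target depends on how similar the two models are, so the last printed line is the one to watch.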

Conclusion

The approaches presented here show that we can perturb images in a way that manipulates classification results. This is easy when we have full information about the model structure. With only limited access to the target model, crafting effective perturbed samples is much harder.

The knowledge that comes from these experiments can help defend against such attacks by extending the training set with slightly modified images.
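
As a rough sketch of what extending the training set could look like, the helper below appends perturbed copies of every sample and keeps the original labels. All names are hypothetical, and `random_sign_noise` is only a placeholder perturbation so that the example runs on its own. In practice the perturbation would come from the attack being defended against.

```python
import numpy as np

def augment_with_perturbed(X, y, perturb, copies=1):
    """Append `copies` perturbed versions of every sample, keeping the labels."""
    X_parts, y_parts = [X], [y]
    for _ in range(copies):
        X_parts.append(np.stack([perturb(img, label) for img, label in zip(X, y)]))
        y_parts.append(y)
    return np.concatenate(X_parts), np.concatenate(y_parts)

rng = np.random.default_rng(0)

def random_sign_noise(img, label, epsilon=1.0):
    # placeholder: adds -1, 0 or +1 (times epsilon) to every pixel component
    noise = rng.choice([-1.0, 0.0, 1.0], size=img.shape)
    return np.clip(img + epsilon * noise, 0, 255)

X = rng.uniform(0, 255, size=(4, 2, 2, 3)).astype("float32")  # 4 tiny "images"
y = np.array([0, 1, 2, 1])
X_aug, y_aug = augment_with_perturbed(X, y, random_sign_noise)
print(X_aug.shape, y_aug)
```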

Resources

Published at:

ResearchGate: Tricking Neural Networks

Dev.to: Tricking Neural Networks

Thanks to Peter Plesa.

#ImageRecognition #GenericAlgorithm #NeuralNetworks #MachineLearning
