A Brief Introduction to Artificial Intelligence (AI) and Machine Learning


Hi there,

I was studying and stopped to think a little. It would be great if people understood how elementary and high school help us understand how modern problems are solved. But most of the time, school unfortunately does not help with that. Well, at least we learn about Galileo and his test showing that gravity works mostly the same way on all things; about Archimedes and his "Eureka" moment (when a body is immersed in a liquid, it experiences an upward buoyant force equal to the weight of the liquid displaced by the body); and even about Johann Bernoulli proposing a challenge to the scientific community asking for the curve of fastest descent, which turned out to be not a straight or polygonal line, but a cycloid (the brachistochrone). I think everyone is already aware of the news that Artificial Intelligence (AI) is bringing to the most varied areas of business. In the past, we thought that AI was only related to Google search or the famous Microsoft Office paper clip from the late 1990s. Well, I would like to tell you a little about how you can make a machine learn (Machine Learning), and how it can keep learning on its own after that (Deep Learning), to the point of writing books or painting pictures in the style of past authors.


The brachistochrone. Five mathematicians responded with solutions to Bernoulli's challenge: Newton, Jakob Bernoulli, Gottfried Leibniz, Ehrenfried Walther von Tschirnhaus and Guillaume de l'Hôpital.
The paper clip assistant from Microsoft Word. If you remember it, you're getting old.


Ok, first I want to talk about the simplest form of Machine Learning: Linear Regression. The idea is simple. Imagine two sets of numbers, X and Y, with n values each, related to each other. Imagine that each value in Y is approximately 2 times the corresponding value in X. First, you take 80% of the pairs of X and Y values and look for this pattern; these are the "training" values, or "training set". You then use the pattern found in training to see if it can predict the remaining 20% of values (called test values). If the pattern learned from the training values predicts the test values with good accuracy (the acceptable % of success can be different for each scientific area), then you have a good model. Now, how can our Machine Learning (ML) turn into Deep Learning? Well, using our linear regression as an example, we can keep adding values and redoing the fit automatically. At least in theory, the model parameters are adjusted until prediction errors are minimized as much as possible.


AI, ML, and DL.
Linear and Logistic Regressions.
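
To make the linear regression example concrete, here is a minimal sketch of the training/test idea using scikit-learn (which comes preinstalled in Colab); the numbers and the "2 times" relationship are invented for illustration.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Invented data: each y is roughly 2 times its x, plus some noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 2 * X[:, 0] + rng.normal(0, 0.5, size=100)

# 80% of the pairs for training, 20% held out for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

modelo = LinearRegression().fit(X_train, y_train)
print(modelo.coef_)                   # should be close to 2
print(modelo.score(X_test, y_test))   # R-squared on the unseen test values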


Okay, linear regression is one way of describing data. But if the data show non-linear patterns, how can we separate them and still make good predictions about the relationships between values from sets X and Y? For example, with a logistic regression we can separate data both linearly and in the form of a curve. Perhaps this (logistic) regression is the simplest (and also least powerful) way to, for example, predict whether an individual is a dog or a cat based on values such as "body size" or "coloring".


Using a logistic regression we can predict whether an image corresponds to a dog or a cat.
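
Just as an illustration (the measurements below are invented, and real images would need many more features), a logistic regression in scikit-learn could look like this:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented values: [body size, coloring]; labels: 0 = cat, 1 = dog
X = np.array([[20, 0.1], [25, 0.3], [22, 0.2], [60, 0.8], [55, 0.7], [70, 0.9]])
y = np.array([0, 0, 0, 1, 1, 1])

modelo = LogisticRegression().fit(X, y)
print(modelo.predict([[23, 0.25], [65, 0.85]]))   # expected: [0 1], i.e., cat and dog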


However, there are more complex ways of classifying or grouping values. For example, decision trees are models that remove or add a small level of complexity at each step of the tree. This is brilliant: not only because it shows us which steps are most important for a dataset, but also because it is perhaps the simplest (and still very elegant) way to look for true cause-and-effect relationships in the data (yes, there is still a lot of philosophical discussion about how we infer such relationships from our data). Going a little past the decision tree, we reach a higher step: the Random Forest. The name is even funny, because a random forest is a randomized collection of decision trees (that is, several random trees, hence the name forest). We can let a computer figure out on its own the best way to separate the data, and then just use the resulting solution to predict new data.


An example of a complex relationship between values, shown as red squares or blue circles. We can think of the red squares as dogs and the blue circles as cats.


Visual representation of a logistic regression, a decision tree, and a Random Forest. Complexity goes up from left to right.
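
Here is a minimal sketch of both models in scikit-learn, on invented data with a non-linear boundary (like the red squares and blue circles above):

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Invented data: two features, and a circular (non-linear) boundary between classes
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))
y = (X[:, 0]**2 + X[:, 1]**2 > 0.5).astype(int)

arbol = DecisionTreeClassifier(max_depth=3).fit(X, y)        # a single tree
bosque = RandomForestClassifier(n_estimators=100).fit(X, y)  # many random trees
print(arbol.score(X, y), bosque.score(X, y))                 # the forest usually scores higher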


Anyway, what is the best way to make a computer think? Well... for me, the best way is the simplest one. If we can use a linear regression instead of a Random Forest, then let's go with the regression. Now, imagine how people make climate predictions. There are several different variables, each with its own chaotic form of distribution. Therefore, it may be interesting to use a Random Forest when predicting the climate, or the Grinnellian (environmental) niche of a species. In fact, the Random Forest type of solution is part of what some call "Stochastic Statistics". Basically, we take all the trees and create an "Ensemble". From this Ensemble we can, for example, predict, from different perspectives, whether a species could be found in a biome such as the Cerrado or the Amazon in the coming years. This is why some professors call this "Environmental or Ecological Ensemble" a "BioEnsemble" (I have a tutorial on how to do a BioEnsemble here: https://rfunctions.blogspot.com/2021/04/bioensembles-r-sdm-package-modelling.html).


Partitioning and mapping uncertainties in ensembles of forecasts of species turnover under climate change (wiley.com)

Well, "BioEnsembles" are being developed and/or tested since 2000s by these guys (and many others).


The theory behind AI has been developed since the late 1950s, when the Perceptron model was created by a psychologist (Frank Rosenblatt) to "emulate" how a neuron works. In this model, we have some input values (x1, x2, x3), each one with its own weight (w1, w2, w3). From these, a chosen function "magically" transforms the inputs and weights into an output value (e.g., 0 or 1). When we add multiple layers, each one with its own inputs and outputs, we call the model an Artificial Neural Network.


Perceptron model used in AI studies.


An example of Artificial Neural Network.


A very cool image that compares a neuron to a perceptron.
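
In code, a perceptron is only a few lines. Here is a minimal numpy sketch, with a step function playing the role of the "magic" function (the inputs and weights below are arbitrary):

import numpy as np

def perceptron(x, w, b):
    # Weighted sum of the inputs, then a step function: output 1 or 0
    return 1 if np.dot(x, w) + b > 0 else 0

x = np.array([0.5, -1.0, 2.0])    # inputs x1, x2, x3 (arbitrary)
w = np.array([0.8, 0.2, 0.4])     # weights w1, w2, w3 (arbitrary)
print(perceptron(x, w, b=-0.5))   # prints 1 (the weighted sum, 1.0, is above 0.5)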


Ok, now let's go into Python and develop our own model to predict whether images correspond to dogs or cats!


First, let us open the Colab website (https://colab.research.google.com/) to use Python on Google's computers connected through the internet (the cloud).

Create a new notebook. Then go to "Runtime" (or "Execution environment") and then "Change runtime type". Choose Python 3 and the T4 GPU. This will let you work with the aid of a graphics card (one in the cloud, not yours).
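
To confirm that the GPU was really assigned, you can run the cell below; it should list one GPU device:

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))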

Now, begin editing your notebook (if you are used to Python, it is similar to Jupyter Notebooks from Anaconda). First, let us load some packages.

import tensorflow as tf             # deep learning framework
import tensorflow_datasets as tfds  # ready-to-use datasets (including cats_vs_dogs)
import math
import matplotlib.pyplot as plt     # plotting
import cv2                          # image resizing and color conversion
import numpy as np                  # numerical arrays
from PIL import Image               # reading image files

Now, take a look at the website corresponding to the dataset we are going to use. This dataset has thousands of pictures of cats and dogs.

https://www.tensorflow.org/datasets/catalog/cats_vs_dogs

We will now load these data and show some examples:

# Load the dataset as (image, label) pairs, together with its metadata
datos, metadatos = tfds.load("cats_vs_dogs", as_supervised=True, with_info=True)
# Show a few example images from the training split
tfds.show_examples(datos["train"], metadatos)
Examples of pictures of cats and dogs.


Now we will transform all images so that they follow the same pattern: we will make them 100x100 pixels and remove the color (keeping only levels of gray).

plt.figure(figsize=(20,20))

TAMANO_IMG = 100   # target size: 100x100 pixels

# Preview the first 25 training images after resizing and converting to grayscale
for i, (imagen, etiqueta) in enumerate(datos["train"].take(25)):
  imagen = cv2.resize(imagen.numpy(), (TAMANO_IMG, TAMANO_IMG))
  imagen = cv2.cvtColor(imagen, cv2.COLOR_BGR2GRAY)
  plt.subplot(5, 5, i+1)
  plt.imshow(imagen, cmap="gray")
Examples of standardized pictures of cats and dogs.

Ok, now let's prepare and normalize the data.

datos_entrenamiento = []
# Resize, convert to grayscale, and add a channel dimension to every image
for i, (imagen, etiqueta) in enumerate(datos["train"]):
  imagen = cv2.resize(imagen.numpy(), (TAMANO_IMG, TAMANO_IMG))
  imagen = cv2.cvtColor(imagen, cv2.COLOR_BGR2GRAY)
  imagen = imagen.reshape(TAMANO_IMG, TAMANO_IMG, 1)
  datos_entrenamiento.append([imagen, etiqueta])

X = []
y = []

# Separate the images (X) from their labels (y)
for imagen, etiqueta in datos_entrenamiento:
  X.append(imagen)
  y.append(etiqueta)

# Normalize pixel values from 0-255 to 0-1
X = np.array(X).astype(float) / 255
y = np.array(y)

And this will be our model (see the "sigmoid"? It means the final layer works like a Logistic Regression on the features learned by the convolutional layers):

modelo = tf.keras.models.Sequential([

    # Convolutional layers extract visual features; pooling reduces the image size
    tf.keras.layers.Conv2D(32, (3,3), activation="relu", input_shape=(100,100,1)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(128, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),

    # Flatten the features and classify: the sigmoid outputs a value between 0 and 1
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid")
])
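
If you want to inspect the resulting architecture (the layers, their output shapes, and the number of trainable parameters), you can print a summary:

modelo.summary()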

This will tell the model how to learn (the optimizer to use, the loss function to minimize, and the metric to report):

modelo.compile(
    optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
)        

Now let us fit (train) the model! 85% of the data is used for training, while 15% is held out for validation (testing).

modelo.fit(
    X, y, batch_size=32, validation_split=0.15, epochs=50
)

Ok, now this might take a few minutes. You will end up with a trained model that can be used to classify new images as cats or dogs.

def categorizar(ruta):

  # Open the image, convert it to grayscale ('L'), and normalize to 0-1
  img = Image.open(ruta)
  img = img.convert('L')
  img = np.array(img).astype(float)/255

  # Resize to 100x100 and add batch and channel dimensions before predicting
  img = cv2.resize(img, (100,100))
  prediccion = modelo.predict(img.reshape(-1, 100, 100, 1))
  return prediccion[0][0] > 0.5   # > 0.5 means "dog" (True); otherwise "cat" (False)

This is our prediction function!


Download these images to your desktop and rename them to simple names, like cat1, cat2, dog1, dog2:

Now, do you see the left tab ("Files") that can be opened? Drag and drop the four images (cats and dogs) you just downloaded into this tab.


Finally, in ruta (Spanish for "route" or "path"; in this case, the path to the image file), write "/content/dog1.jpg" or "/content/cat1.jpg".

ruta="/content/dog1.jpg"
prediccion=categorizar(ruta)
print(prediccion)        

You will see that cats correspond to "False", while dogs correspond to "True".


Great, you now have a basic understanding of Artificial Intelligence. You know that Python is a good programming language for Machine Learning, you know what a perceptron and an artificial neural network are, you know about the Google Colab website and how to use Python in it, and you know an easy workflow to create a model that differentiates between two types of things.


That's it for now!

Till next time,

