I created my first AI and here is what I learned
leonardo.ai prompt: A cat and a dog looking curiously into the camera, fisheye lens, photorealistic

In my previous post, I shared my intention to learn AI. I aim to continue learning and, more importantly, understand how best to manage AI efforts and projects. In this article, I will share what I created and my [current] understanding of how it works.

Is this a cat or a dog?

Partial search results for "dogs" on my iPhone

If you own a modern phone with a camera, you can open the photos app and search for things like "house," "lake," "cat," or "dog," and the device will dutifully display the expected results.

Nowadays, we take a feature like that for granted, but just a few years ago, it seemed like magic. That left me wondering: "How does my phone do that?" I set out to learn.

Kaggle says what?

When you dig in to learn how AI works, you will be inundated with advanced mathematical terminology and curious website names. A popular site for learning, using, and understanding AI is called Kaggle.

Among other things, Kaggle offers up machine learning scenarios, usually in the form of competitions. You can win real money and prizes in some, but you can also sit back and learn, like me.

A cat and a dog are intently staring at each other. This image is taken from the cats-vs-dogs Kaggle challenge website page.

In the older but still timely Dogs-vs-Cats competition, you are provided with photos representing cats and dogs. Then, you use those photos to create an algorithm that trains a machine learning model. Finally, you test the algorithm by feeding your AI photos of dogs and cats to determine if it can accurately answer the question: "Is this a cat or a dog?"

For the purposes of this article, I use the terms "algorithm," "machine learning model," and "AI" interchangeably.

I recommend checking out the excellent videos on the @sentdex YouTube channel if you want to code along on a self-paced training course.

How do you build an AI?

At a high level, here are the steps you have to take when creating the AI:

  1. Collect and categorize the training data. In this case, the training data are photos of cats and dogs. I also had to categorize the files so the model knows which pictures are cats and which are dogs (a minimal code sketch for this step follows the list).
  2. Use machine learning to train a model. Think of this as something akin to a database that can recognize patterns in the training data.
  3. Test the model. Upload a photo of an animal, and the AI determines whether the animal is (a) a cat or (b) a dog.
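
To make that first step concrete, here is a minimal Python sketch of how you might collect and label the photos. It is not my exact code; it assumes the Kaggle-style file naming where every photo starts with "cat" or "dog" (like cat.0.jpg and dog.0.jpg) and sits in a single train folder:

```python
# A sketch of step 1: collect and label the training photos.
# Assumes the Kaggle-style layout where every file name starts with
# "cat" or "dog" (for example, cat.0.jpg or dog.0.jpg) in one folder.
import os
import numpy as np
from PIL import Image

IMG_SIZE = 128        # resize every photo to 128x128 pixels
TRAIN_DIR = "train"   # hypothetical folder holding the 25,000 photos

def load_dataset(folder: str):
    images, labels = [], []
    for filename in os.listdir(folder):
        if not filename.lower().endswith(".jpg"):
            continue
        # The label comes from the file name: "cat..." -> 0, "dog..." -> 1
        label = 0 if filename.startswith("cat") else 1
        img = Image.open(os.path.join(folder, filename)).convert("RGB")
        img = img.resize((IMG_SIZE, IMG_SIZE))
        images.append(np.asarray(img, dtype="float32") / 255.0)  # scale pixels to 0..1
        labels.append(label)
    return np.array(images), np.array(labels)

X, y = load_dataset(TRAIN_DIR)
print(X.shape, y.shape)   # for example: (25000, 128, 128, 3) (25000,)
```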

There is no dog

A famous scene from the movie The Matrix. A child teaches the protagonist, Neo, to realize the truth: "there is no spoon."

Let's pretend I am an adult who has never seen a dog. You take me to a dog park, point at the animals playing inside the fenced-in area, and say, "Those are dogs, Bill!" I study them for a few minutes, and somehow, my brain knows what dogs look like.

As we continue walking, I see other dogs that weren't in the dog park, like German Shepherds, Bulldogs, and Chihuahuas. Generally speaking, I will recognize them as dogs, too. I might be wrong occasionally, but my brain is smart enough to get it right more often than not.

Computers do not know what a dog looks like, so you must first train them, and that is a bit more complicated than you might think.

Please take a look at the following photo and think about what you see:

A picture of a dog sitting on the grass with a ball in front of it

I see a happy black and brown dog chilling in a backyard, protecting a ball.

But what does the computer see? It is more likely something like the following image:

A stylized image of the previous dog photo highlighting the patterns a computer may see in the picture.

The machine learning model found patterns, not a dog. For example, it found patterns of houses, fences, telephone poles, grass, and the dog. Later, when you have built your AI, the model can read a new photo of a dog it has never seen and attempt to match its patterns against the photos it was trained on.

There is, of course, much more to it than that. However, for this learning conversation, we can focus on patterns.

Now that we know the computer sees patterns and not actual animals, look at the following four images and ask yourself: "What are the chances the AI will figure out any of these are dogs, assuming I only have that one picture of the dog in the backyard?"

Four photos of dogs in various poses.

If we asked the AI to do its absolute best, it might look at the top-left photo and say, "Yup, this is a dog." Why? Because it can look at the training image and see similarities in patterns, like those floppy ears, the nose, and the open mouth.

But what about that chain link fence in front of the dog? That does not fit a pattern. Maybe when I send the photo, the AI thinks the chain link fence is the primary characteristic and cannot make out the pattern of the dog's face because of the obstruction.

An example of how an AI algorithm may attempt to determine whether a photo is of a dog.

Feed me, Seymour (training the model)

A scene from the movie Little Shop of Horrors where a plant has an insatiable appetite.

While my grown-up mind can see a few dogs for the first time and then accurately guess which other animals are dogs, AI cannot do that. Instead, it requires training data -- and lots of it.

For a machine learning model to intelligently identify a photo containing a cat or a dog, it must know a lot about the patterns of cats and dogs. For example, it needs patterns of eyes, noses, facial structures, tails, and paws, to name a few. It will also need to know positional patterns, like sitting on the ground, standing, jumping in the air, chasing a laser, and so on.

For my little exercise, I used 25,000 photos; still, the AI was wrong plenty of times when I tested it. Of course, I needed to fine-tune the algorithm (more on that later in this article). But ideally, I would have thousands or even millions more images to create a functionally "intelligent" AI.

Another significant element is cleansing your data, which is too large of a topic to discuss here. I will likely cover that in a future article.

Building the model

Leonardo.ai: A person writes advanced and confusing software code they have never worked with before. Colorful and bright. Notice the third arm? This will be a theme as you read on!

You might be surprised to learn that the custom software for my cats-and-dogs machine learning model was only about 80 lines of code. I know I was!

However, those lines of code are unlike anything I've seen before.

I have written code that opens, edits and saves photos. I have written code that performs basic math functions. I even wrote a commercial software product that generates eBooks in various publishing formats. So yes, I know my way around programming languages, but I have never written AI code, which is very different.

Even if you have never written software code, you are likely familiar with basic coding concepts, like `if..then..else` logic, creating variables, and the like.

Of course, all those programming skills are still needed. However, when you start writing machine-learning code, you are presented with esoteric terms like transformers, binary_crossentropy, relu, layers, sigmoid, and epochs.
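
To give you a flavor of where those terms show up, here is a simplified sketch using TensorFlow/Keras (the library featured in the sentdex videos). It is not my exact 80 lines of code, but it shows how layers, relu, sigmoid, binary_crossentropy, and epochs fit together, reusing the X and y arrays from the earlier loading sketch:

```python
# A sketch of step 2: build and train a model with TensorFlow/Keras.
# This is an illustration, not my exact code. X and y come from the
# loading sketch earlier in the article.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),         # 128x128 color photos
    layers.Conv2D(32, 3, activation="relu"),   # "relu" layers learn simple patterns
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),   # deeper layers learn bigger patterns
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # "sigmoid" output: 0 = cat, 1 = dog
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",   # the loss function for a two-class problem
    metrics=["accuracy"],
)

# "epochs" is how many times the model sees the full training set.
model.fit(X, y, validation_split=0.1, epochs=10, batch_size=32)
```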

I have read two machine-learning books and am still working to grasp these and many other terms and why they matter. However, it is promising that someone like me, with no experience creating AI, can sit down and write something that seemed well out of reach before I started this journey just a few weeks ago.

Testing the model

Leonardo.ai: A female data scientist performing tests and validating data on a screen (I have no idea why it added a third hand).

Once you build the model, you can feed it pictures of cats and dogs and see the results. My first experiment was to provide the same 25,000 photos I trained the model on to see how well it could do. The AI was about 60% accurate.

To remind you, the model is not storing copies of the photos. Instead, the AI looks for patterns (and relationships between patterns, but that is off-topic for this article). That is why I can feed my AI a new picture it has never seen before, and it will take a guess at what it is seeing.
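
Here is a minimal sketch of that testing step, again assuming TensorFlow/Keras plus the model and IMG_SIZE from the earlier sketches (the file name is just a placeholder):

```python
# A sketch of step 3: ask the trained model about one new photo.
# Assumes the model and IMG_SIZE from the earlier sketches; the file
# name below is just a placeholder.
import numpy as np
from PIL import Image

def classify(path: str) -> str:
    img = Image.open(path).convert("RGB").resize((IMG_SIZE, IMG_SIZE))
    batch = np.asarray(img, dtype="float32")[np.newaxis] / 255.0  # shape (1, 128, 128, 3)
    score = float(model.predict(batch)[0][0])                     # sigmoid output, 0..1
    return "dog" if score >= 0.5 else "cat"

print(classify("my_new_photo.jpg"))
```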

As you can see in this somewhat fuzzy image, the AI thinks this dog is a cat:

A picture of a dog. But, as you can see at the bottom, my AI thought it was a cat. The AI was only about 60% accurate on my first try.

I also fed the AI a picture of a cat, and guess what? It recognized it as a cat!

A picture of a cat. At the bottom, you can see that the AI correctly identified it as a cat!

What I learned: the critical role of data scientists and experienced software engineers

leonardo.ai: My attempt at generating someone that looks like me (big fail, BTW): "Adult white male wearing a green tweed flat cap and glasses, sitting in front of a laptop, looking at the screen intently, vibrantly colored indoor lighting, detailed, digital painting, cinematic, hyperrealistic." That mysterious third hand makes yet another appearance!

Let's say you and I are building two competing AI systems. We have the same training samples of 1 million photos of cats and dogs.

I hire some developers to write code using known practices found on the web. Conversely, you hire data scientists and highly skilled software engineers who work tirelessly to fine-tune the algorithms and methods to predict whether a photo is a cat or a dog.

Because you spent the time (and presumably more money) to become an expert on your data, your model will most likely win in any head-to-head competition. And because you hired the team for the long term, they will continue to fine-tune and improve your model over time. Each improvement can mean more value to your customers.

Look at the following images of a dog, each generated from the same prompt: "A dog, hyperrealistic":

Images from four popular AI image-generation tools (Stable Diffusion, CrAIyon, Leonardo.ai, and DALL-E) using the prompt: "A dog, hyperrealistic"

I am not trying to judge which image is better or worse. Nor am I suggesting that one product is better than another. I am also not trying to suggest that there are better or worse engineers or data scientists on one team over another.

However, as a consumer of AI products, there are some I prefer to use -- and am willing to pay for -- over others.

You could use generalized, pre-built AI tools in some situations. However, to build AI into your core product, you should think long and hard about your objectives, whom you hire, their skill sets, and what value you want to receive from your budget.

AI and your teams

An AI-generated image of smiling team members.
leonardo.ai: my prompt was "Flat avatar people work very topology face, smiling"

When discussing software products and agility, we use terms like DevOps or DevSecOps, where we tightly couple small teams of software development, quality assurance, operations, and security people.

To keep teams small and efficient, you might consciously or unconsciously exclude certain people from them. For example, content editors who write documentation, user interface designers, and others are often treated as shared resources across teams. Unless carefully managed, that separation from a specific team can easily lead to sub-par user experiences and consistency issues in product design.

We are in a golden age of digital transformation. To stay ahead, we not only need great products, but we also need to embrace what is new. It is time to consider the importance of AI expertise in your teams.

That expertise can come in many forms:

  • Data scientists who will unlock new ways to use data, train and fine-tune models, and collaborate with software engineers to integrate AI into the product
  • Software engineers who create performant code that incorporates the algorithms and models defined by the data scientists
  • Visionaries who can rethink your product in ways that give you a highly competitive leap ahead

Of course, I have more thoughts on that and will fine-tune them as I learn more in this AI journey, but I hope you learned a little something along the way.
