In Part I of this series we imagined a world where AI tools exist to create images, text, code, and even strategy. It's a world we are already in, and it is accelerating faster than almost any software innovation to date. Unfortunately, in my experience working with enterprise companies, the adoption of AI tools is not keeping pace. I strongly suspect that's due to the overwhelming amount of information that is not built into existing workflows.
It's my opinion that the best way to keep up with what is possible is to start by understanding the very basics of what makes AI possible. Consider this the bare minimum a designer or design leader needs to know. (Note: this is intentionally oversimplified.)
- Machine Learning (ML) - the math that makes AI possible. ML has been around a long while, but the computing power to apply it to billions and trillions of data points is fairly new. The other thing to know is that all of the structured data in social media is what enabled the major advances of the last two years.
- Prompts - simply put, this is how you 'prompt', or ask, an AI for something. It is usually text input in a conversational format, e.g. 'Summarize this research in three bullet points.' Prompt engineering is the term applied to the craft of writing prompts that get an AI to give the best results.
- Stable Diffusion - the best-known example of a diffusion model, the technique that makes text-to-image generation possible (a sketch after this list shows one in use). There is also a sub-category called 'style transfer', which makes it possible to recreate an image in the style of another. Most people probably associate Midjourney or Adobe Firefly with this today.
- Large Language Models (LLMs) - if machine learning is the math, large language models are what use that math to predict the next most logical word in a response to a prompt (a toy sketch after this list illustrates the idea). It's important to know that there are lots of 'models' being trained or 'fine-tuned' for different purposes. LLMs are the intellectual property of the companies that build them. Think of each company's model the way you'd compare Google's and Microsoft's search results: the underlying data and machine learning (the math) determine the quality of the results.
- Generative Artificial Intelligence (GenAI) - is all the rage right now. LLMs are the engine behind GenAI, and it's what we know as ChatGPT, Claude, Perplexity, and many others. All of the GenAI companies provide APIs so you can build your own products on their technology. The product they are selling is referred to as 'inference': in other words, technology that uses their LLM to 'infer' the meaning of your prompt and predict, word by word, the best response to it (a minimal sketch of calling one of these APIs appears after this list).
- Retrieval Augmented Generation (RAG) - this might be the most important term for designers to understand. LLMs, and GenAI as a result, are trained on MASSIVE amounts of data - but they aren't trained on YOUR data. As a creative designer, you'll most likely want an AI that mimics you - the way you speak and write, or your visual style. You probably want a partner that can help you, not compete with you. RAG is basically a library of your own data that you tell the AI to base its predictions on (a bare-bones sketch of the retrieval step appears after this list). It's the best of both worlds: the massive data in the LLM makes the AI 'smart', while the selective data in a RAG makes it personal. (Note: RAG focuses on text. There are other terms for how this applies to images, but don't worry about those unless you want to. It's enough to know that an AI can be focused on what you want.)
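To make a few of these terms concrete, here are some short code sketches. First, text-to-image generation with a diffusion model. This is a minimal sketch assuming the open-source Hugging Face diffusers library and a machine with a GPU; the model id is just one public example, not a recommendation.

```python
# Minimal text-to-image sketch using a public diffusion model.
# Assumes `pip install diffusers torch` and a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # one public example model
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# The prompt is plain text; the pipeline returns generated images.
image = pipe("a watercolor illustration of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```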
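Next, the 'predict the next most logical word' idea behind LLMs. This toy sketch is not a real model - a real LLM scores every word in a huge vocabulary with a neural network - but it shows the shape of the prediction step.

```python
# Toy next-word prediction. A real LLM learns these probabilities
# from trillions of words; here they are hand-written for illustration.
next_word_probs = {
    "the cat sat on the": {"mat": 0.62, "sofa": 0.21, "roof": 0.17},
}

def predict_next(context: str) -> str:
    """Return the highest-probability next word for a known context."""
    probs = next_word_probs[context]
    return max(probs, key=probs.get)

print(predict_next("the cat sat on the"))  # -> "mat"
```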
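Third, what buying 'inference' through an API looks like in practice. This sketch assumes the OpenAI Python SDK with an API key set in your environment; the model name is illustrative, and other providers' APIs follow the same prompt-in, response-out pattern.

```python
# Sending a prompt to a GenAI provider and getting a response back.
# Assumes `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative; any chat-capable model works
    messages=[
        {"role": "system", "content": "You are a concise writing assistant."},
        {"role": "user", "content": "Give me three taglines for a design studio."},
    ],
)

print(response.choices[0].message.content)
```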
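Finally, the retrieval step at the heart of RAG. For readability this sketch scores documents by naive word overlap instead of real embeddings; production systems use an embedding model and a vector database, but the flow - retrieve your data, then ground the prompt in it - is the same.

```python
# Bare-bones RAG: retrieve the most relevant piece of YOUR data,
# then ground the model's prompt in it. Word overlap stands in for
# real embedding similarity here, purely for illustration.
documents = [
    "Our brand voice is warm, plainspoken, and never uses jargon.",
    "The logo may only appear in navy or white on approved backgrounds.",
    "The Q3 campaign focuses on first-time customers in the Midwest.",
]

def retrieve(question: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

question = "what colors can the logo appear in"
context = retrieve(question, documents)

# The retrieved context is prepended, so the answer is grounded in
# your data rather than only the model's training data.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```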
Was that a lot? It felt like a lot, but it's important groundwork for Part III, where we will talk about the roles that AI can play: things like an Assistant, a Strategist, and a Creative Partner. Knowing these terms will help you know what to use in each scenario.
Thank you. Please read Part I and Part III if you haven't already. Part IV is coming soon and will detail some of the AI tools, and likely AI features, that UX can use.
If you want to be sure to see future installments, follow me on LinkedIn.