Quick Intuition for Understanding GenAI by Thinking in Dimensionality
Introduction: How to Learn AI for Non-Data Scientists - Bottom-Up or Top-Down?
I recently saw a short video. The gist went something like this:
If you're new to AI and want to learn about it, you've likely encountered two learning approaches: bottom-up, where you start with the underlying math and theory and work your way up to applications, and top-down, where you start using the tools and dig into the theory only as you need it.
This resonated with me. Looking back, I did a bit of both. Like most of us, I don’t have a Ph.D. in applied math or information theory. I spent a couple of years flunking through physics in undergrad before switching to English, though I did take courses like multivariable calculus and differential equations. But that was decades ago, and I haven’t touched those books since. So, I’m no math whiz. For me, the top-down approach makes a lot more sense—and it's far less frustrating.
Learning AI today is a bit like learning to drive a car in the early 20th century. Back then, driving was a highly technical job, as drivers had to know how clutches and carburetors worked together (and there were no automatic transmissions). However, they didn’t need to be engineers, because cars were reliable enough for practical use. Similarly, today’s AI tools don’t require you to be a mathematician—you just need to know enough to get things working.
Take, for example, the seminal 2017 paper, Attention Is All You Need. It wouldn’t have been possible without earlier mathematicians like David Hilbert (whose work on vector spaces and inner products underpins embeddings) and Joseph Fourier (of Fourier transform fame).
Learning all of these foundational theories would take years. But if you focus on key areas that matter to you, gaining a practical understanding is entirely doable.
Personally, I’ve got a decent intuition for embeddings, but the key/query/value part was challenging—so I spent more time reviewing vector spaces and inner products. On the other hand, the Fourier transform is complex, but I consider it less critical because PyTorch handles it for you.
Here are some quick tips that worked for me: focus on the areas that matter to your work, drill deeper wherever your intuition is weak, and skip the parts the frameworks already handle for you.
In this article, we’ll focus on dimensionality.
Dimensionality in Everyday Life
Remember high school algebra? The x, y, and z axes (1st, 2nd, and 3rd dimensions), and maybe the 4th dimension—popularized by shows like Star Trek and movies like Back to the Future. But beyond that, what do dimensions have to do with everyday life?
Here are two examples:
Example 1: Imagine flat images of disks—some are perfect circles, others are ellipses. In just two dimensions (x and y), they’re hard to interpret. But if we add a third dimension (the z-axis) and imagine the disks as saucers rotating in space, the flat shapes turn out to be mere shadows. Suddenly, everything becomes clearer. If we need to calculate how the shadows expand and shrink, it’s as simple as a sine/cosine equation.
Key takeaway: Adding a dimension often simplifies complex problems.
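To make Example 1 concrete, here’s a minimal sketch in plain NumPy (the tilt angle and point count are my own illustrative choices): it tilts a unit circle in 3D and projects it back onto the flat plane, showing that the “shadow” is an ellipse whose short axis is just cos(θ).

```python
import numpy as np

# A unit circle in 3D, lying flat in the xy-plane.
t = np.linspace(0, 2 * np.pi, 100)
circle = np.stack([np.cos(t), np.sin(t), np.zeros_like(t)])  # shape (3, 100)

theta = np.radians(60)  # tilt angle of the "saucer"

# Rotation about the x-axis by theta.
R = np.array([
    [1, 0, 0],
    [0, np.cos(theta), -np.sin(theta)],
    [0, np.sin(theta),  np.cos(theta)],
])

tilted = R @ circle
shadow = tilted[:2]  # drop z: the 2D shadow on the flat surface

# The shadow is an ellipse: semi-major axis ~1, semi-minor axis ~cos(theta).
print(shadow[0].max())  # ~1.0
print(shadow[1].max())  # ~0.5, i.e. cos(60 degrees)
```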
Example 2: German philosopher Hegel is well known for his concept of dialectics, a method of solving problems by examining opposing viewpoints (thesis and antithesis) and synthesizing them into a new, refined idea. You may have heard of it, though Marx and Nietzsche are more widely read in the U.S., despite Hegel and Kant having had a greater impact on modern science.
Within Hegel’s dialectics, there’s a concept called sublation (Aufhebung), which means both to preserve and to negate at the same time. Hegel uses the growth of a plant to explain this in Phenomenology of Spirit:
"The bud disappears when the blossom breaks through... in the same way, when the fruit comes, the blossom may be explained to be a false form of the plant’s existence, for the fruit appears as the truth in place of the blossom."
In this process: the bud is negated by the blossom yet preserved within it, and the blossom in turn is negated by the fruit yet preserved within it. Each stage is both canceled and carried forward.
Dimensionality Explained: Sublation can be thought of as "lifting" to a higher dimension, where a system integrates previous stages. For instance, human babies replace over 90% of their cells within a year or two, yet they are still considered the same person. Through this, we comprehend growth and change—similar to how AI models "transcend" different concepts by adding or reducing dimensions as they refine predictions.
Vectors & Matrices in GenAI
Vectors: A vector is a list of numbers, but you can think of it as representing both magnitude and direction in a high-dimensional space.
Matrices: A matrix is a collection of vectors that forms a shape in space, and multiplying by a matrix transforms that shape through operations like rotation, scaling, reflection, and shearing, as the sketch below shows.
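Here is a minimal NumPy sketch of that idea (the vector and angles are my own illustrative values): two classic transformation matrices, rotation and scaling, applied to a single 2D vector.

```python
import numpy as np

v = np.array([1.0, 0.0])  # a 2D vector pointing along the x-axis

# Rotation by 90 degrees and uniform scaling by 2, each expressed as a matrix.
rotate_90 = np.array([[0.0, -1.0],
                      [1.0,  0.0]])
scale_2x = np.array([[2.0, 0.0],
                     [0.0, 2.0]])

print(rotate_90 @ v)               # [0. 1.]  -- rotated to point along y
print(scale_2x @ v)                # [2. 0.]  -- stretched to twice the length
print((rotate_90 @ scale_2x) @ v)  # [0. 2.]  -- transformations compose
```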
What Is Word Embedding?
Simply put, embedding converts a word into a series of numbers (a vector). Unlike basic encoding (which merely turns characters into numeric codes), embedding packs in far more information: it captures aspects of a word’s meaning, not just its symbols.
For example, in standard encoding, "artificial intelligence" is represented as a string of numbers based on the ASCII values of its characters. But in embedding, each word is represented by a vector with hundreds or thousands of dimensions, packing much more information. Different models pack embeddings differently, with varying dimensional sizes.
Here are some examples: word2vec commonly uses 300 dimensions, BERT-base uses 768, and OpenAI’s text-embedding-ada-002 produces 1,536-dimensional vectors.
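The contrast is easy to see in code. Below is a toy sketch (the embedding vectors here are random stand-ins of my own; a real model learns these values during training):

```python
import numpy as np

text = "artificial intelligence"

# Plain encoding: one ASCII code point per character -- symbols only, no meaning.
ascii_codes = [ord(ch) for ch in text]
print(ascii_codes[:5])  # [97, 114, 116, 105, 102]

# Toy embedding: each *word* maps to a dense high-dimensional vector.
# (Random numbers for illustration; a trained model supplies real values.)
rng = np.random.default_rng(0)
vocab = {
    "artificial": rng.normal(size=300),
    "intelligence": rng.normal(size=300),
}

vec = vocab["artificial"]
print(vec.shape)  # (300,) -- 300 numbers per word vs one number per character
```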
What Is Attention?
The attention mechanism allows models to focus on the most relevant parts of an input sequence. Popularized by the paper Attention Is All You Need, which built the transformer architecture entirely around it, this concept is central to modern transformer models like GPT-3.
The equation is:

Attention(Q, K, V) = softmax(QKᵀ / √d_k) V

Here Q (queries), K (keys), and V (values) are matrices projected from the input embeddings, and d_k is the dimension of the keys; dividing by √d_k keeps the dot products from blowing up before the softmax.
In the end, you get a matrix that combines the meaning of each word with its context in the sentence.
The softmax function normalizes these comparisons into probabilities, allowing the model to focus more on important words and less on irrelevant ones.
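Putting the pieces together, here is a minimal NumPy implementation of the scaled dot-product attention formula above (the shapes and random inputs are toy choices of my own, not from the paper):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # compare every query with every key
    weights = softmax(scores)        # normalize each row into probabilities
    return weights @ V               # blend the values by relevance

# Toy example: 3 tokens, 4 dimensions per token.
rng = np.random.default_rng(42)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(attention(Q, K, V).shape)  # (3, 4): one context-blended vector per token
```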
Finally, there’s multi-head attention, which splits the process into multiple “heads,” each focusing on different parts of the input. For example, if your initial embedding size is 512 and there are 8 heads, each head processes 64 dimensions. By combining these perspectives, the model gets a more complete understanding of the input, much like the parable of the blind men each describing a different part of an elephant.
Summary of the Process:
For each head: project the input embeddings into that head’s own Q, K, and V matrices, then apply the scaled dot-product attention formula above to its 64-dimensional slice.
Finally, the outputs of all 8 heads are concatenated and passed through a final linear layer; in the full transformer, the result then flows onward (through the decoder) to generate the final result.
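Here is a sketch of just the split-and-recombine bookkeeping for 8 heads of 64 dimensions each (the learned projection matrices and the per-head attention step are omitted for brevity; the sequence length of 10 is my own toy choice):

```python
import numpy as np

seq_len, d_model, n_heads = 10, 512, 8
d_head = d_model // n_heads  # 64 dimensions per head

x = np.random.default_rng(1).normal(size=(seq_len, d_model))

# Split: (10, 512) -> (8, 10, 64), one 64-dimensional slice per head.
heads = x.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
print(heads.shape)  # (8, 10, 64)

# ... each head would run scaled dot-product attention on its own slice ...

# Concatenate: (8, 10, 64) -> (10, 512), recombining all 8 perspectives.
combined = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
print(combined.shape)  # (10, 512)
```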
Compared to earlier approaches, this process involves layer upon layer of high-dimensional matrix computations. The more dimensions a model works with, the more complexity it can capture. For now, large language models (LLMs) make this feasible using GPUs, which handle matrix operations efficiently.
Ultimately, if singularity ever happens, it will likely be in the quantum computing age - when computers can dial up and down to hundreds of thousands or even millions of dimensions.
Conclusion: Dimensionality Isn’t Intimidating – We Use It Daily!
Dimensionality is everywhere—in how we understand growth, relationships, and learning. Just as adding an extra dimension makes complex problems easier to solve, modern AI models use higher dimensions to refine their predictions. AI is constantly integrating information across dimensions to understand our world. And as daunting as these concepts might seem at first glance, we interact with them intuitively in everyday life.
So, while you don’t need to be an expert in Hilbert spaces or Fourier transforms, developing a good intuition about dimensionality can take you a long way in understanding and working with AI.