LLM And Hallucinations

I'm a bit surprised that the term "hallucinations" has been used to describe the output of LLMs when they're performing out-of-sample predictions. Using the word hallucinations to describe this phenomenon obfuscates what really happens within these neural networks.

Conceptually, an LLM is an autoencoder; to be more specific, they're a type of Variational Autoencoder (VAE). Essentially, an autoencoder is a network that finds a lower-dimensional representation of some input dataset, in this case a corpus of text or a series of images, that can be used to recreate the original dataset with some level of fidelity. Generally, the hope is that this lower-dimensional representation is a bit more amenable to the downstream tasks we care about.

One of the interesting things about using a neural network, and more specifically an autoencoder, is that you can obtain a lower-dimensional representation of your data and then use the decoder portion of the network to reconstruct a reasonable facsimile of the original data. In some ways this can be thought of as a lossy compression technique.
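
As a rough illustration of this encode/decode idea (and of the kind of encoder the classifier code further down relies on), here is a minimal Keras sketch of a dense autoencoder on MNIST. The architecture, the 25-dimensional bottleneck, the preprocessing, and the name `embedding` for the encoder half are my assumptions; the article doesn't show how its own encoder was built.

import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical

# Flatten the 28x28 images to 784-length vectors and scale to [0, 1].
(x_train, y_train_int), (x_test, y_test_int) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
y_train = to_categorical(y_train_int, 10)   # one-hot labels for the classifier later
y_test = to_categorical(y_test_int, 10)

# Dense autoencoder with a 25-dimensional bottleneck (size chosen to line up
# with the Input(shape=(25,)) of the classifier code below).
inp = Input(shape=(784,))
code = Dense(128, activation="elu")(inp)
code = Dense(25, activation="elu")(code)              # the lower-dimensional representation
out = Dense(128, activation="elu")(code)
out = Dense(784, activation="sigmoid")(out)

autoencoder = Model(inp, out)
embedding = Model(inp, code)                          # encoder half, reused as `embedding` below
autoencoder.compile(loss="mse", optimizer="adam")
autoencoder.fit(x_train, x_train, epochs=10, batch_size=128,
                validation_data=(x_test, x_test))

# Lossy "compression": 784 pixels -> 25 numbers -> 784 pixels again.
reconstruction = autoencoder.predict(x_test[:1])
print("reconstruction MSE:", float(np.mean((reconstruction - x_test[:1]) ** 2)))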

The second, more important part that gives LLMs their flavor of "magic" is taking the autoencoder and then applying Reinforcement Learning from Human Feedback (RLHF), which allows the embedding to be adjusted so that similar topics appear together. This is what allows code generation to work: you take two sets of text that under a naive embedding are not associated with each other and tweak the network so that they will be.
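
As a toy illustration of that last point (a stand-in for the concept, not actual RLHF), the sketch below fine-tunes a small shared encoder so that paired inputs which start out unrelated in embedding space get pulled together. The encoder, the random stand-in data, and the cosine-similarity loss are all illustrative assumptions.

import numpy as np
import tensorflow as tf

# Toy stand-in for the idea above (NOT actual RLHF): nudge a shared encoder so
# that paired inputs which start out unrelated in embedding space end up close.
dim_in, dim_emb = 64, 8
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(dim_in,)),
    tf.keras.layers.Dense(32, activation="elu"),
    tf.keras.layers.Dense(dim_emb),
])

# Hypothetical paired data, e.g. a description and the code that implements it,
# represented here by random feature vectors purely for illustration.
x_text = np.random.rand(256, dim_in).astype("float32")
x_code = np.random.rand(256, dim_in).astype("float32")

opt = tf.keras.optimizers.Adam(1e-3)
for step in range(200):
    with tf.GradientTape() as tape:
        e_text = tf.math.l2_normalize(encoder(x_text), axis=1)
        e_code = tf.math.l2_normalize(encoder(x_code), axis=1)
        # Pull each matched pair together by maximizing cosine similarity.
        loss = -tf.reduce_mean(tf.reduce_sum(e_text * e_code, axis=1))
    grads = tape.gradient(loss, encoder.trainable_variables)
    opt.apply_gradients(zip(grads, encoder.trainable_variables))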

The reason I believe the hallucination problem is "baked into" LLMs is the interplay of a few factors:

  1. The autoencoder can generate an output given an arbitrary input
  2. We don't know where in the embedding space sufficient amounts of data are available
  3. The decision boundaries of neural networks are good at separating classes, but they don't tell you where the region supported by real data ends.

To illustrate this, I trained a neural network to predict the digits 0-9 from the MNIST dataset. This model was > 99% accurate on the MNIST dataset with a loss below 6.7×10⁻⁵, so at least on its original task it's about as accurate as the data allows.

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

epochs = 50
batch_size = 32

# Classifier that works on the 25-dimensional embedding rather than on raw pixels.
input_layer = Input(shape=(25,))
x = Dense(30, activation="elu")(input_layer)
x = Dense(20, activation="elu")(x)
output_layer = Dense(10, activation="softmax")(x)
prediction_model = Model(input_layer, output_layer)
prediction_model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
prediction_model.summary()

# `embedding` is the encoder that maps the flattened 784-pixel MNIST images to
# 25-dimensional codes; y_train / y_test are one-hot encoded labels.
x_train_embedding = embedding.predict(x_train)
x_test_embedding = embedding.predict(x_test)

prediction_model.fit(x_train_embedding, y_train, batch_size=batch_size, epochs=epochs,
                     validation_data=(x_test_embedding, y_test))

Rather than using an autoencoder to generate inputs, I just played with the input directly. I tried to find the pixels that would give the highest probability of being predicted as the digit "7". I did this by using a pass-through layer that fed 1's into the network and optimizing the weights of this pass-through layer to maximize that probability; since the input is a constant 1, the optimized weights themselves become the image. The resulting image looks nothing like a 7, but when fed through the original MNIST classifier shown above it returns, with very high probability, that it is a 7 and nothing else.
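
Here is a sketch of that trick as I understand it (my reconstruction, not the article's code). Because the pass-through layer receives a constant input of 1, optimizing its weights is equivalent to optimizing the pixels directly, so the sketch treats the 784 pixels as trainable variables and runs gradient ascent on the frozen classifier's probability for the digit 7; it assumes the `embedding` and `prediction_model` objects from the code above.

import tensorflow as tf

# With a constant input of 1, the pass-through layer's weights ARE the image,
# so just optimize the pixels directly against the frozen models.
pixels = tf.Variable(tf.random.uniform((1, 784)))     # the "weights" that become the image
target_class = 7
opt = tf.keras.optimizers.Adam(0.05)

for step in range(500):
    with tf.GradientTape() as tape:
        img = tf.clip_by_value(pixels, 0.0, 1.0)
        code = embedding(img)                          # frozen encoder from above
        probs = prediction_model(code)                 # frozen classifier from above
        loss = -tf.math.log(probs[0, target_class] + 1e-9)   # maximize P(digit = 7)
    grads = tape.gradient(loss, [pixels])
    opt.apply_gradients(zip(grads, [pixels]))

adversarial_image = tf.clip_by_value(pixels, 0.0, 1.0).numpy().reshape(28, 28)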

According to my classifier above, this image is a 7


To better understand why this is the case, we can plot the output of the embedding layer. The test point, while it isn't wildly off in the middle of nowhere, lies close to a decision boundary and is also fairly far from the "central" cluster of 7s, or of any digit for that matter. But the decision boundary alone doesn't tell you whether real data can or can't lie there.

Embedding of the MNIST test set, plus the test point that was classified as a 7
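
A sketch of one way such a plot could be produced, assuming the 25-dimensional embeddings are projected to two dimensions with PCA and that the labels are one-hot encoded (the article's actual plot may have been made differently):

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# adversarial_image comes from the pass-through sketch earlier.
codes_test = embedding.predict(x_test)
code_adv = embedding.predict(adversarial_image.reshape(1, 784))

# Project the 25-d embeddings down to 2-D for plotting.
pca = PCA(n_components=2).fit(codes_test)
xy = pca.transform(codes_test)
xy_adv = pca.transform(code_adv)

labels = y_test.argmax(axis=1)                         # assumes one-hot labels
plt.scatter(xy[:, 0], xy[:, 1], c=labels, s=5, cmap="tab10", alpha=0.5)
plt.scatter(xy_adv[:, 0], xy_adv[:, 1], c="black", marker="x", s=120, label="optimized '7'")
plt.legend()
plt.title("MNIST test set in embedding space, plus the optimized point")
plt.show()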


This indirectly suggests why Retrieval-Augmented Generation (RAG) works so well: rather than just generating a solution that satisfies your network, you anchor the output near a document of "real data" instead of taking any arbitrary point in your space. In my example, the test point, though it is inside a classification boundary and will get you a 7, is far from the central tendency of all the points that actually are 7s.
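
In the MNIST analogy, that anchoring step could look something like the following sketch, which snaps a suspect point to its nearest real training example in embedding space (the variable names carry over from the sketches above and are my own):

from sklearn.neighbors import NearestNeighbors

# Instead of trusting an arbitrary point in embedding space, snap it to its
# nearest real training example and use that example's known image and label.
codes_train = embedding.predict(x_train)
nn = NearestNeighbors(n_neighbors=1).fit(codes_train)
dist, idx = nn.kneighbors(code_adv)                    # code_adv from the sketch above
print("distance to nearest real point:", dist[0, 0])
anchored_image = x_train[idx[0, 0]]                    # a real example, not the optimized one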

RAG, or finding the closest point in your embedding that gives you a 7, is one way of dealing with this problem. Other ways include fitting a Gaussian mixture model on the embedding layer so you can define a boundary around your known data, and/or training with random garbage data that is not supposed to classify to anything.
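
A sketch of the Gaussian-mixture idea under the same assumptions: fit a density model on the embeddings of the real training data and refuse to trust any point whose likelihood falls below a chosen threshold.

import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a density model on the embeddings of real training data and flag points
# whose likelihood is suspiciously low (the threshold here is arbitrary).
gmm = GaussianMixture(n_components=10, covariance_type="full", random_state=0)
gmm.fit(codes_train)

threshold = np.percentile(gmm.score_samples(codes_train), 1)   # bottom 1% of real data
if gmm.score_samples(code_adv)[0] < threshold:
    print("Point lies outside the region supported by real data -- don't trust the '7'")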

However, the pernicious part of the problem is that the boundaries within a neural network need not be convex, so you don't know at what point the decision boundaries break down. While the proposed remedies above might make the problem less apparent, they don't fully solve it. It's entirely possible that your underlying embedding looks like a Swiss roll, where a second point that is arbitrarily close in the ambient space represents a different topic entirely.

Now, this realization doesn't mean that I don't believe the LLM hype. I do think they represent a significant leap forward in our ability to do NLP and as such are incredibly useful tools. On the flip side, having thought through the problem, I do have concerns about the training data that goes into these LLMs, and perhaps, rather than focusing on ever-bigger models, the competitive advantage will come down to who has curated their data better.
