Gravity is not data.
"Gravity" is a song by John Mayer, featured on three different releases.
It is a narcissistic representation of love and connection, and a lyrical exploration of the attraction between two people, poisoned by longing, desire, and vulnerability.
Unlike Newton's gravity, which is a physical force that can be measured and calculated, Mayer's is an intangible, human force, shaped by personal experiences and failures.
It starts like this:
[Chorus]
Gravity is working against me.
And gravity wants to bring me down.
[Verse 1]
Oh, I'll never know.
What makes this man, with all the love that his heart can stand
Dream of ways to throw it all away.
So Mayer's version is, above all, closer to a form of poetics than to Newton's law.
Newton's own words, from Philosophiæ Naturalis Principia Mathematica (1):
"Every particle of matter in the universe attracts every other particle with a force that is directly proportional to the product of their masses and inversely proportional to the square of the distance between their centers."
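Plugging standard physical constants for the Earth (values from physics references, not from the article) into that universal law, F = G * M * m / R^2, recovers the familiar surface value of g:

```python
# Deriving g from Newton's universal law: g = G * M / R^2
G = 6.674e-11        # gravitational constant, N*m^2/kg^2
M_earth = 5.972e24   # mass of the Earth, kg
R_earth = 6.371e6    # mean radius of the Earth, m

g = G * M_earth / R_earth**2
print(f"g = {g:.2f} m/s^2")  # close to the familiar 9.81
```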
The gravitational force near the Earth's surface is simplified as F = m*g, where m is the mass of the object and g is the acceleration due to gravity, approximately 9.81 m/s^2.
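As a quick numeric check of that simplified formula (the 10 kg mass is an arbitrary example value):

```python
g = 9.81  # m/s^2, acceleration due to gravity near the Earth's surface

def gravitational_force(m):
    """Force in newtons on a mass m (in kg) near the Earth's surface: F = m * g."""
    return m * g

print(gravitational_force(10.0))  # force on a 10 kg mass, in newtons
```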
If we had enough data, we could infer this straight from the data itself: we could ask someone to run some experiments and write down the results.
And since we're in the ML era, according to Chris Anderson's article "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete" (2), with enough data the numbers speak for themselves.
He argued that the increasing availability of vast amounts of data renders traditional scientific methods, such as hypothesis testing and theory building, unnecessary.
He claimed that with enough data, patterns would naturally emerge, enabling direct correlation and prediction without needing underlying explanations.
Let's see if we can reach this insight, F = m*g, with some training data.
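One way to simulate those experiments is to generate synthetic (mass, force) pairs with NumPy; the mass range, sample counts, and noise level below are illustrative assumptions, since the article does not specify its dataset:

```python
import numpy as np

# Synthetic "experiments": sample masses and record F = m * g (plus a little noise)
rng = np.random.default_rng(42)
g = 9.8  # m/s^2

m_train = rng.uniform(0.0, 10.0, size=(1000, 1)).astype("float32")
F_train = (g * m_train + rng.normal(0.0, 0.05, size=m_train.shape)).astype("float32")

m_test = rng.uniform(0.0, 10.0, size=(200, 1)).astype("float32")
F_test = (g * m_test).astype("float32")
```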
We start with a sequential neural-network model built with Keras, a high-level API for TensorFlow.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(1,)),             # Input layer
    layers.Dense(64, activation='relu'),  # Hidden layer
    layers.Dense(64, activation='relu'),  # Hidden layer
    layers.Dense(1)                       # Output layer for the force
])
Since we know nothing at the physical level, we treat this as an experimental setup: aside from the input and output layers, we use two Dense (fully connected) layers with 64 neurons each, where every neuron (or unit) is connected to every neuron in the previous layer.
The activation function used in these layers is 'relu', which stands for Rectified Linear Unit.
This means that:
- if the input x is positive, the output is x;
- if x is negative, the output is 0.
ReLU helps to introduce non-linearity into the model.
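The definition above is one line of NumPy:

```python
import numpy as np

def relu(x):
    # max(0, x), applied element-wise
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # negatives become 0, positives pass through
```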
Then, after compiling the model and fitting it on the training data, we make our predictions:
predictions = model.predict(m_test)
Comparing the predictions with the expected values, the result is not bad.
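The article gives no metric for "not bad"; one common way to quantify it is the mean absolute error against the ground truth (the two force values below are toy numbers for illustration):

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute deviation between predictions and ground truth."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Toy example: two predicted forces vs. their true values, in newtons
print(mean_absolute_error([9.8, 19.6], [9.7, 19.8]))
```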
But when we ask the model what equation it came up with, we get a surprise.
# Extract the learned weights and biases
weights, biases = model.layers[0].get_weights()
# Extract the learned gravitational constant (g) and bias
g_learned = weights[0][0]
b = biases[0]
# Print the learned equation
print(f"Learned equation: F = {g_learned} * m + {b}")
F = 0.13229107856750488 * m + -0.026830332353711128
That is quite far from F = 9.8*m + 0, Newton's law of gravitation at the Earth's surface.
But the data in itself is not wrong.
To find the scaling factor between the two equations, we divide the expected coefficient 9.8 by the learned coefficient 0.13229107856750488:
Scaling Factor = 9.8 / 0.13229107856750488 ≈ 74.074
Hence we can multiply the learned equation by this factor to bring it in line with the expected equation:
F = 74.074 × (0.13229107856750488 × m − 0.026830332353711128)
And simplify:
F ≈ 9.8 × m − 1.9874
This is very close to the expected equation F=9.8×m+0, with only a small difference in the intercept term, due to numerical precision or a minor offset learned by the model.
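The rescaling arithmetic can be sanity-checked directly (expect small rounding differences from the figures quoted in the text):

```python
g_learned = 0.13229107856750488
b_learned = -0.026830332353711128

scale = 9.8 / g_learned             # factor that maps the learned slope onto 9.8
print(round(scale, 3))              # close to the ~74.07 factor in the text
print(round(scale * g_learned, 6))  # recovers 9.8 by construction
print(round(scale * b_learned, 4))  # rescaled intercept, roughly -1.99
```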
But does it also have a physical meaning, given that it is a special case of Newton's gravitational law?
Of course not.
Already something smells fishy. Is it true that the availability of vast amounts of data renders traditional scientific methods unnecessary?
So, we go back to the model and try to understand whether we were overfitting it with an unnecessarily complex architecture.
model = models.Sequential([
    layers.Input(shape=(1,)),
    layers.Dense(1, activation='linear')
])
The model now has a single neuron because we want it to output one value, as in regression tasks that predict a continuous quantity, such as a price, a weight, or, in our case, a force.
Then we train with a smaller batch size.
model.fit(m_train, F_train, epochs=100, batch_size=8)
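The same one-neuron linear fit can be reproduced without Keras; the plain-NumPy gradient descent below is a sketch of what that single neuron is doing (the learning rate, epoch count, and noiseless data are illustrative assumptions):

```python
import numpy as np

# Synthetic data: F = g * m, noiseless for clarity
rng = np.random.default_rng(0)
g = 9.8
m = rng.uniform(0.0, 10.0, size=(1000, 1))
F = g * m

w, b = 0.0, 0.0   # one weight and one bias: F_hat = w * m + b
lr = 0.01
for _ in range(2000):
    err = (w * m + b) - F
    # gradient descent on the mean squared error
    w -= lr * 2.0 * float(np.mean(err * m))
    b -= lr * 2.0 * float(np.mean(err))

print(f"Learned equation: F = {w:.3f} * m + {b:.3f}")
```

With enough steps, the weight converges to g and the bias to zero, mirroring the Keras result below.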
Comparing the predictions with the expected values, the result is similar to before...
but now, if we ask for the learned equation, we get:
Learned equation: F = 9.800000190734863 * m + 0.0
which matches the expected equation for the physical relationship between force and mass under gravity (assuming g ≈ 9.8 m/s^2).
And here comes the "No Free Lunch" (NFL) theorem, which says there is no one-size-fits-all algorithm for solving all problems.
Averaged over all possible problems, any optimization algorithm performs no better than random guessing.
In other words, given a uniform distribution over problems, no algorithm is universally better than another.
In this particular case, we have shown that the data does not convey any particular meaning apart from the specific model used to interpret it.
And, in some sense, the model itself is just an educated guess.