From Abstract Distances to Concrete Spaces: The Power of Multidimensional Scaling

Imagine you're at a grand reunion, tasked with arranging group photos. You know who's friends with whom, but not where they should stand. This scenario is akin to having a set of points (your friends) and knowing the distances between them, but not their exact positions. This is where the magic of Multidimensional Scaling (MDS) comes into play. It's like an astute organizer who arranges everyone perfectly, based on the nuances of their relationships.

The Mathematical Puzzle Behind MDS

MDS starts with a distance matrix, D, a symmetric n×n matrix where D[i][j] represents the distance between points v_i and v_j. But how do we find their relative positions in space? Let's dive into the mathematics that forms the backbone of MDS:

  • Squaring the Distance Matrix: We begin by squaring each entry of D to obtain D', where D'[i][j] = D[i][j]^2. This is akin to zooming in on the finer details of our relationships.

  • Centering the Matrix: We use the centering matrix H = I - (1/n)J (where J is the n×n all-ones matrix) to transform D' into the matrix B = -1/2 HD'H. This step is like adjusting our lens to get a clear picture.

  • The Gram Matrix: B turns out to equal the Gram matrix, which is essentially a matrix of dot products of the vectors representing our points. This identity is crucial, as it links our abstract distances to concrete positions.

Now, let's delve deeper into the transformation process. The realization of these points in Euclidean space hinges on whether the matrix B is positive semi-definite. This property of B ensures that we can represent our n points in Euclidean space in a way that their distances align with the matrix D.
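As a concrete illustration (a minimal sketch using a toy 3-point distance matrix of my own choosing), we can build B from D and check positive semi-definiteness by inspecting its eigenvalues:

```python
import numpy as np

# Toy example: distances between three points sitting at 0, 3 and 4 on a line
D = np.array([[0., 3., 4.],
              [3., 0., 1.],
              [4., 1., 0.]])

n = D.shape[0]
H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
B = -0.5 * H @ (D**2) @ H             # double-centered matrix

# B is positive semi-definite iff all eigenvalues are >= 0
# (up to floating-point noise), i.e. an exact Euclidean embedding exists
eigvals = np.linalg.eigvalsh(B)
print(np.all(eigvals >= -1e-9))       # True: these distances are Euclidean
```

Since the three points genuinely lie on a line, B has one positive eigenvalue and the rest are (numerically) zero, so an exact one-dimensional embedding exists.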

The Gram matrix G' enters the scene as a matrix of dot products of vectors. For n vectors v_1, v_2, ..., v_n, the Gram matrix is defined entrywise as:

G'[i][j] = v_i · v_j

where G'[i][j] is the dot product between vectors v_i and v_j.

Interestingly, here B = -1/2 HD'H = G', showing that B coincides with the Gram matrix of the centered points. If B is not positive semi-definite, MDS yields only an approximate embedding of our points.

The relationship between D' and B can be described as follows:

D'[i][j] = B[i][i] + B[j][j] - 2B[i][j]

This relationship lays the foundation for how we can retrieve the embedding for the n points of interest.
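Concretely, since B holds the dot products of the centered points, the squared distance satisfies D'[i][j] = B[i][i] + B[j][j] - 2B[i][j]. A short check on synthetic points (the random data below is purely illustrative) confirms this:

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((6, 2))                    # 6 random 2D points

# Squared Euclidean distance matrix D'
diff = points[:, None, :] - points[None, :, :]
D_sq = (diff ** 2).sum(axis=-1)

# Double centering gives B
n = D_sq.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ D_sq @ H

# Rebuild the squared distances from B:
# D'[i][j] = B[i][i] + B[j][j] - 2 B[i][j]
b = np.diag(B)
D_sq_rebuilt = b[:, None] + b[None, :] - 2 * B
print(np.allclose(D_sq, D_sq_rebuilt))         # True
```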

Turning Theory into Reality

So, how do we materialize the coordinates of these points? Here's a step-by-step guide:

  • Spectral Decomposition: Perform a spectral decomposition B = EΛEᵀ, arranging the eigenvalues in Λ (and the corresponding eigenvector columns of E) in descending order.

  • Define Y as Y = EΛ^(1/2), i.e., scale each eigenvector by the square root of its eigenvalue.

  • Extract the first k columns of Y to form matrix Y_k.
  • The embedding of the n points is then represented by the rows of Y_k.
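The steps above can be sketched on a toy example of three collinear points at positions 0, 3 and 4 (an illustrative choice), using only their distances; with k = 1, MDS recovers the positions up to translation and reflection:

```python
import numpy as np

# Only the pairwise distances of the three points are given
D = np.array([[0., 3., 4.],
              [3., 0., 1.],
              [4., 1., 0.]])

n = D.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ (D**2) @ H

# Step 1: spectral decomposition, eigenvalues in descending order
w, E = np.linalg.eigh(B)
order = np.argsort(w)[::-1]
w, E = w[order], E[:, order]

# Steps 2-4: Y = E * sqrt(Lambda), then keep the first k = 1 column
Y = E * np.sqrt(np.maximum(w, 0))   # clip tiny negative eigenvalues
Y_1 = Y[:, :1]

# The recovered 1D coordinates reproduce the original distances,
# up to translation and reflection
print(np.round(np.abs(Y_1[:, 0] - Y_1[0, 0]), 6))   # [0. 3. 4.]
```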

Implementation

Let's apply MDS to a concrete scenario: we have 55 points in a 2D space. We know the pairwise distances between them but not their coordinates. We construct a distance matrix, d_link, and apply our MDS technique:

import numpy as np

# coord holds the (hidden) 2D coordinates of the points;
# dist is the Euclidean distance between two of them
dist = lambda a, b: np.linalg.norm(a - b)
n_link = len(coord)

d_link = np.zeros((n_link, n_link))
for i in range(n_link):
    for j in range(n_link):
        d_link[i][j] = dist(coord[i], coord[j])

The Classical MDS Function: This function takes our distance matrix and the target dimension and returns the coordinates of our points in the new space.

def classical_MDS(D, k):
    n = D.shape[0]

    # Centering matrix
    H = np.eye(n) - np.ones((n, n)) / n

    # Double centering
    B = -0.5 * H.dot(D**2).dot(H)

    # Eigen decomposition
    eigenvalues, eigenvectors = np.linalg.eigh(B)

    # Sort eigenvalues and eigenvectors in decreasing order
    idx = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[idx]
    eigenvectors = eigenvectors[:, idx]

    # Select the top k eigenvalues and eigenvectors, clipping small
    # negative eigenvalues (which appear when B is not exactly
    # positive semi-definite) before taking the square root
    L = np.diag(np.sqrt(np.maximum(eigenvalues[:k], 0)))
    V = eigenvectors[:, :k]

    # Compute the coordinates using eigenvectors and eigenvalues
    Y = V.dot(L)

    return Y

X_link = classical_MDS(d_link, 2)        
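As a quick sanity check, the recovered embedding should reproduce every pairwise distance. The snippet below is a self-contained sketch: the random coord points, the seed, and the condensed MDS steps are illustrative stand-ins for the article's data and the classical_MDS function above.

```python
import numpy as np

# Hypothetical stand-in for the article's data: 55 random 2D points
rng = np.random.default_rng(42)
coord = rng.random((55, 2))

# Pairwise distance matrix
d_link = np.linalg.norm(coord[:, None] - coord[None, :], axis=-1)

# Classical MDS, condensed to a few lines
n = d_link.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ (d_link**2) @ H
w, E = np.linalg.eigh(B)
idx = np.argsort(w)[::-1][:2]
Y = E[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# The 2D embedding reproduces the original distances exactly,
# because the points truly live in 2D (B has rank 2)
d_rebuilt = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
print(np.allclose(d_link, d_rebuilt))   # True
```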

Visualize the results:

import matplotlib.pyplot as plt

def plot_embedding(X):
    plt.figure(figsize=(10, 8))
    plt.scatter(X[:, 0], X[:, 1], marker='o')

    plt.xlabel('Component 1')
    plt.ylabel('Component 2')
    plt.title('2D Embedding using Classical MDS')
    plt.show()

plot_embedding(X_link)        

Result:

2D embedding using Classical MDS

Original coordinates:

Original Coordinates

After applying Multidimensional Scaling to our dataset, an intriguing result emerged. The 2D embedding obtained from the MDS closely resembled the original relative positions of the points. Despite only knowing the distances between points and not their initial coordinates, the MDS process successfully reconstructed their relative positions in a 2D space.

Conclusion: The Art of Unveiling Hidden Structures

Multidimensional Scaling is more than a mathematical tool; it's a way to unveil hidden structures and relationships in data. Whether it's arranging people in a photo based on unseen social ties or analyzing complex scientific data, MDS provides a window into understanding the unseen.

Have you ever encountered a scenario where MDS could have been a game-changer? Let's discuss in the comments below!

KULBHUSHAN THORAT

Data Science enthusiast with skills in Machine Learning, Python, SQL and Visualization
