From Abstract Distances to Concrete Spaces: The Power of Multidimensional Scaling

Imagine you're at a grand reunion, tasked with arranging group photos. You know who's friends with whom, but not where they should stand. This scenario is akin to having a set of points (your friends) and knowing the distances between them, but not their exact positions. This is where the magic of Multidimensional Scaling (MDS) comes into play. It's like an astute organizer who arranges everyone perfectly, based on the nuances of their relationships.

The Mathematical Puzzle Behind MDS

MDS starts with a distance matrix, D, a symmetric n×n matrix where D[i][j] represents the distance between points v_i and v_j. But how do we find their relative positions in space? Let's dive into the mathematics that forms the backbone of MDS:

  • Squaring the Distance Matrix: We begin by squaring each entry of D to obtain D', where D'[i][j] = D[i][j]^2. This is akin to zooming in on the finer details of our relationships.

  • Centering the Matrix: We use the centering matrix H = I - (1/n)J (where J is the n×n all-ones matrix) to transform D' into the matrix B = -1/2 HD'H. This step is like adjusting our lens to get a clear picture.

  • The Gram Matrix: B turns out to equal the Gram matrix, which is essentially a matrix of dot products of the vectors representing our points. This identity is crucial, as it links our abstract distances to concrete positions.

Now, let's delve deeper into the transformation process. The realization of these points in Euclidean space hinges on whether the matrix B is positive semi-definite. This property of B ensures that we can represent our n points in Euclidean space in a way that their distances align with the matrix D.
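As a concrete illustration (a minimal sketch using a toy 3-point distance matrix of my own choosing), we can build B from D and check positive semi-definiteness by inspecting its eigenvalues:

```python
import numpy as np

# Toy example: distances between three points sitting at 0, 3 and 4 on a line
D = np.array([[0., 3., 4.],
              [3., 0., 1.],
              [4., 1., 0.]])

n = D.shape[0]
H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
B = -0.5 * H @ (D**2) @ H             # double-centered matrix

# B is positive semi-definite iff all eigenvalues are >= 0
# (up to floating-point noise), i.e. an exact Euclidean embedding exists
eigvals = np.linalg.eigvalsh(B)
print(np.all(eigvals >= -1e-9))       # True: these distances are Euclidean
```

Since the three points genuinely lie on a line, B has one positive eigenvalue and the rest are (numerically) zero, so an exact one-dimensional embedding exists.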

The Gram matrix G' enters the scene as a matrix of dot products of vectors. For n vectors v_1, v_2, ..., v_n, the Gram matrix is defined entrywise as:

G'[i][j] = v_i · v_j

where G'[i][j] is the dot product between vectors v_i and v_j.

Interestingly, here B = -1/2 HD'H = G', showing that B coincides with the Gram matrix of the centered points. If B is not positive semi-definite, MDS yields only an approximate embedding of our points.

The relationship between D' and B can be described as follows:

D'[i][j] = B[i][i] + B[j][j] - 2B[i][j]

This relationship lays the foundation for how we can retrieve the embedding for the n points of interest.
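Concretely, since B holds the dot products of the centered points, the squared distance satisfies D'[i][j] = B[i][i] + B[j][j] - 2B[i][j]. A short check on synthetic points (the random data below is purely illustrative) confirms this:

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.random((6, 2))                    # 6 random 2D points

# Squared Euclidean distance matrix D'
diff = points[:, None, :] - points[None, :, :]
D_sq = (diff ** 2).sum(axis=-1)

# Double centering gives B
n = D_sq.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ D_sq @ H

# Rebuild the squared distances from B:
# D'[i][j] = B[i][i] + B[j][j] - 2 B[i][j]
b = np.diag(B)
D_sq_rebuilt = b[:, None] + b[None, :] - 2 * B
print(np.allclose(D_sq, D_sq_rebuilt))         # True
```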

Turning Theory into Reality

So, how do we materialize the coordinates of these points? Here's a step-by-step guide:

  • Spectral Decomposition: Perform a spectral decomposition B = EΛEᵀ, arranging the eigenvalues in Λ (and the corresponding eigenvector columns of E) in descending order.

  • Define Y as Y = EΛ^(1/2), i.e., scale each eigenvector by the square root of its eigenvalue.

  • Extract the first k columns of Y to form matrix Y_k.
  • The embedding of the n points is then represented by the rows of Y_k.
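The steps above can be sketched on a toy example of three collinear points at positions 0, 3 and 4 (an illustrative choice), using only their distances; with k = 1, MDS recovers the positions up to translation and reflection:

```python
import numpy as np

# Only the pairwise distances of the three points are given
D = np.array([[0., 3., 4.],
              [3., 0., 1.],
              [4., 1., 0.]])

n = D.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ (D**2) @ H

# Step 1: spectral decomposition, eigenvalues in descending order
w, E = np.linalg.eigh(B)
order = np.argsort(w)[::-1]
w, E = w[order], E[:, order]

# Steps 2-4: Y = E * sqrt(Lambda), then keep the first k = 1 column
Y = E * np.sqrt(np.maximum(w, 0))   # clip tiny negative eigenvalues
Y_1 = Y[:, :1]

# The recovered 1D coordinates reproduce the original distances,
# up to translation and reflection
print(np.round(np.abs(Y_1[:, 0] - Y_1[0, 0]), 6))   # [0. 3. 4.]
```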

Implementation

Let's apply MDS to a concrete scenario: we have 55 points in a 2D space. We know the pairwise distances between them but not their coordinates. We construct a distance matrix, d_link, and apply our MDS technique:

import numpy as np

# coord holds the (hidden) 2D coordinates of the points;
# dist is the Euclidean distance between two of them
dist = lambda a, b: np.linalg.norm(a - b)
n_link = len(coord)

d_link = np.zeros((n_link, n_link))
for i in range(n_link):
    for j in range(n_link):
        d_link[i][j] = dist(coord[i], coord[j])

The Classical MDS Function: This function takes our distance matrix and the target dimension and returns the coordinates of our points in the new space.

def classical_MDS(D, k):
    n = D.shape[0]

    # Centering matrix
    H = np.eye(n) - np.ones((n, n)) / n

    # Double centering
    B = -0.5 * H.dot(D**2).dot(H)

    # Eigen decomposition
    eigenvalues, eigenvectors = np.linalg.eigh(B)

    # Sort eigenvalues and eigenvectors in decreasing order
    idx = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[idx]
    eigenvectors = eigenvectors[:, idx]

    # Select the top k eigenvalues and eigenvectors, clipping small
    # negative eigenvalues (which appear when B is not exactly
    # positive semi-definite) before taking the square root
    L = np.diag(np.sqrt(np.maximum(eigenvalues[:k], 0)))
    V = eigenvectors[:, :k]

    # Compute the coordinates using eigenvectors and eigenvalues
    Y = V.dot(L)

    return Y

X_link = classical_MDS(d_link, 2)        
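As a quick sanity check, the recovered embedding should reproduce every pairwise distance. The snippet below is a self-contained sketch: the random coord points, the seed, and the condensed MDS steps are illustrative stand-ins for the article's data and the classical_MDS function above.

```python
import numpy as np

# Hypothetical stand-in for the article's data: 55 random 2D points
rng = np.random.default_rng(42)
coord = rng.random((55, 2))

# Pairwise distance matrix
d_link = np.linalg.norm(coord[:, None] - coord[None, :], axis=-1)

# Classical MDS, condensed to a few lines
n = d_link.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * H @ (d_link**2) @ H
w, E = np.linalg.eigh(B)
idx = np.argsort(w)[::-1][:2]
Y = E[:, idx] * np.sqrt(np.maximum(w[idx], 0))

# The 2D embedding reproduces the original distances exactly,
# because the points truly live in 2D (B has rank 2)
d_rebuilt = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
print(np.allclose(d_link, d_rebuilt))   # True
```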

Visualize the results:

import matplotlib.pyplot as plt

def plot_embedding(X):
    plt.figure(figsize=(10, 8))
    plt.scatter(X[:, 0], X[:, 1], marker='o')

    plt.xlabel('Component 1')
    plt.ylabel('Component 2')
    plt.title('2D Embedding using Classical MDS')
    plt.show()

plot_embedding(X_link)        

Result:

2D embedding using Classical MDS

Original coordinates:

Original Coordinates

After applying Multidimensional Scaling to our dataset, an intriguing result emerged. The 2D embedding obtained from the MDS closely resembled the original relative positions of the points. Despite only knowing the distances between points and not their initial coordinates, the MDS process successfully reconstructed their relative positions in a 2D space.

Conclusion: The Art of Unveiling Hidden Structures

Multidimensional Scaling is more than a mathematical tool; it's a way to unveil hidden structures and relationships in data. Whether it's arranging people in a photo based on unseen social ties or analyzing complex scientific data, MDS provides a window into understanding the unseen.

Have you ever encountered a scenario where MDS could have been a game-changer? Let's discuss in the comments below!

KULBHUSHAN THORAT

Data Science enthusiast with skills in Machine Learning, Python, SQL and Visualization
