Data Visualization: Simplifying Complexity at a Glance
Smita Vatgal
Engineer Golang/Python | Microservices | DevOps | AWS | Kubernetes | CICD | Automation
A chart can tell a story that rows of numbers can’t. That’s the power of data visualization!
For example, below is a screenshot of the data in CSV format:
Raw numbers can feel overwhelming, and spotting patterns is tough.
Now let's see the same data as a visualization:
With just one glance, we get the message that Python is on the rise. That’s how visualization transforms data into insight.
Data is crucial for decision-making because it provides insights, evidence, and clarity, helping individuals and organizations make informed choices. And data visualization makes those insights simpler to understand!
How do we turn data into beautiful, insightful charts?
We can transform raw data into stunning visualizations using Python libraries that are designed for plotting and analysis. Here are some of the most popular ones:
Matplotlib – The Foundation of Python Visualization
Seaborn – Beautiful Statistical Plots with Ease
Pandas Plot – Quick and Easy Plots from DataFrames
How to get source data for visualization?
To create meaningful visualizations, you need quality data. Here are some reliable sources and methods to obtain datasets:
Code for the above visualization:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load the dataset
df = pd.read_csv('data_prog_lang.csv')
# Clean column names
df.columns = df.columns.str.strip()
# Convert 'Date' from 'Jan-05' to datetime (month-year)
df['Date'] = pd.to_datetime(df['Date'], format='%b-%y', errors='coerce')
# Print to verify conversion
print(df[['Date']].head())
# Melt dataframe (wide to long format)
df_long = df.melt(id_vars=['Date'],
                  var_name='Language', value_name='Popularity')
# Drop missing values
df_long = df_long.dropna()
# Find top 8 languages by average popularity
top_languages = (
    df_long.groupby('Language')['Popularity']
    .mean()
    .sort_values(ascending=False)
    .head(8)
    .index
)
df_filtered = df_long[df_long['Language'].isin(top_languages)]
# Plot using seaborn
plt.figure(figsize=(14, 7))
sns.lineplot(
    data=df_filtered,
    x='Date',
    y='Popularity',
    hue='Language',
    marker='o',
    palette='tab10'
)
plt.title('Programming Language Trends Over Time', fontsize=16)
plt.xlabel('Date (Month-Year)', fontsize=12)
plt.ylabel('Popularity (%)', fontsize=12)
plt.grid(True)
plt.xticks(rotation=45)
plt.legend(title='Language', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
# Show the plot
plt.show()
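The wide-to-long step in the code above is the part that tends to confuse people, so here is a small sketch of what `melt` does. It assumes a hypothetical two-row sample with made-up values; the real `data_prog_lang.csv` has more columns and rows.

```python
import pandas as pd

# Hypothetical wide-format sample: one column per language (made-up values)
wide = pd.DataFrame({
    "Date": ["Jan-05", "Feb-05"],
    "Python": [2.5, 2.6],
    "Java": [30.1, 29.8],
})

# Parse 'Jan-05' style strings as month-year dates
wide["Date"] = pd.to_datetime(wide["Date"], format="%b-%y")

# Long format: one row per (Date, Language) pair
long = wide.melt(id_vars=["Date"], var_name="Language", value_name="Popularity")
print(long)
```

Seaborn's `hue='Language'` needs this long shape: one column for the x value, one for the y value, and one that names the group each row belongs to.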