Python Data File Formats – How to Read CSV, JSON, and XLS Files
Malini Shukla
Python Data File Formats
Let’s first look at the Python data file formats we will be working with.
a. Python Data File Formats – Python CSV
CSV is a staple of data science. A comma-separated values (CSV) file uses commas to separate values; you can think of it as a delimited text file that holds tabular data as plain text. A problem can arise when the data itself contains a comma or a line break. In that case, we can wrap the field in double quotes or switch to another delimiter, such as a tab. This format is useful for exchanging data and for moving tabular data between programs. The extension for a CSV file is .csv.
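To see how a comma inside a value is handled, here is a minimal sketch using Python's built-in csv module. The extra row and its title are made up for illustration and are not part of our demo file.

```python
import csv
import io

# Hypothetical row whose title contains a comma (not part of the demo file).
rows = [["id", "title", "rating"], [6, "Good Luck, Charlie", 7.0]]

# Default dialect: comma delimiter, fields are quoted only when needed.
comma_buf = io.StringIO()
csv.writer(comma_buf).writerows(rows)
comma_text = comma_buf.getvalue()

# Same rows with a tab delimiter, so the comma needs no quoting.
tab_buf = io.StringIO()
csv.writer(tab_buf, delimiter="\t").writerows(rows)
tab_text = tab_buf.getvalue()

# Reading the text back recovers the title intact in both cases.
comma_rows = list(csv.reader(io.StringIO(comma_text)))
tab_rows = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))

print(comma_text)
print(tab_text)
```

Either approach keeps the title as a single field; which one to pick mostly depends on what the program at the other end expects.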
Here’s the CSV file we will use for our demo:
id,title,timing,genre,rating
1,Dog with a Blog,17:30-18:00,Comedy,4.7
2,Liv and Maddie,18:00-18:30,Comedy,6.3
3,Girl Meets World,18:30-19:00,Comedy,7.2
4,KC Undercover,19:00-19:30,Comedy,6.1
5,Austin and Ally,19:30-20:00,Comedy,6
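We saved this as schedule.csv on our Desktop. If you create it in a text editor such as Notepad, remember to choose All Files (*.*) as the file type so that the editor does not append .txt to the name. On Windows, the file opens in Microsoft Excel by default.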
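If you would rather create the demo file from Python than from a text editor, the following is one possible sketch. The helper name write_schedule is hypothetical, and the file is written to a temporary folder here so nothing on disk is overwritten; swap in your own Desktop path to match the rest of this article.

```python
import csv
import os
import tempfile

HEADER = ["id", "title", "timing", "genre", "rating"]
ROWS = [
    [1, "Dog with a Blog", "17:30-18:00", "Comedy", 4.7],
    [2, "Liv and Maddie", "18:00-18:30", "Comedy", 6.3],
    [3, "Girl Meets World", "18:30-19:00", "Comedy", 7.2],
    [4, "KC Undercover", "19:00-19:30", "Comedy", 6.1],
    [5, "Austin and Ally", "19:30-20:00", "Comedy", 6],
]


def write_schedule(path):
    """Write the demo table to `path` as comma-separated text."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(ROWS)


# Temporary folder used for the demo; replace with your own location.
demo_path = os.path.join(tempfile.mkdtemp(), "schedule.csv")
write_schedule(demo_path)

# Print the file back to confirm what was written.
with open(demo_path, encoding="utf-8") as f:
    print(f.read())
```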
b. Python Data File Formats – Python JSON
JSON stands for JavaScript Object Notation and is an open standard file format. It stores attribute-value pairs and arrays as human-readable text. The format is language-independent, and it is commonly used for asynchronous browser-server communication. The extension for a JSON file is .json.
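To make the idea of attribute-value pairs and arrays concrete, here is a small sketch using Python's built-in json module. The JSON text below is a made-up sample, not our demo file.

```python
import json

# A tiny JSON document: an object with one string value and one array value.
text = '{"Genre": "Comedy", "Ratings": [4.7, 6.3, 7.2]}'

# json.loads turns a JSON object into a dict and a JSON array into a list.
data = json.loads(text)
print(type(data).__name__)
print(data["Ratings"][0])

# json.dumps goes the other way and produces human-readable text again.
round_trip = json.dumps(data, indent=2)
print(round_trip)
```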
Here’s the JSON file we will use for the demo:
{
  "ID":["1","2","3","4","5"],
  "Title":["Dog with a Blog","Liv and Maddie","Girl Meets World","KC Undercover","Austin and Ally"],
  "Timing":["17:30-18:00","18:00-18:30","18:30-19:00","19:00-19:30","19:30-20:00"],
  "Genre":["Comedy","Comedy","Comedy","Comedy","Comedy"],
  "Rating":["4.7","6.3","7.2","6.1","6"]
}
We save this as schedule.json on the Desktop.
c. Python Data File Formats – Python XLS
The extension for an Excel workbook is .xlsx (older versions of Excel used .xls). Spreadsheets are common in data science work; for this demo, we create a workbook with two sheets in Microsoft Excel.
Sheet 1-
Sheet 2-
We save this workbook as schedule.xlsx on our Desktop.
Prerequisites
To process these data file formats, we need the pandas library.
Install it using pip from a command prompt or terminal (not from the Python >>> prompt):
pip install pandas
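Once the install finishes, you may want to confirm that Python can actually find pandas. A minimal sketch, using only the standard library for the check:

```python
import importlib.util

# find_spec returns None when the package cannot be imported.
pandas_installed = importlib.util.find_spec("pandas") is not None

if pandas_installed:
    import pandas
    print("pandas", pandas.__version__)
else:
    print("pandas is not installed; run: pip install pandas")
```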
How to Read CSV File in Python
To read an entire file, rows, columns, or combinations of those, read on.
a. Reading an entire Python CSV File
To read an entire file, we can use the read_csv() function.
>>> import pandas
>>> import os
>>> os.chdir('C:\\Users\\lifei\\Desktop')
>>> print(pandas.read_csv('schedule.csv'))
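If you want to try read_csv() without depending on a file or folder on your machine, the sketch below feeds it the same demo data from an in-memory string. The io.StringIO wrapper is our own substitution for the file on the Desktop; read_csv() accepts file-like objects as well as paths.

```python
import io

import pandas as pd

# Same content as schedule.csv, held in memory instead of on disk.
csv_text = """id,title,timing,genre,rating
1,Dog with a Blog,17:30-18:00,Comedy,4.7
2,Liv and Maddie,18:00-18:30,Comedy,6.3
3,Girl Meets World,18:30-19:00,Comedy,7.2
4,KC Undercover,19:00-19:30,Comedy,6.1
5,Austin and Ally,19:30-20:00,Comedy,6
"""

# The first line is used as the column names by default.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.shape)  # (number of rows, number of columns)
```

The result is a pandas DataFrame with one row per show and one column per header field.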