Understanding Different Types of File Formats

Standard file formats:

1. Delimited text file formats, or .CSV: files that store data as text, with each value separated by a delimiter (a sequence of one or more characters that marks the boundary between independent entities or values). The most common delimiters are the comma, tab, colon, vertical bar, and space. The two most widely used variants are CSV (comma-separated values) and TSV (tab-separated values).

When literal commas appear within the text and therefore cannot serve as the delimiter, a tab-delimited .TSV file is an alternative.

In a CSV file, each row (a horizontal line in the text file) holds a set of values separated by the delimiter and represents one record. The first row usually serves as a column header, and each column can hold a different type of data (string, date, integer). Delimited files allow field values of any length and are considered a standard format for providing a straightforward information schema. They can be processed by almost all existing applications. Delimiters are also one of several ways to mark boundaries in a data stream.
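As a rough illustration of the row, header, and delimiter ideas above, here is a minimal Python sketch using the standard `csv` module. The sample data and column names (`name`, `joined`, `score`) are made up for this example. In the CSV version, the name that contains a comma is wrapped in quotes. In the TSV version, it needs no quoting because the delimiter is a tab.

```python
import csv
import io

# Hypothetical sample data: the first row is the column header.
# The second record contains a literal comma, so the field is quoted.
csv_text = """name,joined,score
Ada,2021-03-01,91
"Hopper, Grace",2020-11-15,88
"""

# DictReader uses the header row to label each value in a record.
rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    # Every value is read as text; convert types explicitly when needed.
    print(row["name"], row["joined"], int(row["score"]))

# The same records as tab-separated values (TSV).
tsv_text = (
    "name\tjoined\tscore\n"
    "Ada\t2021-03-01\t91\n"
    "Hopper, Grace\t2020-11-15\t88\n"
)
tsv_rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
```

Both readers should yield the same two records, which suggests that the choice of delimiter changes how the text is written, not what data it holds.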


2. Microsoft Excel Open XML Spreadsheet, or .XLSX

An XML-based spreadsheet file format created by Microsoft. An .XLSX file, also known as a workbook, can contain multiple worksheets. Each worksheet is organized into rows and columns, and at the intersection of each row and column is a cell, which contains data.


3. Extensible Markup Language, or .XML

o  Is a markup language with a set of rules for encoding data.

o  Is readable by both humans and machines.

o  Is a self-descriptive language designed for sending information over the Internet.

o  Is similar to .HTML in some respects, but unlike .HTML it does not use predefined tags.

o  Is platform independent.

o  Is programming language independent.

o  Makes it simpler to share data between systems.
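To make the points above more concrete, here is a small Python sketch that parses an XML snippet with the standard `xml.etree.ElementTree` module. The tag names (`students`, `student`, `name`, `score`) are invented for this example, which is the point: XML lets the author define whatever tags describe the data.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tags are not predefined, they are chosen
# by whoever designs the data format.
xml_text = """<students>
  <student id="1"><name>Ada</name><score>91</score></student>
  <student id="2"><name>Grace</name><score>88</score></student>
</students>"""

root = ET.fromstring(xml_text)

# Walk the child elements and pull out attributes and nested text.
students = [
    {
        "id": s.get("id"),
        "name": s.findtext("name"),
        "score": int(s.findtext("score")),
    }
    for s in root.findall("student")
]
print(students)
```

Because the markup is plain text with self-describing tags, the same snippet could be read by a parser in any other language or on any other platform.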

4. Portable Document Format, or .PDF

A file format developed by Adobe to present documents independently of application software, hardware, and operating systems, so a document looks the same on any device. It is frequently used for legal and financial documents and can also be used for fillable forms.



5. JavaScript Object Notation, or .JSON

A text-based open standard designed for transmitting structured data over the web. It is a language-independent data format that can be read in any programming language. It is easy to use, is supported by all major browsers, and is widely used for sharing data of many sizes and types. Binary content such as audio or video has to be encoded as text (for example, with Base64) before it can be carried in JSON.
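As a minimal sketch of how structured data travels as JSON text, the Python snippet below uses the standard `json` module to serialize a record and read it back. The record and its field names are hypothetical.

```python
import json

# Hypothetical record mixing strings, numbers, a list, a boolean, and a null.
record = {"name": "Ada", "scores": [91, 88], "active": True, "mentor": None}

# Serialize to JSON text, the form that is sent over the web or saved to a file.
text = json.dumps(record)
print(text)

# Any JSON-aware program can turn that text back into native data structures.
restored = json.loads(text)
```

Note that Python's `True` and `None` are written as `true` and `null` in the JSON text, since JSON defines its own small set of value types that each language maps to its own.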



For more information, you can either ask me or check IBM's Data Science certification through Coursera.
