Reading .csv files with and without headers using different delimiters in Mule 4
Photo by fabio on Unsplash

Reading .csv files with and without headers using different delimiters in Mule 4

Flat files especially .csv files are very common for data loading and data transfers. We generally come across scenarios where we want to read a .csv file, which may or may not contain the header row. Also, there is a chance .csv files don't have ','(comma) as delimiter rather has ''|"(pipe) as a delimiter or any other character as a delimiter. This article will discuss various ways to read such files in Mule 4 and create a JSON output of it.

Mule Flow

This will be a simple flow, where we will be reading a file using a file connector. Then convert the read data to JSON using transform message. Finally log the JSON payload to console using Logger. Here, we are not doing any complex transformation, it will be more like a MIME type conversion from application/csv to application/json, for better understanding the structure of input data.

No alt text provided for this image

Steps:

  • Create a flow in your mule application.
  • Drag and drop file connector with operation "On New or Updated file" on the source of mule flow.
  • In file connector properties under the general tab, specify or create a new connector configuration. Provide input directory, polling frequency(here it is kept as 10 seconds), specify the post-processing action like deleting the file, moving the file, or renaming the file. As per the below configurations, the mule application will look for a file in the inbound folder every 10 seconds. If it finds any file, it will process and move the file to the archive directory deleting the file from the inbound location.

No alt text provided for this image

  • Drag and drop "Transform Message" from mule palette, and have output mime type as "application/json" as below.

No alt text provided for this image

  • Drag and drop logger from mule palette. Here we will be logging payload after transformation.
  • The final mule flow will look as below.

No alt text provided for this image

.csv file having header row in the file content

Let us take below employeeData.csv file as an example. For simplicity, we are taking dummy data of only 5 employees.

id,first name,last name,email,gender,designation
1,Dom,Lissett,[email protected],Male,Recruiter
2,Susan,Moogan,[email protected],Female,Executive Secretary
3,Carl,Cotesford,[email protected],Male,Tech Architect
4,Debor,Sprakes,[email protected],Female,Account Coordinator
5,Priscella,Genny,[email protected],Female,Environmental Tech
        

In this example, data in the first row signifies the headers separated by "," as a delimiter. Rest all other rows represent employee data.

  • Since the above .csv file contains a header row, all we need to do is to set the MIME type of file connector as "application/csv". To do so, click on the file connector. Click on MIME Type and select "application/csv" from the drop-down list of values.

No alt text provided for this image

  • Place the file as defined in the file connector. Run the application. We will see the JSON response logged in the console as shown below.

INFO? 2021-07-27 23:08:55,916 [[MuleRuntime].uber.03: [csv-read].csv-readFlow.CPU_INTENSIVE @1e42a374] [processor: csv-readFlow/processors/1; event: 895d1200-efb1-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
? {
? ? "id": "1",
? ? "first name": "Dom",
? ? "last name": "Lissett",
? ? "email": "[email protected]",
? ? "gender": "Male",
? ? "designation": "Recruiter"
? },
? {
? ? "id": "2",
? ? "first name": "Susan",
? ? "last name": "Moogan",
? ? "email": "[email protected]",
? ? "gender": "Female",
? ? "designation": "Executive Secretary"
? },
? {
? ? "id": "3",
? ? "first name": "Carl",
? ? "last name": "Cotesford",
? ? "email": "[email protected]",
? ? "gender": "Male",
? ? "designation": "Tech Architect"
? },
? {
? ? "id": "4",
? ? "first name": "Debor",
? ? "last name": "Sprakes",
? ? "email": "[email protected]",
? ? "gender": "Female",
? ? "designation": "Account Coordinator"
? },
? {
? ? "id": "5",
? ? "first name": "Priscella",
? ? "last name": "Genny",
? ? "email": "[email protected]",
? ? "gender": "Female",
? ? "designation": "Environmental Tech"
? }
]
        

Data in the first row became the keys for the JSON output.

.csv file without header row in the file content

Let us assume there is no header row in the content of employeeData.csv

1,Dom,Lissett,[email protected],Male,Recruiter
2,Susan,Moogan,[email protected],Female,Executive Secretary
3,Carl,Cotesford,[email protected],Male,Tech Architect
4,Debor,Sprakes,[email protected],Female,Account Coordinator
5,Priscella,Genny,[email protected],Female,Environmental Tech
        

If we try to run with the same configurations, it will take the first row as keys and console output will look like.

INFO? 2021-07-27 23:21:15,609 [[MuleRuntime].uber.03: [csv-read].csv-readFlow.CPU_INTENSIVE @1e42a374] [processor: csv-readFlow/processors/1; event: 4254cbd0-efb3-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
? {
? ? "1": "2",
? ? "Dom": "Susan",
? ? "Lissett": "Moogan",
? ? "[email protected]": "[email protected]",
? ? "Male": "Female",
? ? "Recruiter": "Executive Secretary"
? },
? {
? ? "1": "3",
? ? "Dom": "Carl",
? ? "Lissett": "Cotesford",
? ? "[email protected]": "[email protected]",
? ? "Male": "Male",
? ? "Recruiter": "Tech Architect"
? },
? {
? ? "1": "4",
? ? "Dom": "Debor",
? ? "Lissett": "Sprakes",
? ? "[email protected]": "[email protected]",
? ? "Male": "Female",
? ? "Recruiter": "Account Coordinator"
? },
? {
? ? "1": "5",
? ? "Dom": "Priscella",
? ? "Lissett": "Genny",
? ? "[email protected]": "[email protected]",
? ? "Male": "Female",
? ? "Recruiter": "Environmental Tech"
? }
]
        

By default, after setting MIME type to "application/json", mule considers the first row as the header row. In such cases, we don't want to consider the first row as the header row.

  • Click on the file connector. Click on MIME Type. Click on add parameter. Set key as "header" and value as "false". Save the changes.

No alt text provided for this image

Now, by using the same file, we will be able to read data properly. But in this case, autogenerated values for keys like coloumn_0, column_1, etc. will be populated. Below is a sample log snippet.

INFO? 2021-07-27 23:35:57,158 [[MuleRuntime].uber.05: [csv-read].csv-readFlow.CPU_INTENSIVE @6fbaa2fe] [processor: csv-readFlow/processors/1; event: 4fbf0590-efb5-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
? {
? ? "column_0": "1",
? ? "column_1": "Dom",
? ? "column_2": "Lissett",
? ? "column_3": "[email protected]",
? ? "column_4": "Male",
? ? "column_5": "Recruiter"
? },
? {
? ? "column_0": "2",
? ? "column_1": "Susan",
? ? "column_2": "Moogan",
? ? "column_3": "[email protected]",
? ? "column_4": "Female",
? ? "column_5": "Executive Secretary"
? },
? {
? ? "column_0": "3",
? ? "column_1": "Carl",
? ? "column_2": "Cotesford",
? ? "column_3": "[email protected]",
? ? "column_4": "Male",
? ? "column_5": "Tech Architect"
? },
? {
? ? "column_0": "4",
? ? "column_1": "Debor",
? ? "column_2": "Sprakes",
? ? "column_3": "[email protected]",
? ? "column_4": "Female",
? ? "column_5": "Account Coordinator"
? },
? {
? ? "column_0": "5",
? ? "column_1": "Priscella",
? ? "column_2": "Genny",
? ? "column_3": "[email protected]",
? ? "column_4": "Female",
? ? "column_5": "Environmental Tech"
? }
]
        

.csv file without header row and having a "|" as a delimiter

If we replace "," delimiter with "|" delimiter in employeeData.csv

1|Dom|Lissett|[email protected]|Male|Recruiter
2|Susan|Moogan|[email protected]|Female|Executive Secretary
3|Carl|Cotesford|[email protected]|Male|Tech Architect
4|Debor|Sprakes|[email protected]|Female|Account Coordinator
5|Priscella|Genny|[email protected]|Female|Environmental Tech
        

On running the same mule application with the above data, the output of the above employeeData.csv file will be.

INFO? 2021-07-27 23:42:06,938 [[MuleRuntime].uber.05: [csv-read].csv-readFlow.CPU_INTENSIVE @6fbaa2fe] [processor: csv-readFlow/processors/1; event: 2c2f0f20-efb6-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
? {
? ? "column_0": "1|Dom|Lissett|[email protected]|Male|Recruiter"
? },
? {
? ? "column_0": "2|Susan|Moogan|[email protected]|Female|Executive Secretary"
? },
? {
? ? "column_0": "3|Carl|Cotesford|[email protected]|Male|Tech Architect"
? },
? {
? ? "column_0": "4|Debor|Sprakes|[email protected]|Female|Account Coordinator"
? },
? {
? ? "column_0": "5|Priscella|Genny|[email protected]|Female|Environmental Tech"
? }
]
        

Now, we are getting only one key-value pair for each row. By default mule considers "," as delimiter for "application/csv" MIME type data. We need to explicitly define "|" as a delimiter in parameters under the MIME type.

  • Click on the file connector. Click on MIME Type. Click on add parameter. Set Key as "separator" and Value as "|". Save the changes.

No alt text provided for this image

Now, by using the same file, we will be able to read data properly as shown below.

INFO? 2021-07-27 23:51:57,557 [[MuleRuntime].uber.06: [csv-read].csv-readFlow.CPU_INTENSIVE @6efd4175] [processor: csv-readFlow/processors/1; event: 8c31a710-efb7-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
? {
? ? "column_0": "1",
? ? "column_1": "Dom",
? ? "column_2": "Lissett",
? ? "column_3": "[email protected]",
? ? "column_4": "Male",
? ? "column_5": "Recruiter"
? },
? {
? ? "column_0": "2",
? ? "column_1": "Susan",
? ? "column_2": "Moogan",
? ? "column_3": "[email protected]",
? ? "column_4": "Female",
? ? "column_5": "Executive Secretary"
? },
? {
? ? "column_0": "3",
? ? "column_1": "Carl",
? ? "column_2": "Cotesford",
? ? "column_3": "[email protected]",
? ? "column_4": "Male",
? ? "column_5": "Tech Architect"
? },
? {
? ? "column_0": "4",
? ? "column_1": "Debor",
? ? "column_2": "Sprakes",
? ? "column_3": "[email protected]",
? ? "column_4": "Female",
? ? "column_5": "Account Coordinator"
? },
? {
? ? "column_0": "5",
? ? "column_1": "Priscella",
? ? "column_2": "Genny",
? ? "column_3": "[email protected]",
? ? "column_4": "Female",
? ? "column_5": "Environmental Tech"
? }
]
        

References:

https://docs.mulesoft.com/mule-runtime/4.3/dataweave-formats-csv

Please share your valuable feedback ??
| LinkedIn | Twitter | Blog |
Irfan Shaik

Mulesoft Consultant at HCL Technologies

1 年

Hi Pranav Davar I have a requirement same as your last option .csv file without a header and "|" as a delimeter Tried in a same way and it didn't work for me. can you please help? This is my Input payload - 11281169|11091102|GULF COAST MARKET|2385 HIGHWAY 87|CRYSTAL BEACH|TX|77650| |4096842400|12040941614|5|0|0| |103|000980|S|04|0|D|0| | |0| |01|1000| |02|0|0|0|002|1001946|2023 TIER 1 (LF)|DPTX-W| |4416700000|0|0041|0|0|INDEPEND GROCERY|INDEPENDENT GROCERY|355|48|46526.88|0|LF| | | |GALVESTON| 11281611|11091102|BAYVUE GROCERY|1901 HWY 87|PORT BOLIVAR|TX|77650| |4096846220|17604685069|1|0|0| |112| |A|04|0|D|0| | |0| |01|1000| |01|HOU4287|0|0|002|1002433| |DPTX-W| |4416700000|0|0040|0|0|IBC|INDEPEND CONVENIENCE|355|48|0.00|0.00|SF| | | |GALVESTON| 11282565|11091102|DANNAY DONUTS|2124 HIGHWAY 87|CRYSTAL BEACH|TX|77650|KER|7136841778|32010714270|1|0|0| |214|000068|A|04|0|D|0| | |0| |01|1000| |01|0|0|0|002|1001231|REST/BARS/TAVERNS|DPTX-W| |4416700000|0|0040|0|0|ALL OTHER|QSR OTHER|355|48|0.00|0.00|SF| | | |GALVESTON|

  • 该图片无替代文字
回复
Mathias Schroeder

Especialista SAP-ABAP | desenvolvedor Mulesoft | Engenheiro de computa??o

1 年

Very useful! ??

Mohammed Shakir

MuleSoft Developer | MuleSoft Mentor

1 年

HI,if we want to define multiple delimiters then how do we do it like | and ;(semi-colon)etc

回复
Iskandar George

Solution Delivery Center Manager | Driving Operational Excellence & Client Satisfaction | Driving Operational Excellence | Strategic Planning & Execution | Process Improvement | Team Leader & Mentor

2 年

how to remove the autogenerated keys "column_0, columns_1,..."?? I need the value only

回复

要查看或添加评论,请登录

社区洞察

其他会员也浏览了