Reading .csv files with and without headers using different delimiters in Mule 4
Pranav Davar
Flat files, especially .csv files, are very common for data loading and data transfer. We often come across scenarios where we need to read a .csv file that may or may not contain a header row. There is also a chance the .csv file does not use "," (comma) as the delimiter, but "|" (pipe) or some other character instead. This article discusses various ways to read such files in Mule 4 and produce a JSON output from them.
Mule Flow
This will be a simple flow: we read a file using the File connector, convert the data to JSON using a Transform Message component, and finally log the JSON payload to the console using a Logger. We are not doing any complex transformation here; it is essentially a MIME type conversion from application/csv to application/json, which helps in understanding the structure of the input data.
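The flow described above can be sketched in Mule XML roughly as follows. This is a minimal, illustrative configuration only: the Scheduler trigger, the file path, and the File_Config global element name are assumptions, not taken from the original project.

```xml
<flow name="csv-readFlow">
    <!-- Trigger the flow periodically (any source would do; a Scheduler is assumed here) -->
    <scheduler>
        <scheduling-strategy>
            <fixed-frequency frequency="60000"/>
        </scheduling-strategy>
    </scheduler>
    <!-- Read the .csv file; outputMimeType tells Mule how to parse the content -->
    <file:read config-ref="File_Config" path="employeeData.csv"
               outputMimeType="application/csv"/>
    <!-- Convert the parsed rows to JSON -->
    <ee:transform>
        <ee:message>
            <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
        </ee:message>
    </ee:transform>
    <!-- Log the JSON payload to the console -->
    <logger level="INFO" message="#[payload]"/>
</flow>
```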
Steps:
.csv file having header row in the file content
Let us take the below employeeData.csv file as an example. For simplicity, we are taking dummy data for only 5 employees.
id,first name,last name,email,gender,designation
1,Dom,Lissett,[email protected],Male,Recruiter
2,Susan,Moogan,[email protected],Female,Executive Secretary
3,Carl,Cotesford,[email protected],Male,Tech Architect
4,Debor,Sprakes,[email protected],Female,Account Coordinator
5,Priscella,Genny,[email protected],Female,Environmental Tech
In this example, the data in the first row signifies the headers, separated by "," as the delimiter. All the other rows represent employee data. On running the application, the console output looks like this:
INFO  2021-07-27 23:08:55,916 [[MuleRuntime].uber.03: [csv-read].csv-readFlow.CPU_INTENSIVE @1e42a374] [processor: csv-readFlow/processors/1; event: 895d1200-efb1-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
  {
    "id": "1",
    "first name": "Dom",
    "last name": "Lissett",
    "email": "[email protected]",
    "gender": "Male",
    "designation": "Recruiter"
  },
  {
    "id": "2",
    "first name": "Susan",
    "last name": "Moogan",
    "email": "[email protected]",
    "gender": "Female",
    "designation": "Executive Secretary"
  },
  {
    "id": "3",
    "first name": "Carl",
    "last name": "Cotesford",
    "email": "[email protected]",
    "gender": "Male",
    "designation": "Tech Architect"
  },
  {
    "id": "4",
    "first name": "Debor",
    "last name": "Sprakes",
    "email": "[email protected]",
    "gender": "Female",
    "designation": "Account Coordinator"
  },
  {
    "id": "5",
    "first name": "Priscella",
    "last name": "Genny",
    "email": "[email protected]",
    "gender": "Female",
    "designation": "Environmental Tech"
  }
]
Data in the first row became the keys for the JSON output.
.csv file without header row in the file content
Let us assume there is no header row in the content of employeeData.csv:
1,Dom,Lissett,[email protected],Male,Recruiter
2,Susan,Moogan,[email protected],Female,Executive Secretary
3,Carl,Cotesford,[email protected],Male,Tech Architect
4,Debor,Sprakes,[email protected],Female,Account Coordinator
5,Priscella,Genny,[email protected],Female,Environmental Tech
If we run the application with the same configuration, it will take the first row as the keys, and the console output will look like this:
INFO  2021-07-27 23:21:15,609 [[MuleRuntime].uber.03: [csv-read].csv-readFlow.CPU_INTENSIVE @1e42a374] [processor: csv-readFlow/processors/1; event: 4254cbd0-efb3-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
  {
    "1": "2",
    "Dom": "Susan",
    "Lissett": "Moogan",
    "[email protected]": "[email protected]",
    "Male": "Female",
    "Recruiter": "Executive Secretary"
  },
  {
    "1": "3",
    "Dom": "Carl",
    "Lissett": "Cotesford",
    "[email protected]": "[email protected]",
    "Male": "Male",
    "Recruiter": "Tech Architect"
  },
  {
    "1": "4",
    "Dom": "Debor",
    "Lissett": "Sprakes",
    "[email protected]": "[email protected]",
    "Male": "Female",
    "Recruiter": "Account Coordinator"
  },
  {
    "1": "5",
    "Dom": "Priscella",
    "Lissett": "Genny",
    "[email protected]": "[email protected]",
    "Male": "Female",
    "Recruiter": "Environmental Tech"
  }
]
By default, for the "application/csv" MIME type, Mule considers the first row to be the header row. In cases like this we don't want the first row treated as a header, so we set the "header" parameter to false in the parameters under the MIME type of the Read operation.
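In XML, this corresponds to passing `header=false` as a parameter of the output MIME type on the Read operation. A sketch (file path and config name are illustrative):

```xml
<!-- Tell the CSV reader there is no header row -->
<file:read config-ref="File_Config" path="employeeData.csv"
           outputMimeType="application/csv; header=false"/>
```

In Anypoint Studio, the same effect is achieved on the Read operation's MIME Type tab by setting the MIME type to application/csv and adding a parameter with key `header` and value `false`.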
Now, using the same file, we are able to read the data properly. In this case, however, autogenerated keys such as column_0, column_1, etc. are populated. Below is a sample log snippet.
INFO  2021-07-27 23:35:57,158 [[MuleRuntime].uber.05: [csv-read].csv-readFlow.CPU_INTENSIVE @6fbaa2fe] [processor: csv-readFlow/processors/1; event: 4fbf0590-efb5-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
  {
    "column_0": "1",
    "column_1": "Dom",
    "column_2": "Lissett",
    "column_3": "[email protected]",
    "column_4": "Male",
    "column_5": "Recruiter"
  },
  {
    "column_0": "2",
    "column_1": "Susan",
    "column_2": "Moogan",
    "column_3": "[email protected]",
    "column_4": "Female",
    "column_5": "Executive Secretary"
  },
  {
    "column_0": "3",
    "column_1": "Carl",
    "column_2": "Cotesford",
    "column_3": "[email protected]",
    "column_4": "Male",
    "column_5": "Tech Architect"
  },
  {
    "column_0": "4",
    "column_1": "Debor",
    "column_2": "Sprakes",
    "column_3": "[email protected]",
    "column_4": "Female",
    "column_5": "Account Coordinator"
  },
  {
    "column_0": "5",
    "column_1": "Priscella",
    "column_2": "Genny",
    "column_3": "[email protected]",
    "column_4": "Female",
    "column_5": "Environmental Tech"
  }
]
.csv file without header row and with "|" as the delimiter
If we replace the "," delimiter with "|" in employeeData.csv:
1|Dom|Lissett|[email protected]|Male|Recruiter
2|Susan|Moogan|[email protected]|Female|Executive Secretary
3|Carl|Cotesford|[email protected]|Male|Tech Architect
4|Debor|Sprakes|[email protected]|Female|Account Coordinator
5|Priscella|Genny|[email protected]|Female|Environmental Tech
On running the same Mule application with the above data, the output for employeeData.csv will be:
INFO  2021-07-27 23:42:06,938 [[MuleRuntime].uber.05: [csv-read].csv-readFlow.CPU_INTENSIVE @6fbaa2fe] [processor: csv-readFlow/processors/1; event: 2c2f0f20-efb6-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
  {
    "column_0": "1|Dom|Lissett|[email protected]|Male|Recruiter"
  },
  {
    "column_0": "2|Susan|Moogan|[email protected]|Female|Executive Secretary"
  },
  {
    "column_0": "3|Carl|Cotesford|[email protected]|Male|Tech Architect"
  },
  {
    "column_0": "4|Debor|Sprakes|[email protected]|Female|Account Coordinator"
  },
  {
    "column_0": "5|Priscella|Genny|[email protected]|Female|Environmental Tech"
  }
]
Now we are getting only one key-value pair for each row. By default, Mule considers "," as the delimiter for "application/csv" MIME type data. We need to explicitly define "|" as the delimiter using the "separator" parameter under the MIME type.
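In XML, both parameters can be combined on the Read operation's output MIME type. A sketch (file path and config name are illustrative):

```xml
<!-- No header row, and '|' instead of ',' as the delimiter -->
<file:read config-ref="File_Config" path="employeeData.csv"
           outputMimeType="application/csv; header=false; separator=|"/>
```

In Anypoint Studio, this corresponds to two entries in the MIME Type parameters table: `header` → `false` and `separator` → `|`.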
Now, using the same file, we are able to read the data properly, as shown below.
INFO  2021-07-27 23:51:57,557 [[MuleRuntime].uber.06: [csv-read].csv-readFlow.CPU_INTENSIVE @6efd4175] [processor: csv-readFlow/processors/1; event: 8c31a710-efb7-11eb-8134-d4d25289cb8d] org.mule.runtime.core.internal.processor.LoggerMessageProcessor: [
  {
    "column_0": "1",
    "column_1": "Dom",
    "column_2": "Lissett",
    "column_3": "[email protected]",
    "column_4": "Male",
    "column_5": "Recruiter"
  },
  {
    "column_0": "2",
    "column_1": "Susan",
    "column_2": "Moogan",
    "column_3": "[email protected]",
    "column_4": "Female",
    "column_5": "Executive Secretary"
  },
  {
    "column_0": "3",
    "column_1": "Carl",
    "column_2": "Cotesford",
    "column_3": "[email protected]",
    "column_4": "Male",
    "column_5": "Tech Architect"
  },
  {
    "column_0": "4",
    "column_1": "Debor",
    "column_2": "Sprakes",
    "column_3": "[email protected]",
    "column_4": "Female",
    "column_5": "Account Coordinator"
  },
  {
    "column_0": "5",
    "column_1": "Priscella",
    "column_2": "Genny",
    "column_3": "[email protected]",
    "column_4": "Female",
    "column_5": "Environmental Tech"
  }
]
Please share your valuable feedback!