Azure Data Lake – U-SQL – SELECT Transformation Rowsets
JOYDEEP DAS
Project Manager at Cognizant | MVP - Data Platform | MVB DZone | Azure Data Engineer Associates | Azure DataBricks | PySpark | Spark | Azure DevOps
Introduction
In this article we discuss the basic technique of transforming rowsets with a U-SQL SELECT query. Hope it will be informative.
What Is a U-SQL SELECT Rowset Transformation
In my previous article we simply retrieved the data from the “SearchLog.tsv” file and wrote it to “SearchLog-scalar-variables.csv”. That was just a plain file-to-file movement of data.
Now we want to perform some extra operations on the rowset data before storing it in the final destination, for example filtering, grouping, or applying aggregate functions to numerical values.
When we perform this kind of operation on a rowset before moving it to the destination, it is called a SELECT transformation of rowsets.
Let’s Take a Simple Example to Understand It
Here we take the same example that we used in the previous article, which simply copied data from a file named “SearchLog.tsv” and stored it in “SearchLog-scalar-variables.csv”.
In the transformation part, we just filter the rowset by region. In other words, we use a Boolean expression in the WHERE clause of the SELECT statement. Feeling quite comfortable...
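The region-wise filter described above could look roughly like the following sketch. It is only an illustration under some assumptions: the column list and the input path follow the standard Microsoft SearchLog sample, and the region value "en-gb", the rowset names and the output file name are picked just for this example, so adjust them to your own account and data.

```usql
// Assumption: schema and path of the standard SearchLog sample file.
@searchlog =
    EXTRACT UserId      int,
            Start       DateTime,
            Region      string,
            Query       string,
            Duration    int?,
            Urls        string,
            ClickedUrls string
    FROM "/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

// Transformation: keep only the rows of one region.
// "en-gb" is an illustrative value, not taken from the article.
@filtered =
    SELECT Start,
           Region,
           Duration
    FROM @searchlog
    WHERE Region == "en-gb";

// Assumption: hypothetical output file name for the filtered rowset.
OUTPUT @filtered
    TO "/output/SearchLog-transform-rowsets.csv"
    USING Outputters.Csv();
```

Note that the WHERE clause uses the C# comparison operator == and not the single = of classic SQL, because U-SQL expressions follow C# syntax.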