Pig Latin and its Operators
Apache Pig Latin

What is Pig Latin?

To analyze data in Hadoop with Apache Pig, we write scripts in the Pig Latin language. An interpreter layer first transforms the Pig Latin statements into MapReduce jobs, and Hadoop then processes these jobs.

Pig Latin is a fairly simple language with SQL-like semantics, which makes it productive to work with. It also contains a rich set of functions for data manipulation. Moreover, we can extend these functions easily by writing user-defined functions (UDFs) in Java, so the language is extensible by nature.

Data Model in Pig Latin

The data model of Pig is fully nested. The outermost structure of the Pig Latin data model is a relation, and a relation is a bag, where:

  • A bag is a collection of tuples.
  • A tuple is an ordered set of fields.
  • A field is a piece of data.
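
To make the nesting concrete, here is a small Pig Latin sketch. It assumes the comma-separated Employee_data.txt file and the schema used in the LOAD example in this article; the relation names employees and by_city are made up for illustration.

```pig
-- Sketch only: assumes Employee_data.txt exists with five comma-separated fields.
employees = LOAD 'Employee_data.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

-- Grouping nests the original tuples inside a bag, one bag per city.
by_city = GROUP employees BY city;

-- Prints the schema of by_city: the group key plus a bag of employee tuples.
DESCRIBE by_city;
```

Each tuple of by_city holds two fields: the group key and a bag that in turn contains tuples, which is what "fully nested" means in practice.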

Statements in Pig Latin

Statements are the basic constructs for processing data with Pig Latin.

  • Statements work with relations. They can also include expressions and schemas.
  • Every statement ends with a semicolon (;).
  • Through statements, we perform operations using the operators that Pig Latin offers.
  • Except for LOAD and STORE, every Pig Latin statement takes a relation as input and produces another relation as output.
  • When we enter a LOAD statement in the Grunt shell, only its semantic checking is carried out. To see the contents of the relation, we need to use the DUMP operator, because the MapReduce job that actually loads the data from the file system is run only when the dump operation is performed.
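
The points above can be sketched as a short script. This is an illustration only: the file name follows the LOAD example in this article, while the city value 'Pune', the relation names, and the output directory names_out are assumptions.

```pig
-- Sketch only: the input file, the city value and the output directory are assumptions.
employees = LOAD 'Employee_data.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

-- Relation in, relation out: keep only the rows for one city.
in_city = FILTER employees BY city == 'Pune';

-- Relation in, relation out: project two fields.
names = FOREACH in_city GENERATE id, firstname;

-- DUMP triggers execution and prints the resulting tuples to the console.
DUMP names;

-- STORE also triggers execution and writes the relation to the file system.
STORE names INTO 'names_out' USING PigStorage(',');
```

Until DUMP or STORE is reached, Pig only checks the statements; nothing is read from Employee_data.txt.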

Pig Latin Example

Here is a Pig Latin statement that loads data into Apache Pig.

  grunt> Employee_data = LOAD 'Employee_data.txt' USING PigStorage(',') AS
         (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Pig Latin Data types

The following is the list of simple data types in Pig Latin:

  • int

“int” represents a signed 32-bit integer.

For Example: 10

  • long

It represents a signed 64-bit integer.

For Example: 10L

  • float

This data type represents a 32-bit floating-point number.

For Example: 10.5F

  • double

“double” represents a 64-bit floating-point number.

For Example: 10.5

  • chararray

It represents a character array (string) in Unicode UTF-8 format.

For Example: 'Data Flair'

  • bytearray

This data type represents a byte array (blob).

  • boolean

“boolean” represents a Boolean value.

For Example: true / false

Note: Boolean values are case insensitive.

  • datetime

It represents a date-time.

For Example: 1970-01-01T00:00:00.000+00:00

  • biginteger

This data type represents a Java BigInteger.

For Example: 60708090709

  • bigdecimal

“bigdecimal” represents a Java BigDecimal.

For Example: 185.98376256272893883
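
As a rough illustration of how these types appear in a script, the sketch below declares a schema and applies explicit casts. The file sales.txt, its field names, and the relation names are hypothetical and not part of the original example.

```pig
-- Sketch only: 'sales.txt' and its field names are hypothetical.
sales = LOAD 'sales.txt' USING PigStorage(',')
    AS (id:long, item:chararray, qty:int, price:double, paid:boolean);

-- Explicit casts convert between simple types.
converted = FOREACH sales GENERATE
    id,
    item,
    (double)qty * price AS total,
    (chararray)qty AS qty_text;

-- Shows the resulting field names and types.
DESCRIBE converted;
```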

i. Complex Types

  • Tuple

A tuple is an ordered set of fields.

For Example: (Ankit, 32)

  • Bag

A bag is a collection of tuples.

For Example: {(Ankit,32),(Neha,30)}

  • Map

A map is a set of key-value pairs.

For Example: ['name'#'Ankit', 'age'#32]
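
The following sketch shows one way to read values out of the three complex types. It assumes a hypothetical tab-separated file people.txt whose three columns hold a tuple, a bag and a map written in the notation shown above; the field and relation names are made up.

```pig
-- Sketch only: 'people.txt' is a hypothetical tab-separated file whose columns
-- hold a tuple, a bag and a map in the notation shown above.
people = LOAD 'people.txt'
    AS (person:tuple(name:chararray, age:int),
        friends:bag{f:tuple(name:chararray, age:int)},
        info:map[]);

-- person.name reads a field of a tuple, info#'name' looks up a map key,
-- and COUNT(friends) counts the tuples in the bag.
summary = FOREACH people GENERATE person.name, info#'name', COUNT(friends);

DUMP summary;
```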

ii. Null Values

Values of all the above data types can be NULL. Pig treats null values in the same way that SQL does.

A null value can be an unknown value or a non-existent value, and we use it as a placeholder for optional values. Nulls can occur naturally in the data, or they can be the result of an operation.
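
A short sketch of handling nulls explicitly, again assuming the Employee_data.txt file and schema from the LOAD example in this article; the relation names and the placeholder 'unknown' are illustrative choices.

```pig
-- Sketch only: assumes Employee_data.txt and the schema from the LOAD example.
employees = LOAD 'Employee_data.txt' USING PigStorage(',')
    AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

-- A comparison against null yields null, so test for null explicitly.
with_phone = FILTER employees BY phone IS NOT NULL;

-- Replace a missing city with a placeholder value.
filled = FOREACH employees GENERATE id, (city IS NULL ? 'unknown' : city) AS city;

DUMP filled;
```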

Pig Latin Arithmetic Operators

Here is the list of arithmetic operators of Pig Latin. Let's assume A = 20 and B = 40.

  • +

Addition: adds the values on either side of the operator.

For Example: A + B gives 60.

  • -

Subtraction: subtracts the right-hand operand from the left-hand operand.

For Example: A - B gives -20.

  • *

Multiplication: multiplies the values on either side of the operator.

For Example: A * B gives 800.

  • /

Division: divides the left-hand operand by the right-hand operand.

For Example: B / A gives 2.

  • %

Modulus: divides the left-hand operand by the right-hand operand and returns the remainder.

For Example: B % A gives 0.

  • ? :

Bincond: evaluates a Boolean expression and returns one of two values. It has three operands:

variable x = (expression) ? value1 if true : value2 if false.

For Example:

  b = (a == 1) ? 20 : 40;

  If a == 1, the value of b is 20.
  If a != 1, the value of b is 40.

  • CASE WHEN THEN ELSE END

Case: equivalent to nested bincond operators.

For Example:

  CASE f2 % 2
    WHEN 0 THEN 'even'
    WHEN 1 THEN 'odd'
  END
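
The operators above can be tried together in one FOREACH statement. The sketch below assumes a hypothetical file numbers.txt containing the single line 20,40, so that a and b match the values of A and B used in the examples; the alias names are made up, and the CASE expression needs a Pig release that supports it (0.12 or later).

```pig
-- Sketch only: 'numbers.txt' is a hypothetical file with one line: 20,40
nums = LOAD 'numbers.txt' USING PigStorage(',') AS (a:int, b:int);

results = FOREACH nums GENERATE
    a + b AS added,
    a - b AS subtracted,
    a * b AS multiplied,
    b / a AS divided,
    b % a AS remainder,
    -- bincond: a is 20, not 1, so the second value is chosen
    (a == 1 ? 20 : 40) AS picked,
    -- CASE: b is 40, so b % 2 is 0
    (CASE b % 2 WHEN 0 THEN 'even' WHEN 1 THEN 'odd' END) AS parity;

DUMP results;
```

With that input, the dump should show the single tuple (60,-20,800,2,0,40,even). Note that in an actual script, bincond and CASE appear as expressions inside a statement such as FOREACH ... GENERATE rather than as standalone assignments.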
