Columnar file format

RCFile (Record Columnar File):

Behavior: RCFiles are flat files consisting of binary key-value pairs.

Read/write: RCFile was developed for faster reads, at the cost of write performance.

Compression: Provides significant block compression and can achieve a high compression ratio.

Splittable: Yes

Schema evolution: RCFile was designed mainly for faster reads, so it does not support schema evolution.
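
The row-columnar idea behind RCFile can be sketched in a few lines of Python: rows are first cut into row groups, and each group is then stored column by column. This is only an illustration of the layout, not the real on-disk format; the helper names `to_row_groups` and `read_column`, the sample data, and the group size are assumptions made up for this example, and every row is assumed to have the same fields.

```python
from typing import Any


def to_row_groups(rows: list[dict[str, Any]], group_size: int) -> list[dict[str, list[Any]]]:
    """Split rows into row groups, then lay out each group column by column.

    Hypothetical helper: assumes every row has the same keys as the first row.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]  # horizontal split (row group)
        # Vertical split: one list of values per column within this group.
        groups.append({name: [row[name] for row in chunk] for name in chunk[0]})
    return groups


def read_column(groups: list[dict[str, list[Any]]], name: str) -> list[Any]:
    """Read a single column across all row groups without touching other columns."""
    values = []
    for group in groups:
        values.extend(group[name])
    return values


rows = [
    {"id": 1, "city": "Pune", "amount": 10.0},
    {"id": 2, "city": "Delhi", "amount": 12.5},
    {"id": 3, "city": "Pune", "amount": 7.25},
]
groups = to_row_groups(rows, group_size=2)
cities = read_column(groups, "city")
```

Reading only `city` touches one list per row group, which is the intuition behind the faster reads; the extra regrouping work on the way in is the intuition behind the slower writes.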

---------------------------------------------------------------------------------------------------------

Parquet file: 

Behavior: Parquet stores nested data structures in a flat columnar format.

Read/write: Good for faster reads.

Compression: Supports compression, most commonly with the Snappy algorithm.

Splittable: Parquet files are conditionally splittable.

Schema evolution: Limited schema evolution (for example, adding new columns).
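
As a rough illustration of "nested data in a flat columnar format", the sketch below flattens nested records into dotted column paths and stores one list of values per path. It is a simplification under stated assumptions: real Parquet also handles repeated fields and encodes nesting with repetition and definition levels, which are skipped here. The function names `flatten` and `to_columns` and the sample records are made up for this example.

```python
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn a nested dict into a flat dict keyed by dotted paths (hypothetical helper)."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # recurse into nested structs
        else:
            flat[path] = value  # leaf value becomes one cell of a column
    return flat


def to_columns(records: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Store flattened records column by column; missing leaves become None."""
    flat_records = [flatten(r) for r in records]
    paths: list[str] = []
    for flat in flat_records:
        for path in flat:
            if path not in paths:
                paths.append(path)  # keep first-seen column order
    return {path: [flat.get(path) for flat in flat_records] for path in paths}


records = [
    {"id": 1, "user": {"name": "Asha", "address": {"city": "Pune"}}},
    {"id": 2, "user": {"name": "Ravi", "address": {"city": "Delhi"}}},
]
columns = to_columns(records)
```

Each leaf of the nested structure ends up as its own column (`user.address.city`, for instance), so a query that needs only one leaf can skip the rest.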

# Bigdata
