DAfR - Data Architect for Real {7}

~ 1

Gathering the right data is as crucial as asking the right questions

No matter how advanced your infrastructure and architecture are

~ 2

Make data governance a priority

Data must be high-quality, highly relevant, and targeted to a specific business need

~ 3

Speaking of databases: even in 2022, it's essential to keep in mind that minimizing the size of data types shortens the row length, which leads to better query performance, because more rows fit in each page and fewer pages have to be read

Please, use the smallest data type that works for your data
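
To get a feel for the numbers, here is a minimal Python sketch. It assumes a hypothetical three-column table and an 8 KB page, and it ignores row headers and other per-page overhead, so treat the results as rough orders of magnitude rather than what any specific engine will store.

```python
import struct

# Hypothetical table: (customer_id, quantity, status_code).
# "<" disables padding, so sizes reflect only the declared column widths.
WIDE_ROW = "<qqq"    # three 8-byte integers (BIGINT for every column)
NARROW_ROW = "<ihb"  # 4-byte INT, 2-byte SMALLINT, 1-byte TINYINT

PAGE_SIZE = 8192  # assumed page size in bytes


def rows_per_page(row_format: str, page_size: int = PAGE_SIZE) -> int:
    """Return how many fixed-length rows of this layout fit in one page."""
    return page_size // struct.calcsize(row_format)


for name, fmt in (("wide", WIDE_ROW), ("narrow", NARROW_ROW)):
    print(name, struct.calcsize(fmt), "bytes/row,", rows_per_page(fmt), "rows/page")
```

Under these assumptions the narrow layout packs roughly three times as many rows into a page, so a scan over the same number of rows touches far fewer pages.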

~ 4

Cleaning data is a key data skill because data naturally comes in messy and imperfect forms
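
As a small illustration, the sketch below cleans a handful of made-up contact records: it trims whitespace, normalizes case, drops rows with no email, and removes duplicates. The field names and the choice of email as the de-duplication key are assumptions for the example, not a general recipe.

```python
def clean_records(records):
    """Trim whitespace, normalize case, drop rows without an email, de-duplicate."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        if not email:      # skip rows missing the key field
            continue
        if email in seen:  # skip duplicates, keeping the first occurrence
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Made-up sample data with typical problems: stray spaces, mixed case,
# a duplicate, and a missing value.
raw = [
    {"name": "  ada LOVELACE ", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "Grace Hopper", "email": None},
    {"name": "alan turing", "email": " alan@example.com"},
]
print(clean_records(raw))
```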

~ 5

Choose a data pipeline orchestration tool

Most data solutions consist of repeated data processing operations, encapsulated in workflows

A pipeline orchestrator is a tool that helps to automate these workflows

Start by answering these questions:

  • Do you need big data capabilities for moving and transforming your data?
  • Do you require a managed service that can operate at scale?
  • Are some of your data sources located on-premises?
  • Is your source data stored in an HDFS filesystem?
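
To make the idea of a workflow concrete, here is a toy sketch of what an orchestrator does at its core: run tasks in dependency order. It uses Python's standard `graphlib` module (Python 3.9+). The task names and the `run_workflow` helper are hypothetical, and real orchestration tools add scheduling, retries, monitoring and much more on top of this.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run each task once, only after all tasks it depends on have finished."""
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        order.append(name)
    return order


log = []
# Hypothetical tasks: each one just records that it ran.
tasks = {
    "extract": lambda: log.append("extract"),
    "transform": lambda: log.append("transform"),
    "load": lambda: log.append("load"),
}
# Each key maps to the set of tasks that must run before it.
dependencies = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

print(run_workflow(tasks, dependencies))
```

Declaring dependencies as data, rather than hard-coding the call order, is what lets a tool decide what to run next and what can safely be re-run.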
