???? ???????????????? ?????????????? 5 ???????????? ??????????


A “turnkey” SaaS data platform for your BI & ML projects

Strong integration between all workloads & artifacts

All resources are stored in data-mesh-style containers: the Workspaces (each belongs to a domain and an organization, and supports sharing)

About 80% of the features are available on the web (some pro-dev & Power BI Desktop features are missing)

But no worries, the Direct Lake model is manageable on the web

All Fabric data is stored in one place called OneLake

For lakehouses there is a OneLake file explorer to manage ==> files (CSV, Parquet) & SQL tables (Delta Lake files)

Pricing: reservation OR pay-as-you-go, and you can pause capacities (each SKU provides a quota of compute Capacity Units, measured per 30 seconds)

Pricing: OneLake data storage

You can scale a capacity up or down

I like how the data tools are proposed per persona

I appreciate both the LH organization (LH + SQL endpoint + semantic model) & the DWH organization (DWH + semantic model)

This SaaS offers a fantastic orchestration tool called Data Factory pipelines (aka the snake)

Pro or con: "all my eggs" in the same basket, as with SAP ERP ... big debate

Etc.

ML? Never tested

Copilot? Never tested, hoping it will provide best-practice controls & data quality checks

Real time (Eventstreams, KQL databases)? Never tested so far, hoping for strong Kafka (+ Event Hubs) support



Good points: Power BI, data warehouses, lakehouses, shortcuts, notebooks & the Direct Lake access mode are all great features

If we had to quibble, it would be nice to get record & array SQL column types on LH & DWH

And, at the same time, to authorize "shared" SQL views (LH & DWH) to access the underlying data (LH & DWH), in the same Workspace of course <== only about ten UPNs to manage on the bronze/staging/raw layers

Of course these authorized views would be shareable & shortcut-able




Pain points & not-so-good points so far are, in my opinion, the "low code" ingestion E.L (Extract & Load) tools and the Transformations area

The data engineering area needs strong improvement to be relevant on big data projects (how would you ingest the daily sales BI data of a Nasdaq-listed company, for example? Certainly not via notebooks, nor via DFGen2 & Data Factory in their current state.

And transforming data via stored procedures? Not sure)

?????? ??.?? ???? ?????? ???????? ???? ?????????:

Dataflows Gen2

So far some features are missing, like interface management, schema-less or not <== "the interface contract"

We need incremental ingestion (based on column values or file system dates) & deduplication (on primary keys) with native support (no workarounds)

Offer forms to deal with source APIs (continuation token, URL parameters, body parameters, chunking) and generate M functions

Simplify API ingestion (Salesforce, Graph, O365) via the low-code forms above

Unnest / flatten the JSON arrays, or not (if not, no worries: fill array & record type columns in the DWH/LH SQL schema)

Offer a forms SDK to the community to create custom forms & connectors like the above ==> target: generate M functions

Before these custom creations are offered on Fabric & PBI Desktop, we would wait for Microsoft to certify them (levels: "community", "microsoft" ..)

DFGen2 can accept parameters from the Data Factory orchestrator & return values
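To make the incremental ingestion and deduplication wish above concrete, here is a minimal Python sketch of the logic a native feature would have to cover. It is not a Fabric or Dataflows Gen2 API: the function name `incremental_load`, the `id` primary key and the `modified_at` watermark column are all assumptions made for illustration.

```python
from datetime import datetime


def incremental_load(source_rows, watermark, key="id", ts="modified_at"):
    """Return (rows_to_load, new_watermark).

    Keeps only the rows modified after `watermark` (incremental ingestion),
    then keeps the latest version of each primary key (deduplication).
    """
    latest = {}
    for row in source_rows:
        if row[ts] <= watermark:
            continue  # already ingested by a previous run
        current = latest.get(row[key])
        if current is None or row[ts] > current[ts]:
            latest[row[key]] = row  # the newest version of this key wins
    rows = sorted(latest.values(), key=lambda r: r[key])
    new_watermark = max((r[ts] for r in rows), default=watermark)
    return rows, new_watermark


# Hypothetical source extract: key 2 appears twice, key 1 is older than the watermark.
source = [
    {"id": 1, "amount": 10, "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "amount": 20, "modified_at": datetime(2024, 1, 3)},
    {"id": 2, "amount": 25, "modified_at": datetime(2024, 1, 4)},
    {"id": 3, "amount": 30, "modified_at": datetime(2024, 1, 5)},
]
rows, watermark = incremental_load(source, datetime(2024, 1, 2))
```

The returned watermark would be stored and passed to the next run, so each run only picks up what changed since the previous one.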


And we can use the Data Factory (first version in 2015, you see) copy task

DF & copy need serious grooming to be modern-data-stack compliant

Expectation: a new “low code” workload like Airbyte or Fivetran

Offer forms to deal with source APIs (continuation token, URL parameters, body parameters, chunking)

Simplify API ingestion (Salesforce, Graph, O365) via the low-code forms above

Offer a forms SDK to the community to create custom forms & connectors like the above

OR perhaps integrating Airbyte (the low-code E.L champion) as a workload could be a solution
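As an illustration of what such a low-code API form would have to generate behind the scenes, here is a minimal Python sketch of a continuation-token loop. Everything in it is hypothetical: `fetch_page` stands in for a real HTTP call, and the `items` / `continuation_token` field names are assumptions, since every API names them differently.

```python
def fetch_all(fetch_page, params=None, body=None, max_pages=1000):
    """Collect every record from a paginated API.

    `fetch_page(params, body, token)` is a hypothetical callable returning
    a dict like {"items": [...], "continuation_token": "..." or None}.
    """
    records, token = [], None
    for _ in range(max_pages):  # guard against an API that never stops paging
        page = fetch_page(params or {}, body or {}, token)
        records.extend(page["items"])
        token = page.get("continuation_token")
        if not token:
            break  # no token means this was the last page
    return records


# Fake three-page source standing in for a real HTTP endpoint.
_PAGES = {None: ([1, 2], "t1"), "t1": ([3, 4], "t2"), "t2": ([5], None)}


def fake_fetch(params, body, token):
    items, next_token = _PAGES[token]
    return {"items": items, "continuation_token": next_token}


all_records = fetch_all(fake_fetch, params={"top": 2})
```

A form would only need to ask for the URL, the parameters, the body and the name of the token field; the loop itself is the same for most sources.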


?????? "??: ????????????????????????????" ???? ???? ?????????????? ???????????????? ?????????????? ?????? ???? , ?????????????????? ???????? ??????

We manipulate SQL queries (views), notebooks (SQL & PySpark) or SQL stored procedures to do the job, but that feels a little poor in 2024 for all of Mr Kimball's patterns for SQL star & dimensional models (full, incremental, append fact table, merge dimensions, deduplicate, snapshot, slowly changing dimensions, denormalize dimensions)

Data quality checks

Expectation: a new workload like DBT

OR perhaps integrating DBT (the transformation champion) as a workload could be a solution
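To show why these patterns deserve first-class support, here is a minimal Python sketch of one of them, a type 2 slowly changing dimension merge. It is plain Python, not DBT or Fabric code, and the function name `scd2_merge` and the column names (`valid_from`, `valid_to`, `is_current`, `customer_id`, `city`) are assumptions chosen for the example.

```python
from datetime import date


def scd2_merge(dimension, incoming, key, tracked, today):
    """Apply a type 2 slowly changing dimension merge in place.

    dimension: list of dict rows carrying valid_from, valid_to, is_current.
    incoming:  list of dict rows from the source, one per business key.
    tracked:   columns whose change creates a new version of the row.
    """
    current = {r[key]: r for r in dimension if r["is_current"]}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # nothing changed, keep the current version
        if old is not None:
            old["valid_to"] = today  # close the previous version
            old["is_current"] = False
        dimension.append({**new, "valid_from": today,
                          "valid_to": None, "is_current": True})
    return dimension


dim = [{"customer_id": 1, "city": "Paris", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
updates = [{"customer_id": 1, "city": "Lyon"},
           {"customer_id": 2, "city": "Nantes"}]
scd2_merge(dim, updates, key="customer_id", tracked=["city"],
           today=date(2024, 6, 1))
```

Writing this by hand for every dimension, in SQL or in a notebook, is exactly the repetitive work a dedicated transformation workload would take over.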



Architects & administrators: we are also waiting for these features for big production projects

A Terraform integration for infrastructure as code ==> automatic setup of Workspaces and workloads

DevOps & CI/CD: all artifacts can use Deployment Pipelines & Git synchronization OR Azure DevOps (dedicated low-code tasks)

DataOps: waiting for best-practice metadata checks

Security: offer shared data sources to all ingestion artifacts in all Workspaces (to avoid handing server addresses, credentials, secrets & Key Vault access to several people)

-- Cloud & on-premises data source setups in a separate place on my Fabric tenant <== dedicated access officers, a small team

Plenty of Fabric Admin APIs (scanner for all artifacts, access, activities, capacity consumption <== CU overages, debt (+/-, balance sheet), throttling penalties, interactive & background consumption per artifact)

All Fabric WS activities/operations can be tracked with Log Analytics



Power BI:

In 2024 we also need (after lakehouses & DWH) Power BI to offer "single source of truth" semantic models (aka data products) <= data mesh / domain-type governance / avoid lots of duplicates

Impossible without a more robust & efficient composite technology with zero DAX limitations

Urgently waiting for a new gen 2 version of composite models (DirectQuery to PBI semantic models)

Moulay Salah , Idrissi

SAP Data Engineer(Arch/Dev) & Business/Functional/Data Analyst {Finance, OTC, P2P, R2R, SF, S/4, BW/4, CDS, SAC, BO, HANA, ABAP/AMDP, Fiori/UI5}

8 months

Christophe Hervouet Thank you for spreading knowledge and insights
