???? ???????????????? ?????????????? 5 ???????????? ??????????
Christophe Hervouet
Data Strategy and Consulting: Modern BI Data Platforms Advisor (Organization, Governance and Architectures)
- A "turnkey" SaaS data platform for your BI & ML projects
- Strong integration between all workloads & artifacts
- All resources are stored in data mesh places: the Workspaces (each belongs to a domain; organization, sharing)
- 80% on the web (some pro dev & Power BI Desktop features are missing), but no worries: Direct Lake models are manageable on the web
- All Fabric data is stored in one place called OneLake
- For lakehouses there is a OneLake file explorer to manage files (CSV, Parquet) & SQL tables (Delta Lake files)
- Pricing: reservation OR pay-as-you-go; you can pause capacities (compute is provided as a Capacity Units quota, depending on the SKU, per 30 seconds)
- Pricing: OneLake data storage
- You can scale a capacity down or up
- I like how the data tools are proposed per persona
- I appreciate both the LH organization (LH + SQL endpoint + semantic model) & the DWH organization (DWH + semantic model)
- This SaaS offers a fantastic orchestrator tool called Data Factory pipeline (aka the snake)
- Pro or con: "all my eggs" in the same basket, like for SAP ERP ... big debate
ETC ...
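To make the Capacity Units quota from the pricing point a bit more concrete, here is a minimal sketch. It assumes that an F SKU named "F<n>" provides n Capacity Units and that the quota is counted over 30-second windows; the function names are hypothetical and this is not an official pricing calculator.

```python
# Assumption: an F SKU named "F<n>" provides n Capacity Units (CUs),
# and the compute quota is counted over 30-second windows.
WINDOW_SECONDS = 30


def cu_seconds_per_window(sku: str, window_seconds: int = WINDOW_SECONDS) -> int:
    """Return the CU-seconds available in one window for an F SKU."""
    if not sku.upper().startswith("F") or not sku[1:].isdigit():
        raise ValueError(f"unsupported SKU name: {sku!r}")
    capacity_units = int(sku[1:])
    return capacity_units * window_seconds


def utilisation(consumed_cu_seconds: float, sku: str) -> float:
    """Return the share of one window's quota that was consumed (1.0 = 100%)."""
    return consumed_cu_seconds / cu_seconds_per_window(sku)
```

With these assumptions, a workload that consumes 960 CU-seconds in one window uses half of an F64 window.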
Real time (Eventstreams, KQL databases)? Never tested up to now; hoping for strong Kafka (+ Event Hubs) support
ML? Never tested
Copilot? Never tested; hoping it will provide best-practice controls & data quality checks
- Good points: Power BI, data warehouses, lakehouses, shortcuts, notebooks & the Direct Lake access mode are all great features
- If we had to quibble, it would be good to get record & array column types in LH & DWH SQL
- And at the same time to authorize "shared" SQL views (LH & DWH) to access the underlying data (LH & DWH), in the same Workspace of course <== only around ten UPNs to manage on the bronze/staging/raw layers
- Of course these authorized views would be shareable & shortcut-able
- Pain points & not-so-good points up to now, in my opinion: the "low code" ingestion E.L (Extract & Load) tools and the Transformation area
The data engineering area needs strong improvement to be relevant for big data projects (how do you ingest the daily sales BI data of a Nasdaq company, for example? Certainly not via notebooks, nor via DFGen2 & Data Factory in their current state. Also, transform data via stored procedures? Not sure.)
?????? ??.?? ???? ?????? ???????? ???? ?????????:
Dataflow Gen2
- Up to now some features are missing, like interface management, schema-less or not <== "the interface contract"
- We need incremental ingestion (based on column values or file system dates) & deduplication (on primary keys) with native support (no workarounds)
- Offer forms to deal with source APIs (continuation token, URL parameters, body parameters, chunking) and generate M functions
- Simplify API (Salesforce, Graph, O365) ingestion via the above low-code forms
- Unnest / flatten the JSON arrays, or not (if not, no worries: fill DWH/LH SQL schema columns of array & record types)
- Offer a forms SDK to the community to create custom forms & connectors like the above ==> target: generate M functions. Before these custom creations are offered on Fabric & PBI Desktop, we would wait for Microsoft to certify them (levels: "community", "microsoft" ...)
- DFGen2 can accept parameters from the Data Factory orchestrator & return values
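The incremental ingestion & deduplication wish in the list above can be sketched in a few lines. This is only an illustration in plain Python, not a Fabric or Dataflow Gen2 feature: the function name `incremental_load` and the column names `id` and `updated_at` are hypothetical, and the watermark is assumed to be the highest timestamp already ingested.

```python
from datetime import datetime


def incremental_load(rows, watermark, key="id", ts="updated_at"):
    """Keep rows newer than the watermark, then keep the latest row per key.

    Returns the deduplicated new rows and the watermark for the next run.
    """
    latest = {}
    for row in rows:
        if row[ts] <= watermark:
            continue  # already ingested by a previous run
        current = latest.get(row[key])
        if current is None or row[ts] > current[ts]:
            latest[row[key]] = row  # newer version of the same primary key
    new_rows = sorted(latest.values(), key=lambda r: r[key])
    new_watermark = max((r[ts] for r in new_rows), default=watermark)
    return new_rows, new_watermark


source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "amount": 10},
    {"id": 1, "updated_at": datetime(2024, 1, 3), "amount": 12},
    {"id": 2, "updated_at": datetime(2024, 1, 2), "amount": 5},
    {"id": 3, "updated_at": datetime(2023, 12, 31), "amount": 7},
]
new_rows, next_watermark = incremental_load(source, datetime(2024, 1, 1))
```

Here only the rows changed after the watermark are kept, and the two versions of `id` 1 collapse into the most recent one. A native, low-code equivalent of this is what the list asks for.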
- And we can use the Data Factory (first version in 2015, you see) copy task
- DF & copy need strong grooming to be modern data stack compliant
Expectation: a new “low code” workload like Airbyte or Fivetran
- Offer forms to deal with source APIs (continuation token, URL parameters, body parameters, chunking)
- Simplify API (Salesforce, Graph, O365) ingestion via the above low-code forms
- Offer a forms SDK to the community to create custom forms & connectors like the above
OR perhaps integrating Airbyte (the low-code E.L champion) as a workload could be a solution
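To show what such a form would have to generate for a continuation-token API, here is a minimal sketch of the loop. Everything in it is an assumption for illustration: the page shape (`items`, `continuation_token`), the function `iter_records`, and the in-memory `fake_api` that stands in for a real HTTP call.

```python
from typing import Callable, Iterator, Optional

# Hypothetical page shape: {"items": [...], "continuation_token": "..." or None}
Page = dict


def iter_records(
    fetch_page: Callable[[Optional[str]], Page], max_pages: int = 1000
) -> Iterator[dict]:
    """Yield every record, following the continuation token until it is empty."""
    token = None
    for _ in range(max_pages):
        page = fetch_page(token)
        yield from page.get("items", [])
        token = page.get("continuation_token")
        if not token:
            return  # last page reached
    raise RuntimeError("max_pages reached; the source may be looping")


def fake_api(token: Optional[str]) -> Page:
    """Stand-in for a real API call; returns two pages of records."""
    pages = {
        None: {"items": [{"id": 1}, {"id": 2}], "continuation_token": "p2"},
        "p2": {"items": [{"id": 3}], "continuation_token": None},
    }
    return pages[token]


records = list(iter_records(fake_api))
```

The `max_pages` guard is a design choice to avoid an endless loop when a source keeps returning the same token. A low-code form would ideally expose it as a simple setting, next to the URL and body parameters.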
?????? "??: ????????????????????????????" ???? ???? ?????????????? ???????????????? ?????????????? ?????? ???? , ?????????????????? ???????? ??????
- We manipulate SQL queries (views), notebooks (SQL & PySpark) or SQL stored procedures to do the job, but that sounds a little poor in 2024 for all the Kimball patterns for SQL star & dimensional models (full, incremental, append fact table, merge dimensions, deduplicate, snapshot, slowly changing dimension, denormalize dimensions)
- Data quality checks
- Expectation: a new workload like dbt
OR perhaps integrating dbt (the transformation champion) as a workload could be a solution
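As an example of why these patterns deserve native support, here is a minimal sketch of one of them, the slowly changing dimension (type 2), written by hand in plain Python. The function `apply_scd2` and the columns `valid_from` / `valid_to` are hypothetical names chosen for the illustration; a transformation workload would generate the equivalent SQL merge instead.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list with type 2 history applied.

    A changed row closes the current version and appends a new one;
    an unchanged row is left alone; an unknown key is inserted.
    """
    result = [dict(row) for row in dimension]  # do not mutate the input
    current = {row[key]: row for row in result if row["valid_to"] is None}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # no change on the tracked columns
        if old is not None:
            old["valid_to"] = today  # close the previous version
        version = {
            key: new[key],
            **{c: new[c] for c in tracked},
            "valid_from": today,
            "valid_to": None,
        }
        result.append(version)
        current[new[key]] = version
    return result


dim = [{"customer_id": 1, "city": "Paris",
        "valid_from": date(2023, 1, 1), "valid_to": None}]
updates = [{"customer_id": 1, "city": "Lyon"},
           {"customer_id": 2, "city": "Nantes"}]
out = apply_scd2(dim, updates, "customer_id", ["city"], date(2024, 6, 1))
```

Running the same updates a second time adds nothing, which is the kind of idempotent behavior one expects from a declarative transformation tool.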
- Architects & administrators: we are also waiting for these features for big production projects
- A Terraform integration for infra as code ==> automatic setup of Workspaces and workloads
- DevOps & CI/CD: all artifacts can use Deployment Pipelines & Git synchronization OR Azure DevOps (dedicated low-code tasks)
- DataOps: waiting for best-practice metadata checks
- Security: offer shared data sources to all ingestion artifacts in all Workspaces (to avoid transmitting server addresses, credentials, secrets & Key Vault access to several people)
-- Cloud & on-premises data source setups in a separate place on my Fabric tenant <== dedicated access officers, a small team
- A full set of Fabric Admin APIs (scan all artifacts, access, activities, capacity consumption <== CU overages, debt (+/-, balance sheet), throttling penalties, artifact consumption, interactive & background)
- All Fabric WS activities/operations can be tracked with Log Analytics
- Power BI:
In 2024 we also need (after Lakehouses & DWH) to offer Power BI "single source of truth" semantic models (aka data products) <= data mesh / domain-type governance / avoid lots of duplicates
- Impossible without a more robust & efficient composite technology, with zero DAX limitations
Urgently waiting for a new "gen 2" version of composite models (DirectQuery to PBI semantic models)