Bridging the (Analytics) Culinary Gap - Part 3
Jessica Rudd, PhD, MPH, PStat®
Senior Data Engineer - Analytics Modernization Capability Lead @ Intuit Mailchimp | Pipelines and modeling
Let's break open the data pantry!
Happy Monday! Another week of tasty treats! Today’s Data Bytes calories: 890 words … 5 minutes.
Join Google Developers Group Atlanta NOW to keep up to date on our 2025 event calendar! Here’s some of what’s in store (details TBD):
What I’m reading
What I’m working on - currently working on my edge angles in Zermatt, Switzerland (I use Carv - check it out!).
One Big Thing: Bridging the (Analytics) Culinary Gap (Part 3)
With our recipes crafted in Dataform and our sous chef Composer/Airflow orchestrating the cooking, we need a way to keep our data kitchen organized and efficient. That's where Dataplex comes in – our pantry organizer extraordinaire!
Why Dataplex is Essential for a Tidy Kitchen
Imagine a pantry where ingredients are scattered haphazardly, labels are missing, and expiration dates are a mystery. Chaos reigns! Dataplex brings order to our data pantry, ensuring our ingredients are fresh, easily accessible, and well-documented.
Here's how Dataplex keeps our data kitchen running smoothly:
Dataplex in Action: A Well-Organized Pantry
Let's see how Dataplex keeps our data kitchen tidy. Imagine we have a variety of data sources – customer orders, sales transactions, and marketing campaigns. Dataplex helps us organize these ingredients into logical shelves, making it easy to find what we need.
Setting up your pantry with Dataplex
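To make the shelving idea concrete, here is a minimal sketch of the lake, zone, and asset hierarchy that Dataplex uses to organize data. It is plain Python for illustration only, not the Dataplex API: the classes, the "retail-lake" name, the zone names, and the resource paths are all hypothetical stand-ins for the three data sources mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str      # hypothetical asset name
    resource: str  # hypothetical pointer to a bucket or dataset


@dataclass
class Zone:
    name: str
    kind: str  # "RAW" or "CURATED"
    assets: list = field(default_factory=list)


@dataclass
class Lake:
    name: str
    zones: dict = field(default_factory=dict)

    def add_zone(self, name: str, kind: str) -> Zone:
        """Create a shelf (zone) in the pantry (lake) and return it."""
        zone = Zone(name=name, kind=kind)
        self.zones[name] = zone
        return zone

    def paths(self) -> list:
        """List every ingredient as lake/zone/asset, in insertion order."""
        return [
            f"{self.name}/{zone.name}/{asset.name}"
            for zone in self.zones.values()
            for asset in zone.assets
        ]


def build_pantry() -> Lake:
    """Assemble a toy pantry; every name here is an assumption."""
    lake = Lake("retail-lake")
    raw = lake.add_zone("raw-zone", "RAW")
    curated = lake.add_zone("curated-zone", "CURATED")
    raw.assets.append(Asset("customer-orders", "gs://example-bucket/orders"))
    raw.assets.append(Asset("sales-transactions", "gs://example-bucket/sales"))
    curated.assets.append(Asset("marketing-campaigns", "example_dataset.campaigns"))
    return lake
```

Calling `build_pantry().paths()` lists each ingredient by shelf, which is roughly the mental model to carry into the real Dataplex console: raw ingredients on one shelf, prepped ingredients on another.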
Navigating your pantry
Once you have your assets organized, Dataplex provides a clear view of your data landscape.
In this screenshot, we see how Dataplex automatically discovers the number of tables and the total data size of the assets in our lake. In Dataplex Discover (search), we can easily navigate through our data pantry, finding the specific ingredients (tables) we need for our ETL recipes.
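The discovery summary and search described here boil down to two simple operations: counting and sizing what is on the shelves, and filtering by name. The sketch below imitates both in plain Python. It does not call Dataplex; the `catalog` entries, table names, and byte sizes are made-up illustration values.

```python
# Hypothetical catalog entries; the sizes are invented for illustration.
catalog = [
    {"table": "customer_orders", "zone": "raw", "size_bytes": 1_200_000},
    {"table": "sales_transactions", "zone": "raw", "size_bytes": 3_400_000},
    {"table": "marketing_campaigns", "zone": "curated", "size_bytes": 400_000},
]


def summarize(entries):
    """Return the table count and total size, like a pantry inventory."""
    return {
        "tables": len(entries),
        "total_bytes": sum(entry["size_bytes"] for entry in entries),
    }


def search(entries, keyword):
    """Return table names containing the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [entry["table"] for entry in entries if needle in entry["table"].lower()]
```

For example, `search(catalog, "sales")` returns only the sales table, which is the same "find the ingredient by name" move the search page gives you without writing any code.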
Dataplex also provides detailed metadata for each asset, like a comprehensive nutritional label.
This metadata helps us understand the origin, quality, and transformation history of our data, ensuring we're using the freshest and most reliable ingredients in our recipes.
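If it helps to picture the nutritional label as data, here is one possible shape for it. This is a sketch under assumptions, not Dataplex's actual metadata schema: the field names, the seven-day freshness window, and the quality score calculation are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class NutritionLabel:
    table: str
    origin: str                 # where the ingredient came from
    last_updated: date          # used for the freshness check
    quality_checks_passed: int
    quality_checks_total: int
    lineage: tuple              # ordered transformation history

    def is_fresh(self, today: date, max_age_days: int = 7) -> bool:
        """True if the table was updated within the allowed window."""
        return (today - self.last_updated).days <= max_age_days

    def quality_score(self) -> float:
        """Share of quality checks passed; 0.0 when no checks exist."""
        if self.quality_checks_total == 0:
            return 0.0
        return self.quality_checks_passed / self.quality_checks_total
```

A label built with `last_updated=date(2025, 1, 6)` counts as fresh on January 10 but not on January 20 under the default window, so a recipe could check the label before it reaches for the ingredient.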
With Dataplex as our pantry organizer, our data kitchen is a model of efficiency and order. We can easily find, understand, and manage our data assets, ensuring our ETL pipeline delivers delicious insights to our stakeholders.
Helpful Resources
Sweet & Sour Candy (this week’s good, bad, or weird of the tech world)
Mark Zuckerberg gave Meta's Llama team the OK to train on copyrighted works, filing claims | TechCrunch - Plaintiffs in a copyright lawsuit against Meta allege that CEO Mark Zuckerberg approved the use of a known pirated dataset, LibGen, to train the company's Llama AI models, despite internal concerns about copyright infringement. Meta is accused of stripping copyright information from the data and using torrenting to access and distribute the pirated works, further concealing their alleged infringement.
A foundation model of transcription across human cell types - Nature - Researchers have developed a new model called GET (general expression transformer) that accurately predicts gene expression across 213 human cell types using only chromatin accessibility data and sequence information. GET's adaptability across different sequencing platforms and assays enables the discovery of universal and cell-type-specific transcription factor interaction networks and facilitates regulatory inference across a broad range of cell types and conditions.
One last bite
"You will do foolish things, but do them with enthusiasm." ~Sidonie-Gabrielle Colette
Thank you for reading Data Bytes. This post is public, so feel free to share it.
Thanks for reading Data Bytes! Subscribe for free to receive new posts and support my work.