Side Project - Staging view: Cheaper is Better
Scheduled
Here we are again, trying to save every possible Argentinian peso.
Last week I described how I had used a Durable Azure Function with a timer trigger to move data out of the "bronze" layer (a.k.a. a bunch of JSON files generated every 30 seconds). Once a day, it would load the ~0.14 GB into a staging layer within the data lake in a more compact format. This move was convenient for two reasons:
Using a Raspberry Pi directly for compute involves much the same considerations as using a VM. So, to put it to work, I run the job in a Podman container and use cron to schedule the tasks.
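To make the idea concrete, here is a minimal sketch of what a daily consolidation job of this kind could look like. It is not the code running on my Pi: the function name `consolidate_day`, the folder layout, and the choice of gzip-compressed JSON Lines as the "more compact format" are all assumptions made for illustration.

```python
import gzip
import json
from pathlib import Path

# Hypothetical crontab entry (image name and paths are illustrative only):
#   0 3 * * * podman run --rm -v /data:/data staging-job


def consolidate_day(bronze_dir: Path, staging_file: Path) -> int:
    """Merge every *.json file in bronze_dir into one gzip-compressed
    JSON Lines file and return the number of records written.

    Assumes each bronze file holds exactly one JSON document.
    """
    staging_file.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with gzip.open(staging_file, "wt", encoding="utf-8") as out:
        # Sort by file name so the output order is deterministic.
        for path in sorted(bronze_dir.glob("*.json")):
            record = json.loads(path.read_text(encoding="utf-8"))
            # Compact separators drop the whitespace that pretty-printed JSON carries.
            out.write(json.dumps(record, separators=(",", ":")) + "\n")
            count += 1
    return count
```

Calling `consolidate_day(Path("bronze/2024-01-01"), Path("staging/2024-01-01.jsonl.gz"))` would then turn one day's worth of small files into a single compressed file. A columnar format such as Parquet would likely compress better, but it needs a third-party library.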
As a result, while there wasn't an improvement in execution times or a significant reduction in costs, we made better use of the compute resources we already had available and had more fun tinkering with the Raspberry Pi (which is always fun).
Now, with the staging layer consolidating at a rate of 0.4 MB per day, we can continue modeling on top of it with the goal of actually putting the data to use, rather than just watching it grow day by day.
Previous posts: