DATA Pill #068 - Amazon S3, Athena & AWS Glue + Iceberg, ClickHouse + DuckDB = OLAP2
Hi,
This week, the internet again flooded us with a lot of tutorials and a ton of hot news on top.
Llama, AWS, announcements made by Google and much more in this week’s DATA Pill.
Are you ready?
ARTICLES
Zero Configuration Service Mesh with On-Demand Cluster Discovery | 9 min | Cloud | David Vroom, James Mulcahy, Ling Yuan, Rob Gulewich | Netflix TechBlog
How Netflix worked with Kinvolk and the Envoy community on on-demand cluster discovery - a feature that streamlines service mesh adoption in complex microservice environments.
Less data, less problems: Airbyte’s column selection is finally here | 14 min | dbt | Jakub Szafran | GetInData | Part of Xebia Blog
Airbyte 0.50 introduces platform changes, including checkpointing, automatic schema propagation and highly anticipated column selection. To address community demand, the GetInData team conducted tests on this feature, exploring issues such as column extraction and CDC incremental ingestion handling. Find detailed insights in this blog post.
TUTORIALS
ClickHouse + DuckDB = OLAP2 | 4 min | BigData | Lorenzo Mangani | qryn dev
Explore the seamless integration of ClickHouse and DuckDB in the OLAP ecosystem through the innovative tool Quackpipe. This tutorial demonstrates how Quackpipe enables effortless data exchange between these two platforms, offering both installation guidance and exciting use cases, highlighting the collaborative power of ClickHouse and DuckDB for data analytics and manipulation.
AWS users: Amazon S3, Athena & AWS Glue + Iceberg | 15 min | Data Engineering | Anna Geller | AWS in Plain English
This tutorial walks you through getting started with Apache Iceberg on AWS. After reading it, you will be able to create Iceberg tables, work with data stored in S3 in Parquet format, run SQL queries on data and table details, and manage data ingestion efficiently.
In MORE LINKS you will find articles on using MLflow AI Gateway and Llama 2 to build generative AI apps and on high-performance computing on AWS.
NEWS
OpenTF Announces Fork of Terraform | 5 min | Cloud | OpenTF Blog
HashiCorp changed the license for their core products, including Terraform, to BSL. In response, the community crafted the OpenTF manifesto, garnering support from 100+ companies, 10 projects and 400 individuals to create OpenTF.
Introducing Code Llama, a state-of-the-art large language model for coding | 6 min | LLM | Meta AI Engineering
Let’s explore the capabilities and implications of Code Llama, a Large Language Model designed to revolutionize coding practices. Code Llama is an LLM capable of generating code, and natural language about code, from both code and natural language prompts. In benchmark testing, Code Llama outperformed state-of-the-art publicly available LLMs on code tasks. Let's find out more.
In MORE LINKS you will find news on supercharging Vertex AI with Colab Enterprise and on MLOps for generative AI.
DATA TUBE
Achieving success with automation in enterprise architecture in big size digital transformation | 58 min | Data Architecture | Garima Singh | Iasa Official
The talk focuses on Garima’s experience and journey in executing company-wide digital transformation in large, decentralized and globally distributed enterprises, with the help of automated versions of enterprise architecture.
PODCAST
How Azure Embraces Terraform For Infrastructure As Code | 46 min | Cloud | Hosts: Ned Bellavance, Ethan Banks; Guests: Mark Gray, Steven Ma | Day Two Cloud Podcast
Delve into the world of Infrastructure as Code (IaC) with Microsoft's Mark Gray and Steven Ma. Discover how Microsoft is embracing Terraform to enhance its Azure offerings, including the Terraform Export Tool, the AzAPI Provider and the thriving Terraform on Azure community. Explore the collaboration between Microsoft and HashiCorp, learn about the tools' capabilities and gain insights into the future of Terraform on Azure.
In MORE LINKS you will find an episode about navigating event streaming.
CONFS EVENTS AND MEETUPS
Build a Modern Data Stack with dbt and Databricks | Online | 26th September 2023
In this live hands-on workshop, you’ll follow a step-by-step guide to achieving production-grade data transformation using dbt Cloud with Databricks. You’ll build a scalable transformation pipeline for analytics, BI and ML – entirely from scratch.
You’ll learn how to:
________________________
Have any interesting content to share in the DATA Pill newsletter?
Join us on GitHub
Dig into previous editions of DATA Pill
Adam from GetInData | Part of Xebia