Christopher Bergh's activity
-
You can (and should) start your DataOps / data quality improvements without buying any software. 4 Easy Ways to Start DataOps Today: https://bit.ly/3KxCI16 #dataops
-
If you want to have data that your business can run on and trust, understanding and overcoming data import challenges is critical. Here are the six most common data file import challenges that businesses face and what you can do to resolve them: https://buff.ly/3vUxngc #developers #dev #datamanagement #developertools #csv
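As a rough companion to the linked article (which isn't quoted here), below is a minimal Python sketch of a defensive CSV import that handles three of the most common failure points: unknown encoding, unknown delimiter, and missing columns. The EXPECTED_COLUMNS schema and the error messages are hypothetical, purely for illustration.

```python
import csv
import io

# Hypothetical expected schema, used only for this illustration.
EXPECTED_COLUMNS = {"id", "name", "email"}

def load_csv(path: str) -> list[dict]:
    # Challenge 1: unknown encoding -- try a few common ones before giving up.
    for encoding in ("utf-8-sig", "utf-8", "cp1252"):
        try:
            with open(path, encoding=encoding) as f:
                raw = f.read()
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError(f"{path}: could not decode with common encodings")

    # Challenge 2: unknown delimiter -- let csv.Sniffer guess from a sample.
    dialect = csv.Sniffer().sniff(raw[:4096], delimiters=",;\t|")
    reader = csv.DictReader(io.StringIO(raw), dialect=dialect)

    # Challenge 3: schema drift -- fail fast with an error message that says
    # exactly what went wrong, instead of silently loading bad rows.
    missing = EXPECTED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"{path}: missing columns {sorted(missing)}")
    return list(reader)
```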
-
Jerry, a US tech company, tackled the challenges of #EndToEndTesting in #DataManagement by using #JuiceFS #snapshots with #ClickHouse #DatabaseCloning. This innovative approach ensures data accuracy, reduces issues, and promotes quality assurance across data systems. Learn more: https://lnkd.in/gf2wUEuK #DistributedFileSystem #DistributedStorage #DataStorage #CloudFileSystem #SoftwareRelease #FileStorageSystem
-
Dual vs. single write

A tale of dual writes and (maybe) broken data pipelines.

Last year I went to a well-known brand shop for the seasonal car tyre change. I was welcomed by the crew and, after the tyre-swap appointment was confirmed, I handed over the key. The person at the counter told me I would receive a call once the car was ready. While walking out, I was pleased to immediately receive a message on my phone about my car being "ready for service" <- +1 for proactive notification.

I then went to a nearby coffee shop to sip an espresso and work while waiting for the car. I waited 30 minutes, 45 minutes, an hour, an hour and a half, without any call or further communication. When I walked back to the desk to ask for an update, I was told that, despite my receiving the initial notification, the car had never been set to the proper status for the back office to pick up the work. They were relying on a human to manually update the status BOTH in the notification system AND in a back-office system. The latter never happened, so my car was never serviced.

This might be a silly example, and I was lucky enough to find another appointment a couple of days later, but it showcases the problem with dual writes. Whether it's a human or some code in the middle, dual writes put you at risk of inconsistencies: anything can happen between the first write (the notification system) and the second one (the back office), leaving some information propagated to some systems but not to all of them.

Nowadays this is a solvable problem: write only once, to a backend offering ACID guarantees (like #PostgreSQL), and use reliable change data capture technology to propagate the change to downstream systems (like #Debezium, #ApacheKafka and #ApacheFlink). If your data needs to go to multiple locations, write it once and build a robust data pipeline that moves the data from where it's stored to where it needs to land.

My day, after the lovely espresso, wasn't great. By avoiding dual writes and building solid data pipelines you can provide a delightful user experience and let your customers fully enjoy their cup of coffee (please, no cappuccino after 11).
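As a rough sketch of the "write once" half of that advice (hypothetical table and names, with SQLite standing in for an ACID backend such as PostgreSQL): the application makes exactly one transactional write, and fanning the change out to the notification system and the back office would be left to a CDC pipeline such as Debezium reading the database log and streaming through Kafka/Flink.

```python
import sqlite3  # SQLite stands in here for an ACID store such as PostgreSQL

# Hypothetical schema for the tyre-shop story. The work_orders table is
# the single place the status is written; in production a CDC tool such
# as Debezium would read the change from the database log and stream it
# (via Kafka/Flink) to the notification system and the back office, so
# no second manual write ever exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_orders (car_plate TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO work_orders VALUES ('AB-123-CD', 'checked_in')")
conn.commit()

def mark_ready_for_service(plate: str) -> None:
    # The dual-write version would call the notification API here AND
    # separately update the back office: two writes that can diverge.
    # Instead we make exactly one atomic write; everything downstream is
    # derived from the captured change.
    with conn:  # single transaction: commit on success, rollback on error
        conn.execute(
            "UPDATE work_orders SET status = 'ready_for_service' WHERE car_plate = ?",
            (plate,),
        )

mark_ready_for_service("AB-123-CD")
```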
-
Enterprises needing specialized software that provides an environment for developing and delivering data as a product should check out #DataOps.live, says analyst Matt Aslett. Learn more: https://bit.ly/3QEDGMH @DataOpslive #DataIntelligence #SoftwarePlatform
-
Getting client data onboarded can be a time-consuming and frustrating process. If:
- You're dealing with varied files from multiple clients
- You don't want to ask your clients to reformat their data to fit your specs
- You're having to rely on a developer or engineering team to do the work, and it's slowing down the process
- It's really hard to figure out exactly what went wrong when errors occur
Then a data ingestion tool could help. What are the key things you should be looking for in a potential new platform? We put together this list of features that could help you streamline your process and reduce manual work, so you can get data ingested faster and more reliably. Data ingestion tools: 7 features you should look for: https://lnkd.in/eqywk8TP
-
Our latest blog post breaks down Transactional Databases, an essential concept for anyone working with data-driven applications. Read the full blog here: https://lnkd.in/eg2gZAJQ Don't forget to like, comment, and share your thoughts. Let's dive into the world of transactional databases together! #DataManagement #TransactionalDatabases #ACIDProperties #DatabaseSystems #TechBlog #DataDriven
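For a taste of what "transactional" buys you, here is a tiny atomicity sketch (hypothetical accounts table and transfer function, not taken from the linked post), using SQLite as a stand-in for any ACID-compliant database.

```python
import sqlite3

# Hypothetical two-account ledger used only to illustrate atomicity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(src: str, dst: str, amount: int) -> None:
    # Both updates happen in one transaction: if anything fails in
    # between, the whole transfer rolls back and no money disappears.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        # Enforce an invariant; raising here aborts the whole transaction.
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")

transfer("alice", "bob", 40)       # commits
try:
    transfer("alice", "bob", 999)  # rolls back: alice would go negative
except ValueError:
    pass
print(conn.execute("SELECT * FROM accounts ORDER BY name").fetchall())
# [('alice', 60), ('bob', 40)]
```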
-
An upsert is a database operation that will update an existing row if a specified value already exists in a table, or insert a new row if the specified value doesn't already exist. Upserts are particularly useful for two key reasons:
1. Data consistency. Upserts ensure that data remains consistent by either updating existing records or adding new ones.
2. Efficiency. Upserts combine the insert and update operations into a single action, reducing the number of database queries and improving performance.
Our new blog discusses how we support upserts in Imply Polaris for a variety of use cases: https://bit.ly/3LhQZ2A
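As a generic illustration, separate from Imply Polaris, here is a minimal upsert sketch in Python using SQLite's INSERT ... ON CONFLICT ... DO UPDATE syntax (PostgreSQL supports the same clause); the table and columns are hypothetical.

```python
import sqlite3

# Hypothetical users table keyed by id; the key is what ON CONFLICT watches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, visits INTEGER)")

def upsert_user(user_id: int, name: str) -> None:
    # One statement handles both cases: insert the row if the id is new,
    # otherwise update the existing row in place. "excluded" refers to the
    # values the INSERT would have written.
    conn.execute(
        """
        INSERT INTO users (id, name, visits) VALUES (?, ?, 1)
        ON CONFLICT(id) DO UPDATE SET
            name   = excluded.name,
            visits = visits + 1
        """,
        (user_id, name),
    )

upsert_user(1, "Ada")   # inserts a new row
upsert_user(1, "Ada")   # updates the same row instead of failing
print(conn.execute("SELECT * FROM users").fetchall())  # [(1, 'Ada', 2)]
```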