Arroyo

Software Development

Berkeley, CA · 889 followers

Serverless stream processing with SQL

About us

Arroyo is bringing real-time data to every company with the Arroyo Streaming Engine

Website: https://www.arroyo.dev
Industry: Software Development
Company size: 2-10 employees
Headquarters: Berkeley, CA
Type: Privately held
Founded: 2022

Locations

Primary

Berkeley, CA, US

Updates

Arroyo

889 followers
1 day ago
Arroyo 0.14.0 is now available, with some great new features, improvements, and fixes, including:
- Lookup joins
- Nested updating aggregates
- Struct types
- Streaming SQL syntax
- Sink shuffles
Thanks to all of our contributors for this release, and especially Ratul D. and Nathan Lapierre, who had their first contributions in this release! See the full release notes on the Arroyo blog: https://lnkd.in/dyFtxtf4

Announcing Arroyo 0.14.0

arroyo.dev

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
3 weeks ago
Arroyo just crossed 4,000 GitHub stars! Amazing to see the community come together over the past two years to build a better stream processor. Thanks to all of our users and contributors for helping us reach this milestone!
7 comments

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
4 weeks ago
The sad, dark truth of the data world is that, for all of our fancy algorithms and systems and careful performance engineering, a majority of CPU time might just go to... decoding JSON. In some other slice of the multiverse data teams have all moved to efficient formats like Avro and Protobuf, but our fallen world still runs on a sort-of-specified data serialization format extracted from a frontend programming language, itself famously created in 10 days. So if we're going to have to read JSON, we might as well do it quickly. And if you've ever been curious how we do this at Arroyo, have I got the incredibly long and in-depth explanation for you!

Fast columnar JSON decoding with arrow-rs

arroyo.dev

8 comments

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
2 months ago · Edited
As Arroyo has grown, our internal analytics needs have outpaced our initial, ad-hoc data infra. When it came time to rebuild it, we turned to the best technologies of the modern data stack: an object-storage based data lake queried by DuckDB. And of course, Arroyo itself to provide near-real-time ingestion.
We're so happy with how this turned out, I thought it would be worth documenting for other folks looking to build an easy, cheap, near-real-time analytics system. We're calling this approach the LOAD stack, for log storage/object storage/Arroyo/DuckDB.
In our deployment, we combine several managed and open-source tools to provide sub-minute access to data at a small fraction of the cost of fully-managed solutions like Databricks or Snowflake:
- AWS Lambda to get events in
- Redpanda Data Serverless to store them for processing
- Arroyo for efficient and fault-tolerant ingestion
- S3 for long-term storage
- DuckDB via AWS SageMaker Notebook for analysis
At Lyft, it took an entire team two+ years to build out a comparable stack; with modern tooling, I think an individual can build something comparable in a few hours today. Find the full walkthrough in our writeup:

Building a near-real-time data lake with the LOAD stack

arroyo.dev

4 comments

Arroyo

889 followers
3 months ago · Edited
The Arroyo team is excited to close out the year with the release of Arroyo 0.13! This is our fifth release of the year, and caps an incredible 12 months for the project and community. New features include:
- Source metadata support
- RabbitMQ streams connector
- Atomic updating outputs
- IAM auth for Kafka
- Operator chaining
We are especially thrilled that this release includes work from four new contributors to the project. Huge thanks to everybody who contributed:
- Harshit P.
- Xin Hao (@haoxins)
- Tiago Campos
- Matt Forbes
- Vipul Vaibhaw
- Erle Carrara
- Micah Wylde
See all of the details on our blog, and try it out with:
$ brew install arroyosystems/tap/arroyo

Announcing Arroyo 0.13.0

arroyo.dev

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
4 months ago · Edited
If you missed #p99conf last week, talks are now available to stream on YouTube. I spoke about the design decisions that went into Arroyo's incredible performance: https://lnkd.in/g8-rrGWR. Come for the Rust hot takes, stay for my terrible hand-drawn architecture diagrams.

P99 CONF 2024 | Latency, Throughput & Fault Tolerance: Arroyo Streaming Engine by Micah Wylde

https://www.youtube.com/

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
5 months ago · Edited
We've been able to build a great open source community around Arroyo, with outside contributors adding major features and improvements, even though it's a streaming SQL engine, a piece of deep infrastructure with a high barrier to entry. Building a real community is something lots of projects struggle with. How did we do it?
- Starting with a friendly community meeting place where new contributors can meet the team, ask questions, and find mentorship (for us this is Discord).
- Doing the work of creating (and tagging) issues specifically for new contributors. This takes a lot of effort! They need to be well-documented, with enough context for someone to pick up cold.
- Cleaving off a part of the codebase that's mostly disconnected, with clean integration points to the rest of the system. For us this is our connectors subproject, which contains code to connect Arroyo with other systems. We've had multiple big contributions here, including NATS and MQTT connectors.
- Providing efficient PR reviews and actively helping users get their changes merged. Nothing kills motivation like waiting 2 months for a review.
This all takes work and time, but we've found it incredibly worthwhile. (And if you've ever been interested in contributing to an open source data infra project, get in touch!)

3 comments

Arroyo

889 followers
6 months ago
The Arroyo team is thrilled to announce that Arroyo 0.12.0 is now available! This release introduces Python UDFs, which allow developers to extend the engine with custom functions. Also new in this release:
- Support for Protobuf as an ingestion format
- Much faster JSON functions and new PG-inspired JSON syntax
- Custom TTLs for updating state
- AWS IRSA support
along with many other improvements and fixes.
This release wouldn't have been possible without all of our amazing contributors, including several new to the project:
- Xin Hao (@haoxins)
- Jayshan Raghunandan (@jr200) (new!)
- Marco Lugo (new!)
- Micah Wylde
- Tiago Campos (new!)
- ZhuLiquan (@zhuliquan) (new!)
With Python support, we're excited to bring powerful stream processing to a whole new set of developers. We can't wait to see what you build! https://lnkd.in/g-dEyBqh

Announcing Arroyo 0.12.0

arroyo.dev

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
6 months ago · Edited
Excited for the SF DataFusion meetup next Wednesday! I'll be giving a talk about how Arroyo implements dynamically-loaded UDFs. Because Rust lacks a stable ABI, this is harder than it sounds: different compiler versions or even changes to flags can break code loading. But we don't want to recompile our entire engine just to use a UDF. This gets even harder if we're trying to use async across a UDF boundary (which Arroyo has to support to enable things like HTTP calls, database lookups, and model inference in UDFs). How do we do it? You'll have to come to the meetup to find out. But I'll give you a hint: it involves C. See you there!

SF DataFusion meetup - September 2024 · Luma

lu.ma

5 comments

Arroyo reposted this
Micah Wylde

Founder, Arroyo (YC W23)
6 months ago · Edited
Arroyo is coming to Current 2024! Excited to see everyone in Austin next week.
3 comments

Funding

Arroyo: 1 total round

Last round

Pre-seed · May 5, 2023

US$500,000.00

Investors

Y Combinator

Employees at Arroyo

Micah Wylde

Founder, Arroyo (YC W23)

Jackson Newhouse

Founder

Similar pages

Miro Insights (formerly Cardinal)

Berry (YC W23)

ParadeDB

Finic (YC W23)

OfOne (YC W23)

Loula (YC W23)

Invopop (YC W23)

ShortLoop (YC W23)

Gluetrail (YC W23)

Lume (YC W23)
