Arroyo

Software Development

Berkeley, CA · 889 followers

Serverless stream processing with SQL

About us

Arroyo is bringing real-time data to every company with the Arroyo Streaming Engine

Website
https://www.arroyo.dev
Industry
Software Development
Company size
2-10 employees
Headquarters
Berkeley, CA
Type
Privately held
Founded
2022

Updates

  • Arroyo

    Arroyo 0.14.0 is now available, with some great new features, improvements, and fixes, including:

    * Lookup joins
    * Nested updating aggregates
    * Struct types
    * Streaming SQL syntax
    * Sink shuffles

    Thanks to all of our contributors for this release, and especially Ratul D. and Nathan Lapierre, who had their first contributions in this release! See the full release notes on the Arroyo blog: https://lnkd.in/dyFtxtf4

  • Arroyo reposted

    Micah Wylde

    Founder, Arroyo (YC W23)

    The sad, dark truth of the data world is that, for all of our fancy algorithms and systems and careful performance engineering, a majority of CPU time might just go to... decoding JSON. In some other slice of the multiverse, data teams have all moved to efficient formats like Avro and Protobuf, but our fallen world still runs on a sort-of-specified data serialization format extracted from a frontend programming language, itself famously created in 10 days. So if we're going to have to read JSON, we might as well do it quickly. And if you've ever been curious how we do this at Arroyo, have I got the incredibly long and in-depth explanation for you!

  • Arroyo reposted

    Micah Wylde

    Founder, Arroyo (YC W23)

    As Arroyo has grown, our internal analytics needs have outpaced our initial, ad-hoc data infra. When it came time to rebuild it, we turned to the best technologies of the modern data stack: an object-storage-based data lake queried by DuckDB. And of course, Arroyo itself to provide near-real-time ingestion.

    We're so happy with how this turned out that I thought it would be worth documenting for other folks looking to build an easy, cheap, near-real-time analytics system. We're calling this approach the LOAD stack, for log storage/object storage/Arroyo/DuckDB.

    In our deployment, we combine several managed and open-source tools to provide sub-minute access to data at a small fraction of the cost of fully managed solutions like Databricks or Snowflake:

    * AWS Lambda to get events in
    * Redpanda Data Serverless to store them for processing
    * Arroyo for efficient and fault-tolerant ingestion
    * S3 for long-term storage
    * DuckDB via AWS SageMaker Notebook for analysis

    At Lyft, it took an entire team two+ years to build out a comparable stack; with modern tooling, I think an individual can build something comparable in a few hours today. Find the full walkthrough in our writeup:

  • Arroyo

    The Arroyo team is excited to close out the year with the release of Arroyo 0.13! This is our fifth release of the year, and caps an incredible 12 months for the project and community. New features include:

    * Source metadata support
    * RabbitMQ streams connector
    * Atomic updating outputs
    * IAM auth for Kafka
    * Operator chaining

    We are especially thrilled that this release includes work from four new contributors to the project. Huge thanks to everybody who contributed:

    * Harshit P.
    * Xin Hao (@haoxins)
    * Tiago Campos
    * Matt Forbes
    * Vipul Vaibhaw
    * Erle Carrara
    * Micah Wylde

    See all of the details on our blog, and try it out with

    $ brew install arroyosystems/tap/arroyo

  • Arroyo reposted

    Micah Wylde

    Founder, Arroyo (YC W23)

    If you missed #p99conf last week, talks are now available to stream on YouTube. I spoke about the design decisions that went into Arroyo's incredible performance: https://lnkd.in/g8-rrGWR. Come for the Rust hot takes, stay for my terrible hand-drawn architecture diagrams.

  • Arroyo reposted

    Micah Wylde

    Founder, Arroyo (YC W23)

    We've been able to build a great open source community around Arroyo, with outside contributors adding major features and improvements, even though it's a streaming SQL engine, a piece of deep infrastructure with a high barrier to entry. Building a real community is something lots of projects struggle with. How did we do it?

    * Starting with a friendly community meeting place where new contributors can meet the team, ask questions, and find mentorship (for us this is Discord).
    * Doing the work of creating (and tagging) issues specifically for new contributors. This takes a lot of effort! They need to be well-documented, with enough context for someone to pick up cold.
    * Cleaving off a part of the codebase that's mostly disconnected, with clean integration points to the rest of the system. For us this is our connectors subproject, which contains code to connect Arroyo with other systems. We've had multiple big contributions here, including NATS and MQTT connectors.
    * Providing efficient PR reviews and actively helping users get their changes merged. Nothing kills motivation like waiting 2 months for a review.

    This all takes work and time, but we've found it incredibly worthwhile. (And if you've ever been interested in contributing to an open source data infra project, get in touch!)

  • Arroyo

    The Arroyo team is thrilled to announce that Arroyo 0.12.0 is now available! This release introduces Python UDFs, which allow developers to extend the engine with custom functions. Also new in this release:

    * Support for Protobuf as an ingestion format
    * Much faster JSON functions and new PG-inspired JSON syntax
    * Custom TTLs for updating state
    * AWS IRSA support

    along with many other improvements and fixes.

    This release wouldn't have been possible without all of our amazing contributors, including several new to the project:

    * Xin Hao (@haoxins)
    * Jayshan Raghunandan (@jr200) (new!)
    * Marco Lugo (new!)
    * Micah Wylde
    * Tiago Campos (new!)
    * ZhuLiquan (@zhuliquan) (new!)

    With Python support, we're excited to bring powerful stream processing to a whole new set of developers. We can't wait to see what you build! https://lnkd.in/g-dEyBqh

  • Arroyo reposted

    Micah Wylde

    Founder, Arroyo (YC W23)

    Excited for the SF DataFusion meetup next Wednesday! I'll be giving a talk about how Arroyo implements dynamically loaded UDFs. Because Rust lacks a stable ABI, this is harder than it sounds: different compiler versions or even changes to flags can break code loading. But we don't want to recompile our entire engine just to use a UDF. This gets even harder if we're trying to use async across a UDF boundary (which Arroyo has to support to enable things like HTTP calls, database lookups, and model inference in UDFs). How do we do it? You'll have to come to the meetup to find out. But I'll give you a hint: it involves C. See you there!

Funding

Arroyo: 1 total round

Last round

Pre-seed

US$500,000.00

Investors

Y Combinator