LETSQL

Software Development

New York, New York · 96 followers

Build portable, multi-engine data pipelines for ML

About us

LETSQL is a data processing library with a portable, multi-engine runtime. It focuses on composability, with first-class support for UDFs, and enables declarative, performant multi-engine pipelines in Python. We harness the expressiveness of Python to empower data professionals across platforms.

Core Values:
1. Movement: We believe movement breaks barriers, both in people and technology. We are always moving forward, learning, and growing.
2. Rigor: Our dedication to thoroughness and precision ensures that every task is executed with utmost care and excellence.
3. Responsibility: We hold ourselves accountable for the impact of our actions on our community, our environment, and our stakeholders.
4. Curiosity: Our drive to question, explore, and learn fuels our creativity and leads to groundbreaking solutions.
5. Compassion: In all our interactions, we approach with empathy, understanding, and a deep desire to positively impact the lives of others.
6. Transparency: We believe in open communication and clarity in our processes, fostering trust and integrity in all our relationships.

Website: https://www.letsql.com
Industry: Software Development
Company size: 2-10 employees
Headquarters: New York, New York
Type: Privately held
Founded: 2024
Specialties: Machine Learning, SQL, Preprocessing, and Rust

Locations

Primary

169 Madison Ave

STE 11445

New York, New York 10016, US

Employees at LETSQL

Dan Lovell

ML Pipeline Carpenter
Hussain S.

Building data pipelines
Morris Clay

Partner at Lunar. Backing data, AI, and cloud infra founders at the earliest stages.
Daniel Mesejo

Software Engineer at LETSQL

Updates

LETSQL reposted this

Hussain S.

Building data pipelines
2 months ago
Ever wondered how to turbocharge your image processing workflows? We've cracked the code, and the results are awesome. In our post, we reveal:

- How to integrate cutting-edge AI (like Meta's SAM) with DataFusion, an unbundled, Arrow-native database.
- A secret weapon that made our image segmentation 4x faster using Hugging Face's Candle library.
- A novel approach using LETSQL and the Rust-based ML ecosystem.

Sneak peek: We're using LETSQL and Rust to revolutionize how we handle unstructured data. It's a game-changer for healthcare, manufacturing, and more!

Ready to give it a try? Simply drop into an interactive IPython shell by running `nix run github:letsql/letsql`, or install the package with `pip install letsql`.

Check out the full post here: https://lnkd.in/egaeijGa

Drop an emoji if you're ready to supercharge your data pipelines!

#MachineLearning #DataEngineering #DataFusion #Rust

LETSQL - Making Deep Learning Workflows, Relational

letsql.com

2 comments

LETSQL reposted this

Hussain S.

Building data pipelines
3 months ago
Caching++

I'm excited to share a new caching feature that we've been working on in letsql. This feature allows you to cache the results of compute expressions (`ibis.expr`), whether in the upstream engine or on local disk, making your iterations faster and more efficient.

Why does this matter?
As data scientists, we often have to repeatedly pull data from its source or re-compute results during the iterative process of building machine learning models. This not only slows down our workflow but also puts unnecessary strain on source systems and network bandwidth. Our new caching feature addresses this by providing a seamless way to cache data, significantly speeding up your iteration cycles while reducing cognitive load. No more manual processes, no more disjointed workflows, just pure efficiency.

How does it work?
With LETSQL, you can easily cache sub-expressions representing upstream queries, i.e. `ibis.expr`. The data is then stored either in the upstream engine or locally, depending on your needs. The best part? This feature is designed to work with multiple engines, giving you the flexibility to choose the best storage option for your workflow.

Ready to give it a try? Simply drop into an interactive IPython shell by running `nix run github:letsql/letsql`.

We're still in the beta phase, and your feedback is incredibly valuable as we refine this feature and work towards a stable API. Let us know what you think!

Learn more about the feature in our latest blog post: https://lnkd.in/eGpxwUaU

#DataScience #MachineLearning #Caching #DataFrames
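The idea of caching an upstream query on local disk can be sketched with the standard library alone. This is a minimal illustration, not LETSQL's API: `expensive_query`, `cached`, and `CACHE_DIR` are hypothetical names, and the cache key is assumed to be a hash of the query text.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

# Hypothetical local cache directory; a real setup would choose a persistent path.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="expr_cache_"))

calls = {"count": 0}  # counts how often the "upstream" work actually runs


def expensive_query(sql: str) -> list:
    """Stand-in for pulling data from a remote engine."""
    calls["count"] += 1
    return [len(sql), sql.upper()]


def cached(sql: str) -> list:
    """Return the stored result for this query text, computing it only once."""
    key = hashlib.sha256(sql.encode("utf-8")).hexdigest()
    path = CACHE_DIR / f"{key}.pkl"
    if path.exists():
        return pickle.loads(path.read_bytes())
    result = expensive_query(sql)
    path.write_bytes(pickle.dumps(result))
    return result


first = cached("select * from events")
second = cached("select * from events")  # served from disk, no recompute
```

Keying on the query text means any change to the expression produces a new cache entry, while repeated runs of the same expression skip the upstream call.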

LETSQL - Caching++ for DataFrames

letsql.com

1 comment

LETSQL

96 followers
4 months ago
We did it! We're thrilled that our GitHub repository, letsql, has reached its first 10 stars!

A huge thank you to our amazing community for the support and encouragement. This milestone is just the beginning, and we couldn't have done it without you. Stay tuned for more updates and features as we continue to grow and improve.

If you haven't already, check out our repo and join us on this exciting journey! https://lnkd.in/eeQkmAYa

#GitHub #OpenSource #Milestone #LETSQL
6 comments

LETSQL

96 followers
4 months ago (edited)
Harlequin DataFusion Adapter!

Hey, y'all! We've just rolled out the Harlequin DataFusion Adapter and put together a tutorial on how you can build one for your backend using Poetry.

Here's What's New: Smooth Integration: Harlequin now works seamlessly with Apache DataFusion, boosting flexibility and performance.

Who Should Check This Out? Data scientists, engineers, and anyone dealing with data: this one's for you!

Why It Matters: Harlequin is a TUI IDE for SQL. It's simple and looks beautiful.

Tech Details: Get into the nitty-gritty of how to set it up and what makes it a crucial addition to your data toolkit.

Read More: https://lnkd.in/efACgUSi

Join the Conversation: Drop a comment with your thoughts or how you handle your data workflows. Let's share ideas and solutions!

Take Action: Are you ready to upgrade your data transformations? Head over to LETSQL and stay tuned for more updates. Like, share, and follow for the latest news!

#DataScience #DataFusion #Harlequin #DataEngineering #AI

LETSQL - Creating Harlequin Adapter for Apache DataFusion

letsql.com

1 comment

LETSQL

96 followers
5 months ago

Hussain S.

Building data pipelines
6 months ago

Ever grappled with juggling multiple data processing engines? Meet Ibis, the hidden gem of PyData that's about to revolutionize your data workflow. This post unpacks how Ibis seamlessly integrates with various backends and presents a concept where a single expression can run across multiple engines. The primary benefits are:

1. Segment computation, by data size or cost, in-situ or serverless
2. Apply database-style optimizations to pipelines with ML or complex operators
3. Right-size infrastructure, based on consumption stages

Deep dive into the full post here: https://lnkd.in/ehNqCTwB
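To make "a single expression can run across multiple engines" concrete, here is a toy sketch. It does not use Ibis; `FilteredSum`, `run_on_sqlite`, and `run_in_memory` are hypothetical names, with SQLite and plain Python standing in for two engines that evaluate the same declarative expression.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class FilteredSum:
    """Toy declarative expression: SUM(column) over rows where column >= threshold."""
    table: str
    column: str
    threshold: int


def run_on_sqlite(expr: FilteredSum, con: sqlite3.Connection) -> int:
    # Identifiers are interpolated because this is a toy; the value is bound.
    sql = (f"SELECT COALESCE(SUM({expr.column}), 0) FROM {expr.table} "
           f"WHERE {expr.column} >= ?")
    return con.execute(sql, (expr.threshold,)).fetchone()[0]


def run_in_memory(expr: FilteredSum, tables: dict) -> int:
    rows = tables[expr.table]
    return sum(r[expr.column] for r in rows if r[expr.column] >= expr.threshold)


rows = [{"amount": 5}, {"amount": 12}, {"amount": 30}]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (amount INTEGER)")
con.executemany("INSERT INTO orders VALUES (?)", [(r["amount"],) for r in rows])

# One expression, two execution targets.
expr = FilteredSum(table="orders", column="amount", threshold=10)
sqlite_result = run_on_sqlite(expr, con)
memory_result = run_in_memory(expr, {"orders": rows})
```

Because the expression describes what to compute rather than how, the choice of engine can be made per stage, for example by data size or cost.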

Declarative Multi-Engine Data Stack with Ibis

letsql.com

LETSQL

96 followers
8 months ago
Pushing Down ML in SQL Engines - An Exploration!

In our latest installment of the LETSQL exploration series, Daniel Mesejo, our founding engineer, dives deep into the intersection of Machine Learning and SQL, presenting a research approach to enhancing XGBoost model inference through cross-domain optimization in SQL.

What's Inside:
- An in-depth demonstration of leveraging SQL's relational machinery for optimizing an end-to-end ML inference pipeline.
- Insightful examples on compiling XGBoost models into SQL, unlocking database-style optimizations such as predicate pushdowns, projection pushdowns, and constant folding.
- A case study using Microsoft's Length of Stay dataset to demonstrate the significant UX and performance improvements achievable.

Key Takeaways:
- Discover how to transform a simple XGBoost model into a powerful SQL query.
- Understand the impact of database-style optimizations on ML inference pipelines.
- Explore the potential of SQL and ML to work hand in hand, paving the way for innovative data science practices.

Read the full article here: https://lnkd.in/eA-p6Ei3
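Compiling a tree model into SQL can be illustrated with a single hand-written decision tree. This is a sketch under stated assumptions, not the article's implementation: the tree, its feature names, and the `patients` table are hypothetical, SQLite stands in for the SQL engine, and a real XGBoost model would combine the outputs of many such trees.

```python
import sqlite3

# Hypothetical tree: internal nodes are (feature, threshold, left, right); leaves are floats.
# Rows with feature < threshold take the left branch.
TREE = ("age", 50, ("num_visits", 3, 0.1, 0.4), 0.9)


def tree_to_sql(node) -> str:
    """Compile a tree into a nested SQL CASE expression."""
    if not isinstance(node, tuple):
        return repr(float(node))
    feature, threshold, left, right = node
    return (f"CASE WHEN {feature} < {threshold} "
            f"THEN {tree_to_sql(left)} ELSE {tree_to_sql(right)} END")


def tree_predict(node, row: dict) -> float:
    """Reference implementation: walk the tree in Python."""
    while isinstance(node, tuple):
        feature, threshold, left, right = node
        node = left if row[feature] < threshold else right
    return float(node)


rows = [
    {"age": 30, "num_visits": 1},
    {"age": 30, "num_visits": 5},
    {"age": 70, "num_visits": 2},
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE patients (age INTEGER, num_visits INTEGER)")
con.executemany("INSERT INTO patients VALUES (:age, :num_visits)", rows)

score_sql = tree_to_sql(TREE)
sql_scores = [r[0] for r in con.execute(f"SELECT {score_sql} FROM patients ORDER BY rowid")]
py_scores = [tree_predict(TREE, r) for r in rows]
```

Once the model is an ordinary SQL expression, the engine's optimizer can treat it like any other part of the query, which is what makes pushdowns and constant folding possible.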

Pushing down ML in SQL Engines: An Exploration

letsql.com

1 comment

LETSQL

96 followers
9 months ago
User Defined Aggregate Function (UDAF) Processing!

UDFs and UDAFs are a great way to use the optimization and database machinery as a performant, general-purpose computation platform. Dive into our latest blog, where we tackle two major challenges in UDAF processing:

1. Generic Building & Registering of UDFs: Learn how to seamlessly integrate UDFs across engines like Pandas and DataFusion.
2. Optimization Through UDF/UDAF Internals: Explore how making UDFs and UDAFs less of a black box can lead to relational plan optimizations.

Our insights are not just theoretical; we provide practical solutions and examples. Whether you're a data scientist or engineer, this guide is a must-read to elevate your data analysis strategies.

Read the full post here: https://lnkd.in/eN-dkfpT
Access our GitHub for more resources: https://lnkd.in/echN2Txz

Join us in pushing the boundaries of data processing!

#Python #pandas #datafusion
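The general shape of "training inside an aggregate" can be sketched with Python's built-in `sqlite3` module, which supports user-defined aggregates. This is only an illustration with SQLite standing in for DataFusion; the `LinearFit` class, the `linear_slope` function name, and the `points` table are assumptions made for the example.

```python
import sqlite3


class LinearFit:
    """Aggregate that accumulates sums and returns the least-squares slope of y on x."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def step(self, x, y):
        # Called once per row: update the running sums.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def finalize(self):
        # Called once per group: turn the sums into a fitted slope.
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            return None
        return (self.n * self.sxy - self.sx * self.sy) / denom


con = sqlite3.connect(":memory:")
con.create_aggregate("linear_slope", 2, LinearFit)
con.execute("CREATE TABLE points (grp TEXT, x REAL, y REAL)")
con.executemany(
    "INSERT INTO points VALUES (?, ?, ?)",
    [("a", 0, 1), ("a", 1, 3), ("a", 2, 5), ("b", 0, 0), ("b", 2, -2)],
)

# One model is fitted per group, entirely inside the SQL engine.
slopes = dict(con.execute("SELECT grp, linear_slope(x, y) FROM points GROUP BY grp"))
```

The engine handles grouping and row iteration, so the aggregate only has to define how state is updated and how the final model parameter is produced.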

Using DataFusion’s UDAFs to do ML Training: A LETSQL Exploration

letsql.com

LETSQL

96 followers
10 months ago
Exciting Post From LETSQL!

New Blog Alert: Discover our latest article on crafting an XGBoost scoring pipeline using DataFusion, complete with a GitHub repository.

Cutting-Edge Techniques: We're pushing boundaries by integrating UDF machinery in DataFusion for One-Hot preprocessing and XGBoost predictions. Plus, the inference pipeline is all SQL-friendly, making ML accessible to a broader community of analysts.

Micro-Benchmarks: Witness a 100x acceleration in preprocessing, outpacing traditional pandas methods.

LETSQL for ML: XGBoost Scoring Pipeline with DataFusion: https://lnkd.in/eXQ7DRfp

#MachineLearning #DataScience #XGBoost #DataFusion #SQL
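One-hot preprocessing through a scalar UDF can be sketched as follows. SQLite stands in for DataFusion here, and the `one_hot` function, the `items` table, and the category list are hypothetical; the sketch shows only the pattern of registering a UDF and calling it from SQL, not the benchmarked pipeline.

```python
import sqlite3

CATEGORIES = ["red", "green", "blue"]  # hypothetical category vocabulary

con = sqlite3.connect(":memory:")
# Scalar UDF: 1 when the value equals the category, else 0.
con.create_function("one_hot", 2, lambda value, category: int(value == category))
con.execute("CREATE TABLE items (color TEXT)")
con.executemany("INSERT INTO items VALUES (?)", [("red",), ("blue",), ("red",)])

# Generate one indicator column per category.
columns = ", ".join(f"one_hot(color, '{c}') AS color_{c}" for c in CATEGORIES)
encoded = con.execute(f"SELECT {columns} FROM items ORDER BY rowid").fetchall()
```

Each row comes back as one indicator per category, so the encoded features can feed a scoring step in the same query.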

Similar pages

Voltron Data

Composed Ventures

Vespa.ai

Bruin

Posit PBC

ClickHouse

Confluent

Wharf Labs

Full Spectrum Analytics

Klari