LETSQL

Software Development

New York, New York · 96 followers

Build portable, multi-engine data pipelines for ML

About us

LETSQL is a data processing library with a portable, multi-engine runtime. It focuses on composability, with first-class support for UDFs, and enables declarative, performant multi-engine pipelines in Python. We harness the expressiveness of Python to empower data professionals across platforms.

Core Values:

1. Movement: We believe movement breaks barriers, both in people and technology. We are always moving forward, learning, and growing.
2. Rigor: Our dedication to thoroughness and precision ensures that every task is executed with the utmost care and excellence.
3. Responsibility: We hold ourselves accountable for the impact of our actions on our community, our environment, and our stakeholders.
4. Curiosity: Our drive to question, explore, and learn fuels our creativity and leads to groundbreaking solutions.
5. Compassion: In all our interactions, we act with empathy, understanding, and a deep desire to positively impact the lives of others.
6. Transparency: We believe in open communication and clarity in our processes, fostering trust and integrity in all our relationships.

Website
https://www.letsql.com
Industry
Software Development
Company size
2-10 employees
Headquarters
New York, New York
Type
Privately Held
Founded
2024
Specialties
Machine Learning, SQL, Preprocessing, and Rust

Locations

  • Primary

    169 Madison Ave

    STE 11445

    New York, New York 10016, US

Updates

  • LETSQL reposted this

    Hussain S.

    Building data pipelines

    Ever wondered how to turbocharge your image processing workflows? We've cracked the code, and the results are awesome. In our post, we reveal:

    • How to integrate cutting-edge AI (like Meta's SAM) with DataFusion, an unbundled, Arrow-native database.
    • A secret weapon that made our image segmentation 4x faster: Hugging Face's Candle library.
    • A novel approach using LETSQL and the Rust-based ML ecosystem.

    Sneak peek: We're using LETSQL and Rust to revolutionize how we handle unstructured data. It's a game-changer for healthcare, manufacturing, and more!

    Ready to give it a try? Simply drop into an interactive IPython shell by running `nix run github:letsql/letsql`, or install the package with `pip install letsql`.

    Check out the full post here: https://lnkd.in/egaeijGa

    Leave a reaction if you're ready to supercharge your data pipelines!

    #MachineLearning #DataEngineering #DataFusion #Rust

    LETSQL - Making Deep Learning Workflows, Relational

    letsql.com

  • LETSQL reposted this

    Hussain S.

    Building data pipelines

    Caching++

    I'm excited to share a new caching feature that we've been working on in letsql. This feature allows you to cache the results of compute expressions (`ibis.expr`), whether in the upstream engine or on local disk, making your iterations faster and more efficient.

    Why does this matter?

    As data scientists, we often have to repeatedly pull data from its source or re-compute results during the iterative process of building machine learning models. This not only slows down our workflow but also puts unnecessary strain on source systems and network bandwidth. Our new caching feature addresses this by providing a seamless way to cache data, significantly speeding up your iteration cycles while reducing cognitive load. No more manual processes, no more disjointed workflows, just pure efficiency.

    How does it work?

    With LETSQL, you can easily cache sub-expressions representing upstream queries, i.e. `ibis.expr`. The data is then stored either in the upstream engine or locally, depending on your needs. The best part? This feature is designed to work with multiple engines, giving you the flexibility to choose the best storage option for your workflow.

    Ready to give it a try? Simply drop into an interactive IPython shell by running `nix run github:letsql/letsql`.

    We're still in the beta phase, and your feedback is incredibly valuable as we refine this feature and work towards a stable API. Let us know what you think!

    Learn more about the feature in our latest blog post: https://lnkd.in/eGpxwUaU

    #DataScience #MachineLearning #Caching #DataFrames

    LETSQL - Caching++ for DataFrames

    letsql.com
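The local-disk side of the caching idea described in the post above can be sketched in plain Python. This is a minimal illustration only, not LETSQL's API: the `DiskCache` class, its `get_or_compute` method, and the use of the query text as the cache key are all hypothetical choices made for this example.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class DiskCache:
    """Hypothetical helper: stores results on local disk, keyed by a hash of the query text."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path_for(self, key_text):
        # The same query text always maps to the same file name.
        digest = hashlib.sha256(key_text.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.pkl"

    def get_or_compute(self, key_text, compute):
        path = self._path_for(key_text)
        if path.exists():
            # Cache hit: read the stored result instead of recomputing it.
            with path.open("rb") as handle:
                return pickle.load(handle)
        # Cache miss: run the expensive computation once and persist the result.
        result = compute()
        with path.open("wb") as handle:
            pickle.dump(result, handle)
        return result


calls = []


def expensive_query():
    # Stand-in for pulling data from an upstream engine.
    calls.append(1)
    return [("a", 1), ("b", 2)]


cache = DiskCache(tempfile.mkdtemp())
first = cache.get_or_compute("SELECT key, value FROM source", expensive_query)
second = cache.get_or_compute("SELECT key, value FROM source", expensive_query)
```

The second call returns the stored result without touching the source again, which is the iteration speed-up the post describes. A real implementation would also need an invalidation strategy for when the upstream data changes.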

  • LETSQL

    96 followers

    We did it! We're thrilled that our GitHub repository, letsql, has reached its first 10 stars!

    A huge thank you to our amazing community for the support and encouragement. This milestone is just the beginning, and we couldn't have done it without you. Stay tuned for more updates and features as we continue to grow and improve.

    If you haven't already, check out our repo and join us on this exciting journey! https://lnkd.in/eeQkmAYa

    #GitHub #OpenSource #Milestone #LETSQL

  • LETSQL

    96 followers

    Harlequin DataFusion Adapter!

    Hey, y'all! We've just rolled out the Harlequin DataFusion Adapter and put together a tutorial on how you can build one for your own backend using Poetry.

    • Here's What's New: Smooth integration. Harlequin now works seamlessly with Apache DataFusion, boosting flexibility and performance.
    • Who Should Check This Out? Data scientists, engineers, and anyone dealing with data: this one's for you!
    • Why It Matters: Harlequin is a TUI IDE for SQL. It's simple and looks beautiful.
    • Tech Details: Get into the nitty-gritty of how to set it up and what makes it a crucial addition to your data toolkit.
    • Read More: https://lnkd.in/efACgUSi
    • Join the Conversation: Drop a comment with your thoughts or how you handle your data workflows. Let's share ideas and solutions!
    • Take Action: Are you ready to upgrade your data transformations? Head over to LETSQL and stay tuned for more updates. Like, share, and follow for the latest news!

    #DataScience #DataFusion #Harlequin #DataEngineering #AI

    LETSQL - Creating Harlequin Adapter for Apache DataFusion

    letsql.com

  • LETSQL

    96 followers

    Hussain S.

    Building data pipelines

    Ever grappled with juggling multiple data processing engines? Meet Ibis, the hidden gem of PyData that's about to revolutionize your data workflow.

    This post unpacks how Ibis seamlessly integrates with various backends and presents a concept where a single expression can run across multiple engines. The primary benefits are:

    1. Segment computation by data size or cost, in-situ or serverless.
    2. Apply database-style optimizations to pipelines with ML or complex operators.
    3. Right-size infrastructure based on consumption stages.

    Deep dive into the full post here: https://lnkd.in/ehNqCTwB

    Declarative Multi-Engine Data Stack with Ibis

    letsql.com

  • LETSQL

    96 followers

    Pushing Down ML in SQL Engines: An Exploration!

    In our latest installment of the LETSQL exploration series, Daniel Mesejo, our founding engineer, dives deep into the intersection of Machine Learning and SQL, presenting a research approach to enhancing XGBoost model inference through cross-domain optimization in SQL.

    What's Inside:

    • An in-depth demonstration of leveraging SQL's relational machinery to optimize an end-to-end ML inference pipeline.
    • Insightful examples of compiling XGBoost models into SQL, unlocking database-style optimizations such as predicate pushdowns, projection pushdowns, and constant folding.
    • A case study using Microsoft's Length of Stay dataset to demonstrate the significant UX and performance improvements achievable.

    Key Takeaways:

    • Discover how to transform a simple XGBoost model into a powerful SQL query.
    • Understand the impact of database-style optimizations on ML inference pipelines.
    • Explore the potential of SQL and ML to work hand in hand, paving the way for innovative data science practices.

    Read the full article here: https://lnkd.in/eA-p6Ei3

    Pushing down ML in SQL Engines: A Exploration

    letsql.com
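To make the idea of compiling a tree model into SQL concrete, here is a small sketch under stated assumptions. It is not the method from the linked article and does not read XGBoost's model format: the nested-dict `tree`, the `tree_to_sql` helper, and the `patients` table with its `age` and `visits` columns are all hypothetical, and SQLite stands in for the SQL engine. It shows a single decision tree rendered as a nested `CASE` expression; a boosted model is an ensemble of trees, so a real compiler would combine many such expressions.

```python
import sqlite3

# Hypothetical hand-written decision tree (not an XGBoost dump).
# Internal nodes split on "feature < threshold"; leaves carry a score.
tree = {
    "feature": "age",
    "threshold": 50,
    "left": {"leaf": 0.2},
    "right": {
        "feature": "visits",
        "threshold": 3,
        "left": {"leaf": 0.5},
        "right": {"leaf": 0.9},
    },
}


def tree_to_sql(node):
    """Recursively render a tree as a nested SQL CASE expression."""
    if "leaf" in node:
        return repr(float(node["leaf"]))
    return "CASE WHEN {feature} < {threshold} THEN {left} ELSE {right} END".format(
        feature=node["feature"],
        threshold=node["threshold"],
        left=tree_to_sql(node["left"]),
        right=tree_to_sql(node["right"]),
    )


score_expr = tree_to_sql(tree)

# Evaluate the compiled expression inside the database engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (age REAL, visits REAL)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [(30, 1), (60, 2), (70, 5)],
)
rows = conn.execute(
    f"SELECT {score_expr} AS score FROM patients ORDER BY rowid"
).fetchall()
scores = [row[0] for row in rows]
conn.close()
```

Because the model is now an ordinary SQL expression, the engine's optimizer can treat it like any other part of the query, which is what makes optimizations such as predicate and projection pushdown applicable to inference.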

  • LETSQL

    96 followers

    User Defined Aggregate Function (UDAF) Processing!

    UDFs and UDAFs are a great way to leverage a database's optimization machinery as a performant, general-purpose computation platform. Dive into our latest blog post, where we tackle two major challenges in UDAF processing:

    1. Generic Building & Registering of UDFs: Learn how to seamlessly integrate UDFs across engines like Pandas and DataFusion.
    2. Optimization Through UDF/UDAF Internals: An opportunity to make UDFs and UDAFs less of a black box, which can lead to relational plan optimizations.

    Our insights are not just theoretical; we provide practical solutions and examples. Whether you're a data scientist or an engineer, this guide is a must-read to elevate your data analysis strategies.

    Read the full post here: https://lnkd.in/eN-dkfpT
    Access our GitHub for more resources: https://lnkd.in/echN2Txz

    Join us in pushing the boundaries of data processing!

    #Python #pandas #datafusion

    Using DataFusion’s UDAFs to do ML Training: A LETSQL Exploration

    letsql.com

  • LETSQL

    96 followers

    Exciting Post From LETSQL!

    • New Blog Alert: Discover our latest article on crafting an XGBoost scoring pipeline using DataFusion, complete with a GitHub repository.
    • Cutting-Edge Techniques: We're pushing boundaries by integrating UDF machinery in DataFusion for one-hot preprocessing and XGBoost predictions. Plus, the inference pipeline is all SQL-friendly, making ML accessible to a broader community of analysts.
    • Micro-Benchmarks: Witness a 100x acceleration in preprocessing, outpacing traditional pandas methods.

    LETSQL for ML: XGBoost Scoring Pipeline with DataFusion: https://lnkd.in/eXQ7DRfp

    #MachineLearning #DataScience #XGBoost #DataFusion #SQL
