Expand Your Data Science Toolkit with Our Latest Math and Stats Must-Reads

Towards Data Science

Your home for data science. A publication sharing concepts, ideas and codes.

Published: April 25, 2024

Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.

The fundamental principles of math that data scientists use in their day-to-day work may have been around for centuries, but that doesn’t mean we should approach the topic as if we only learn it once and then store away our knowledge in some dusty mental attic. Practical approaches, tools, and use cases evolve all the time, and with them comes the need to stay up to date.

This week, we’re thrilled to share a strong lineup of recent math and stats must-reads, covering a wide range of questions and applications. From leveraging (very) small datasets to presenting linear regressions in accessible, engaging ways, we’re sure you’ll find something new and useful to explore. Let’s dive in!

N-of-1 Trials and Analyzing Your Own Fitness Data. The idea behind N-of-1 studies is that you can draw meaningful insights even when the data you’re using is based on input from a single person. It has far-reaching potential for designing individualized healthcare strategies, or, in the case of Merete Lutz’s fascinating project, establishing meaningful connections between alcohol consumption and sleep quality.
How Reliable Are Your Time Series Forecasts, Really? Making long-term predictions is easy; making accurate long-term predictions is, well, less so. Bradley Shaw FIA recently shared a useful guide to help you determine the reliability horizon of your forecasts through the effective use of cross-validation, visualization, and statistical hypothesis testing.
Building a Math Application with LangChain Agents. Despite the major strides LLMs have made in the past couple of years, math remains an area they struggle with. In her latest hands-on tutorial, Tahreem Rasul unpacks the challenges we face when we try to make these models execute mathematical and statistical operations, and outlines a solution for building an LLM-based math app using LangChain agents, OpenAI, and Chainlit.

A Proof of the Central Limit Theorem. It’s always a joy to see an abstract concept take concrete shape and, along the way, become much more accessible and intuitive for learners. That’s precisely what Sachin Date accomplishes in his latest deep dive, which shows us the inner workings of the central limit theorem, “one of the most far-reaching and delightful theorems in statistical science,” through the example of… candy!
8 Plots for Explaining Linear Regression to a Layman. Even if you, a professional data scientist or ML engineer, fully grasp the implications of your statistical analyses, chances are many of your colleagues and other stakeholders won’t. This is where strong visualizations can make a major difference, as Conor O'Sullivan demonstrates with eight different residual, weight, effect, and SHAP plots that explain linear regression models effectively.