I often confuse maps and poker, because you fold on the river
Photo by Andrew Neel on Unsplash


I went viral this week and put a massive amount of effort into creating content. I would say it has already paid off: we are now 400 subscribers strong. Welcome to all the new folks! Let’s dive into some machine learning awesomeness!

The Latest Fashion

  • Jeremy Howard wrote an excellent overview of how he reached the #1 spot in a current Kaggle competition, with notebooks and videos!
  • Want faster vector operations? Bolt compresses real-valued vectors and speeds up computation 10x.
  • I shared prettymaps before, and here’s an interactive Streamlit app for easy usage. (Tag @jesperdramsch on Twitter if you post some!)

This is "Light to the Party". All links and extra content can be found in the full issue from last week. Want the latest in your inbox? Join 444+ other curious minds.

Hot off the Press

Did I mention this week was incredible? Let’s go!

Daliana Liu invited me on The Data Scientist Show, where we had a 2-hour conversation about weather forecasting with machine learning, missing data, my PhD experience and mental health, learnings from Kaggle, and so much more! (also audio only)

I posted the most viral thing I have ever produced on LinkedIn, which over 200,000 people saw in a day. You saw prettymaps eight months ago in this newsletter, and LinkedIn loved it. Twitter was lukewarm about it. I know many of you have made it over here from that, so welcome!

More blog posts! I wrote three pieces recommending books for data professionals: one about SQL books to try, one about AI books to learn from, and one about R books.

Finally, I published a YouTube Short about improving machine learning models using data augmentation (also on TikTok and Instagram).

Machine Learning Insights

Last week I asked you, “What is target encoding, and what do you have to avoid when applying it?” Here’s my answer:

Target encoding is an encoding method for categorical features in data.

Frequently, we encode data with ordinal encoding, simply assigning a unique number to each category. Alternatively, one-hot encoding assigns a separate binary column to each category in a feature. Neither of these encodings works well with high-cardinality features.
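
To make this concrete, here’s a minimal sketch of both encodings in pandas; the toy “city” column and its values are made up for illustration:

    import pandas as pd

    # Toy data: one categorical feature
    df = pd.DataFrame({"city": ["Oslo", "Bergen", "Oslo", "Tromsø"]})

    # Ordinal encoding: one unique integer per category
    df["city_ordinal"] = df["city"].astype("category").cat.codes

    # One-hot encoding: one binary column per category
    one_hot = pd.get_dummies(df["city"], prefix="city")
    print(df.join(one_hot))

With thousands of categories, one-hot encoding explodes into thousands of mostly-zero columns, which is exactly the high-cardinality problem.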

Target encoding attempts to encode input data according to its effect on the target variable (hence its name). There are different ways to implement it, not least because it can be a way for the target to leak into the training data. The simplest way is to count the samples with a positive target within a category and divide by the total number of samples in that category.
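
For a binary target, that count-and-divide is just the per-category mean of the target. A minimal sketch, again with a made-up toy frame:

    import pandas as pd

    df = pd.DataFrame({
        "city":   ["Oslo", "Oslo", "Bergen", "Bergen", "Tromsø"],
        "target": [1, 0, 1, 1, 0],
    })

    # Naive target encoding: per-category mean of the target
    # (positives / total within each category for a binary target)
    means = df.groupby("city")["target"].mean()
    df["city_te"] = df["city"].map(means)
    print(df)

Note that this naive version computes the means on the very data it encodes, which is where the leakage comes from.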

The problems with target encoding are twofold. The aforementioned target leakage can wrongfully boost accuracy. Additionally, the data we collected may not fully represent the actual population in a category, which produces an incorrect target encoding for it.
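
One common mitigation is additive smoothing: blend each category’s mean with the global mean, so small, poorly represented categories get pulled toward the global average. This is a sketch under my own naming (the helper and the weight m are illustrative, not from the post linked below), reusing the toy df from the previous sketch; computing the encoding out-of-fold is the usual extra step against leakage:

    # Smoothed target encoding for a categorical column
    def smoothed_target_encode(df, col, target, m=10):
        global_mean = df[target].mean()
        stats = df.groupby(col)[target].agg(["mean", "count"])
        # Blend: category mean weighted by its count, global mean weighted by m
        smooth = (stats["count"] * stats["mean"] + m * global_mean) / (stats["count"] + m)
        return df[col].map(smooth)

    df["city_te_smooth"] = smoothed_target_encode(df, "city", "target")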

Learn more about it in this neat blog post.

This is "Light to the Party". All links and extra content can be found in the full issue from last week. Want the latest in your inbox? Join 444+ other curious minds.

Question of the Week

  • What is the difference between generative and discriminative models?

Post your answers on Twitter and tag me. I’d love to see what you come up with. Then I can include them in the next issue!

Tidbits from the Web

  • The world’s most hated art style (and why the original is so much better).
  • Random numbers are complex. So how about we generate them with a banana?
  • The second-best thing to never sending an email again.

You just read issue #84 of Light to the Party. You can also browse the full archives of this newsletter.
