Photo by Adi Goldstein on Unsplash

At a ghost's birthday party, it's easy to be the life of the party

It was my birthday when this issue went out in July! So, before I'm off to celebrate, let's look at some awesome machine learning!

The Latest Fashion

  • Last week I shared the Dall-E 2 prompt book. Now you can sign up to the open Dall-E 2 beta.
  • Most practitioners know tabular data is for XGBoost. This new paper by Gaël Varoquaux et al. backs this up.
  • Meta, of all places, launches a tool Wikipedia is using for fact-checking. It's so weird, I lapped up the article about it.

This is "Light to the Party". All links and extra content can be found in the full issue from back when. Want the latest in your inbox? Join 900+ other curious minds.

My Current Obsession

Pythondeadlin.es has been making the rounds too and was featured in NotANumber and shared around on Twitter. It feels nice to have contributed a tool that seems to find such wide appeal.

[...]

I also had a small tweet about a neat pandas feature go viral. Should I expand on this?

Hot off the Press

After making YouTube Partner, I celebrated with a video of 100 machine learning tips and tricks. I also wrote a blog post with a lot of extra information, links, and code snippets to go with the story.

Machine Learning Insights

Last week I asked, "What is the advantage of mini-batch learning?" Here's the answer:

Mini-batch learning usually comes up in the context of neural networks and gradient descent. Classically, gradient descent refers to the numerical optimisation of an objective function (a neural network's loss, or any other differentiable objective) to find a (hopefully) global minimum and, therefore, the best solution. Full-batch gradient descent takes in every data point, calculates the error over all of them, and computes the gradient from that. Following the negative gradient reduces the error and therefore brings us closer to the best solution. Very vague, I know.

Why is this so vague?

Gradient descent can be used on a ton of different problems, not just neural networks. It's very popular in physics because it turns out all our differential equations are exactly that: differentiable. Great for gradients. So we have our model, our objective function, and our observations. We throw gradient descent at it, update the model, throw gradient descent at it again, and keep optimising until we can't get a better solution. So we can really use gradient descent for a ton of numerical optimisation problems, hence the vagueness. It's quite universal.
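To make the full-batch versus mini-batch contrast concrete, here's a minimal sketch of both on a toy linear regression in NumPy. The data, learning rate, number of epochs, and batch size are arbitrary choices for illustration, not anything from the issue:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy regression data: y = 3x + 2 plus a little noise.
X = rng.uniform(-1, 1, size=(1000, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 0.1, size=1000)


def gradient(w, b, X_batch, y_batch):
    """Gradient of 0.5 * mean squared error for the linear model w * x + b."""
    err = w * X_batch[:, 0] + b - y_batch
    return (err @ X_batch[:, 0]) / len(y_batch), err.mean()


def full_batch_gd(steps=200, lr=0.5):
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Every single update touches the entire dataset.
        dw, db = gradient(w, b, X, y)
        w, b = w - lr * dw, b - lr * db
    return w, b


def mini_batch_gd(epochs=20, batch_size=32, lr=0.5):
    w, b = 0.0, 0.0
    n = len(y)
    for _ in range(epochs):
        # Shuffle once per epoch, then update on small random slices.
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            dw, db = gradient(w, b, X[batch], y[batch])
            w, b = w - lr * dw, b - lr * db
    return w, b


print("full batch:", full_batch_gd())  # roughly (3.0, 2.0) after 200 passes
print("mini batch:", mini_batch_gd())  # roughly (3.0, 2.0) after 20 passes
```

The mini-batch version makes many cheap updates per pass over the data instead of one expensive update over everything, which is one reason it stays practical once a dataset no longer fits comfortably in memory.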

[...]

This is "Light to the Party". Read the full issue here. Want the latest in your inbox? Join 900+ other curious minds.

Data Stories

I have to be honest here. I’m not a space person. I’m not particularly fascinated by landing on the moon or the stars.

But I do get the fascination with the new James Webb Space Telescope. Whether it's Hank Green sharing the first image or Kirsten Banks talking about the space origami, it's a lovely science world on TikTok. It was neat. The technical solutions they came up with? Awesome.

But it took Webb Compare for me to realize the step change between Hubble and the JWST. Look at how much more detailed and in-depth these images are!

Question of the Week

  • What is the Double Descent phenomenon?

Post your answers on Twitter and tag me. I'd love to see what you come up with. Then I can include them in the next issue!

Tidbits from the Web

This is "Light to the Party". All links and extra content can be found in the full issue from back when. Want the latest in your inbox? Join 900+ other curious minds.
