Deep learning predicts ABC's Bachelorette/Bachelor #TheBachelorette
This deep network did really well on The Bachelor; how well it will do on The Bachelorette is TBD.

Why did we do it? Here’s the story and what it means for the field of AI.

Flying to LA a few months ago, waiting for the internet to become available on the flight, I wondered what our supercomputer was working on. I'm always wondering that, BTW. Oh, that's right: it is using all available resources to predict the rank of ABC's The Bachelor. Keep in mind our computer is more powerful than ~100,000 Mac laptops, and all of that power is directed towards one goal. It could be curing cancer, but instead it is trying to figure out which human will be picked by the Bachelor from a single photo.

During the past year we have solved some of the largest data sets on the market for companies ranging from entertainment to finance, moving beyond millions to hundreds of millions of images and audio files. Something that will surprise you is that teaching the computer to predict the season rank of ABC's Bachelor/Bachelorette from a few hundred images is very difficult. It is one of the hardest problems I have worked on, and it requires more computational resources than some of our largest problems. It has also required us to invent new super-nerd technologies never seen before in deep learning, like forced-evolution and continuous-fractional-max-pooling.

The Computer’s Math Problem: Reduce The Broken Hearts

Can the computer take a single image of a contestant and predict their final rank when they exit the show? A rank of 0.1 means they have won and have been selected as the final contestant; a rank of 1.0 means they have been voted off at the beginning. The vanity metric we are chasing is how well we can rank a new season sight-unseen. It is easy for a computer to memorize, but can it truly predict the future? In a way the computer is minimizing broken hearts (the broken heart theory hypothesizes that the higher the r-value, the fewer hearts will be broken).
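
To make the target concrete, here is a minimal sketch in Python of how a season's exit order could be mapped to that 0.1–1.0 rank and how the r-value would then be computed, assuming the r-value is a plain Pearson correlation. The helper function and numbers are illustrative only, not our internal pipeline.

import numpy as np
from scipy.stats import pearsonr

def rank_target(exit_order, n_contestants):
    # exit_order: 1 = first sent home, n_contestants = winner
    # linear map so the winner scores 0.1 and the first exit scores 1.0
    return 1.0 - 0.9 * (exit_order - 1) / (n_contestants - 1)

n = 28
actual = np.array([rank_target(i, n) for i in range(1, n + 1)])

# pretend scores from the face model (made up for illustration)
predicted = actual + np.random.normal(0, 0.3, size=n)

r, _ = pearsonr(predicted, actual)  # the "r-value" quoted throughout this post
print(round(r, 2))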

Predict A New Season? I Don’t Believe You.

Good, that is the right attitude to have with any AI problem: distrust the results until they are proven. To test this we will train on Bachelor seasons 11–21 and test on season 22, looking only at faces. Then for the Bachelorette we will train on seasons 3–11 and test on season 12. A rough sketch of the split follows.
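
The important part is that the test season is held out in its entirety; the file name and column names below are hypothetical, purely for illustration.

import pandas as pd

# hypothetical table with one row per contestant
contestants = pd.read_csv("bachelor_contestants.csv")

# train on seasons 11-21, hold out season 22 entirely so nothing is memorized
train = contestants[contestants["season"].between(11, 21)]
test = contestants[contestants["season"] == 22]

X_train, y_train = train["image_path"], train["rank"]
X_test, y_test = test["image_path"], test["rank"]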

Doing this we get… drum roll… really, we shouldn't be able to get any signal on this.

ABC’s The Bachelor Season 22:

Wow! Not only did we predict the actual winner, Lauren Burnham, we predicted the mistaken winner, Becca Kufrin, as our third pick. Also, for reference, an r-value of 0.45 is outstanding for anyone considering talent assessment. The fact that there is signal here is actually kind of terrible: what does it mean that the computer can pick the winner from a face? It looks like we could push this model to production and save ABC tens of millions in production costs and more than 28 broken hearts.

Bachelorette Season 12:

Building a model on seasons 3–11 and testing on season 12, we get this predicted rank from the face alone. Still solid performance on the ranking, with an r-value of 0.41.

Remember, an r-value of 0.41 is better than most recruiting algorithms, but the model had a major miss on the winner pick. So this could go into production and reduce broken hearts by 40%. The computer was pissed when Robert Hayes wasn't chosen in the final two. Huge mistake, humans… "I'll remember this for eternity," said the computer.

Bachelorette Season 14:

For season 14 the computer is predicting the following scores/ranks (lower is better). It is disappointing to have lost two of our top three picks out of the gate; as the season progresses we will calculate the r-value. The results of this model exclude season 13, which was considered an outlier compared to the previous seasons.

(0.0, 'chris.jpg'),
(1.6898692656543441, 'christian_banker.jpg'), OUT
(1.949007652946111, 'grant.jpg'),             OUT
(2.0675702572658863, 'john.jpg'),
(2.219903117608943, 'colton.jpg'),
(2.586861784549065, 'nick.jpg'),
(2.737004827353012, 'jordan.jpg'),
(2.918356119305923, 'lincoln.jpg'),
(3.1161169835343987, 'jason.jpg'),
(3.3780397105781614, 'jean_blanc'),
(3.705401677831895, 'garrett.jpg'),
(3.7131527551429655, 'willis.jpg'),
(3.922223827481912, 'joe.jpg'),               OUT
(4.051341002221742, 'conner.jpg'),
(4.282078766042514, 'jake.jpg'),              OUT
(4.336184094233388, 'ryan.jpg'),
(4.752649027464133, 'david.jpg'),
(5.381722295297683, 'kamil.jpg'),             OUT
(5.604681041718836, 'chase.jpg'),             OUT
(6.771207791806923, 'clay.jpg'),
(6.889416686033743, 'leo.jpg'),
(6.89900375314938, 'blake.jpg'),
(8.710523197244902, 'trent.jpg'),
(9.21238635412014, 'alex.jpg'),
(9.282190864396906, 'mike.jpg'),
(9.737171920861707, 'darius.jpg'),            OUT
(10.114542792299895, 'rickey.jpg'),
(12.0, 'christian_globetrotter.jpg')

If we include season 13 in the training we get the ranking below, which doesn't look very promising, but we will see how it plays out:

(0.0, 'david_Q.jpg'),
(0.7200031599499398, 'joe_Q.jpg'),       OUT
(1.1584597060012036, 'nick_Q.jpg'),
(1.2635562409659975, 'jason_Q.jpg'),
(1.5374698352642293, 'jake_Q.jpg'),      OUT
(1.585876892057379, 'alex_Q.jpg'),
(2.2618564201746443, 'john_Q.jpg'),
(2.536993842557146, 'conner_Q.jpg'),
(2.6265013545366225, 'willis_Q.jpg'),
(2.8298728573317673, 'chase_Q.jpg'),     OUT
(2.8548392598726977, 'ryan_Q.jpg'),
(2.8698721541489722, 'kamil_Q.jpg'),     OUT
(3.3015493381054464, 'grant_Q.jpg'),     OUT
(3.8703777667871933, 'jordan_Q.jpg'),
(3.9344346178359757, 'chris_Q.jpg'),
(4.3809739157130165, 'lincoln_Q.jpg'),
(4.741988300710415, 'christian_banker'), OUT
(4.944700271916133, 'rickey_Q.jpg'),
(5.0101952421042855, 'garrett_Q.jpg'),
(5.306225018946435, 'jean_blanc'),
(5.320599554496339, 'trent_Q.jpg'),
(5.505588796304663, 'colton_Q.jpg'),
(5.820118468808472, 'darius_Q.jpg'),
(6.17160202524656, 'blake_Q.jpg'),
(6.6826721777503035, 'clay_Q.jpg'),
(8.613534013210636, 'christian_globetrotter'),
(9.597175870352437, 'leo_Q.jpg'),
(12.0, 'mike_Q.jpg')

We will have to see how they perform. I'm hoping they don't predict anything; if that is the case, it means what you say and how you act matters for selection. Otherwise, we are just a bunch of animals.

So why did we do it? Hopefully we showed you an interesting use of AI, and you will comment/share/like in return. This dataset, though useless, motivated Ziff's internal deep-learning teams to invent new technologies for our customer base. This dataset should not be as predictive as we have shown it to be for seasons 22 and 12.

What is CFP (Continuous-fractional-pooling)?

Deep neural networks have a problem with being too aggressive with image downsampling. A researcher named Benjamin Graham demonstrated an approach he called fractional max pooling several years ago. It was considered fringe research and was not adopted by the AI giants; there were problems with the approach that kept it out of prime-time production. The results above were produced with a production-level take on fractional pooling that does not suffer from long training/inference times. Fractional max pooling downsamples by non-integer ratios, so the network works through many intermediate resolutions of your image instead of halving it at every layer, and it can achieve best-in-class accuracies.
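
Our continuous variant is not public, but the original idea is easy to try. Here is a minimal sketch using PyTorch's built-in FractionalMaxPool2d, which downsamples by a non-integer ratio instead of halving the feature map; the tensor sizes are arbitrary.

import torch
import torch.nn as nn

x = torch.randn(1, 16, 64, 64)  # a batch with one 16-channel 64x64 feature map

# ordinary max pooling halves the resolution: 64 -> 32
hard_pool = nn.MaxPool2d(kernel_size=2)
print(hard_pool(x).shape)  # torch.Size([1, 16, 32, 32])

# fractional max pooling shrinks by roughly 1/sqrt(2): 64 -> 45
frac_pool = nn.FractionalMaxPool2d(kernel_size=2, output_ratio=(0.707, 0.707))
print(frac_pool(x).shape)  # torch.Size([1, 16, 45, 45])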

#TheBachelorette Please like/comment/share if you like this and want to see more unusual AI content like this.

Mark Rasmussen

DMTS Cloud and AI Architect at Micron Technology

6y

So my wife is watching this right now... down to the last couple so I thought I would review your results... hmmm... we are not yet to the singularity! :)

Thang Duong

Engineer and Scientist

6y

Would be interesting to get NDCG metric for the ranking

Eser Sekercioglu

Intelligence Manager @meta | ex-Pinterest | Data Analyst | Irish Citizen

6y

Hi Ben, this is really interesting (and a little depressing :). I wonder how much of the signal is due to the human mind's preference for lateral symmetry. I think it would be great to compare your learning model to a much simpler symmetry-based model. Any thoughts on this?

Babak Ghalebi

Data Scientist at Meta

6y

Thanks for sharing this with us, Ben! What surprises me the most, obviously, isn’t the fact that there’s signal, it’s the fact that you were able to find it using only 300 images! This is mind blowing!! Is there a more detailed/technical blog post coming!? (Please say yes!)

Karthik Bangalore Mani

Machine Learning @ Scale @ Oracle

6y

Hi Ben. Is Continuous Fractional pooling different from Fractional pooling? If yes, could you please explain the difference? Thanks !
