My Three Ex’s: A Data Science Approach for Applied Machine Learning

Today, I gave a talk at QCon SF entitled “My Three Ex’s: A Data Science Approach for Applied Machine Learning”. The talk wasn’t about machine learning as such, but rather about applying machine learning to solve problems.

Hence my three ex’s:

Express: Understand your utility and inputs.

  • Choose an objective function that models utility.
  • Be careful how you define precision.
  • Account for non-uniform inputs and costs.
  • Stratified sampling is your friend.
  • Express yourself in your feature vectors.
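
To make the stratified sampling point concrete, here is a minimal sketch in Python. It assumes a hypothetical query log of (query, frequency) pairs and an arbitrary head/tail cutoff of 1,000; the data, the `bucket` function, and the cutoff are illustrative and do not come from the talk. The idea is that a uniform sample of traffic would be dominated by head queries, so drawing a fixed number from each stratum keeps rare inputs visible.

```python
import random
from collections import defaultdict


def stratified_sample(items, stratum_of, per_stratum, seed=0):
    """Draw up to `per_stratum` items, without replacement, from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }


# Hypothetical query log: (query, frequency). Illustrative numbers only.
queries = [
    ("jobs", 15000),
    ("linkedin", 9000),
    ("java developer", 1200),
    ("data scientist sf", 40),
    ("qcon speaker", 3),
    ("ml search quality", 2),
]


def bucket(query_row):
    """Assumed stratification rule: head vs. tail by an arbitrary cutoff."""
    _, frequency = query_row
    return "head" if frequency >= 1000 else "tail"


sample = stratified_sample(queries, bucket, per_stratum=2)
for name, rows in sorted(sample.items()):
    print(name, rows)
```

When computing a metric over such a sample, each stratum's result would typically be reweighted by its share of real traffic (or of cost), so the estimate still reflects overall utility.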

Explain: Understand your models and metrics.

  • Accuracy isn’t everything.
  • Less is more when it comes to explainability.
  • Don’t knock linear models and decision trees!
  • Start with simple models, then upgrade.
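
A small worked example of why accuracy isn't everything, sketched in Python. The labels and predictions are made up for illustration (5 positives among 100 examples) and are not from the talk. A baseline that always predicts the majority class scores higher on accuracy than a simple model that actually finds most of the positives.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Define precision/recall as 0.0 when the denominator is empty.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Illustrative, imbalanced data: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Baseline: always predict the majority (negative) class.
always_negative = [0] * 100

# Hypothetical simple model: finds 4 of 5 positives, with 6 false positives.
simple_model = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

baseline_metrics = evaluate(y_true, always_negative)
model_metrics = evaluate(y_true, simple_model)
print("baseline:", baseline_metrics)
print("model:   ", model_metrics)
```

The baseline wins on accuracy while retrieving nothing, which is why the metric has to be chosen to reflect utility before models are compared.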

Experiment: Optimize for the speed of learning.

  • Kiss lots of frogs: experiments are cheap.
  • But test in good faith – don’t just flip coins.
  • Optimize for the speed of learning.
  • Be disciplined: test one variable at a time.
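
One way to check that an experiment is more than a coin flip is a significance test on the metric being compared. The sketch below uses a two-proportion z-test in plain Python; the choice of test and the click counts are assumptions for illustration, not something the talk prescribes. It compares a control against a treatment that differs in a single variable.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value), where z > 0 means variant B has the higher rate.
    """
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: clicks out of searches for control (A) and treatment (B).
z, p_value = two_proportion_z_test(1000, 10000, 1200, 10000)
print(f"z = {z:.2f}, p = {p_value:.6f}")
```

Because only one variable differs between the two arms, a small p-value can be attributed to that change rather than to an unknown mix of changes.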

I peppered the talk with examples from my experiences working on search quality at LinkedIn and Google:

  • Modeling search quality and searcher effort.
  • Mapping local businesses to their official home pages.
  • Segmenting search models based on searchers and queries.
  • Automatically rewriting search queries to improve relevance.
  • Entity-based search suggestions.

For those who weren’t able to hear the talk live, I hope the slides prove useful. And I’ll share the video as soon as it’s available.


Finally, I’d like to thank Brendan Collins, Gloria Lau, and Monica Rogati, three of my favorite ex-coworkers, to whom I had the pleasure of dedicating this talk. I learned so much from working with all of you, and I look forward to working with you again someday.

Gordon Rios

Founding Scientist | Venture-funded AI (We're hiring)

10 years ago

Great advice daniel super useful!
