Scaling offline item recall using BigDL at Yahoo! JAPAN Shopping

The product recommender system for Yahoo! JAPAN Shopping is a multi-stage system consisting of offline item search (or item recall), offline item ranking, and online re-ranking (as shown above). For each product item, the item recall pipeline searches for 200 similar items within the same product category, using the vector (that is, item embedding) search algorithms provided by Faiss (Facebook AI Similarity Search).
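
To make the recall step concrete, here is a minimal sketch of per-category top-k similarity search. It is not the production pipeline: it uses a brute-force cosine-similarity scan in plain Python as a stand-in for Faiss, and the function names (`normalize`, `recall_similar_items`), the input layout, and the choice of cosine similarity are all assumptions made for illustration.

```python
import heapq
import math
from collections import defaultdict


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def recall_similar_items(items, k=200):
    """Return {item_id: [up to k most similar item_ids in the same category]}.

    `items` maps item_id -> (category, embedding). This layout is hypothetical;
    the brute-force scan below stands in for a Faiss index search.
    """
    # Group items by category so each search stays within one category.
    by_category = defaultdict(list)
    for item_id, (category, embedding) in items.items():
        by_category[category].append((item_id, normalize(embedding)))

    recalled = {}
    for members in by_category.values():
        for query_id, query_vec in members:
            # Score every other item in the category against the query item.
            scored = (
                (sum(a * b for a, b in zip(query_vec, vec)), other_id)
                for other_id, vec in members
                if other_id != query_id
            )
            # Keep the k highest-scoring neighbours, best first.
            top = heapq.nlargest(k, scored)
            recalled[query_id] = [other_id for _, other_id in top]
    return recalled
```

Because each category is searched independently, the work splits naturally by category. The brute-force scan costs time quadratic in the category size, which is why a dedicated vector search library such as Faiss is used at real catalogue scale.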

Yahoo! JAPAN Shopping trained the item embeddings on Nvidia GPUs, and previously also tried running Faiss on Nvidia GPUs for the offline item recall pipeline. However, this made the implementation rather complicated, because the data is stored entirely on HDFS; in addition, GPU resources are limited at Yahoo! JAPAN. As a result, Yahoo! JAPAN Shopping has adopted a new offline item recall pipeline using BigDL on Spark in production, which demonstrates more than a 3x speedup (using 80 Xeon cores) compared with 4 Nvidia V100 GPUs, and can be easily scaled to large clusters of hundreds of nodes with minimal effort.

For more details, you may refer to the technical blog at https://www.intel.com/content/www/us/en/developer/articles/technical/offline-item-search-with-bigdl-at-yahoo-japan.html

Luyang Wang

Senior Director, AI Center at Verizon

2 years ago

Congrats Jason and team!


More articles by Jason (Jinquan) Dai

  • Scaling Giant Model with Google’s GShard

    I recently came across an interesting paper from Google (GShard: Scaling Giant Models with Conditional Computation and…

  • Food Recommendation at Burger King

    Earlier this month, we published a guest blog at UC Berkeley RISELab blog website, which describes some of the…

  • Seamlessly Scaling AI for Distributed Big Data

    Early last month, I presented a half-day tutorial at this year’s virtual CVPR 2020. This is a very unique…
