Breaking the Jargons: Issue 6
Parul Pandey
Community & Open Source | Co-author of Machine Learning for High-Risk Applications | Kaggle Grandmaster (Notebooks)
Hi there!
I typically publish my newsletters on Revue, but this time I'm also posting it on LinkedIn. Why did I create a newsletter? Simple: I wanted to collate my writings, posts, and other interesting stuff in a single place so that they are easy to locate when needed.
In this edition, there are tutorials, paper summaries, and interviews with book authors. As always, I have shared my favorite data science resource and a tool of the month.
Articles
ImageNet is one of the most widely used datasets in Computer Vision applications. However, studies have shown biases prevalent in this dataset, stemming from the collection methodology and the types of images present. In this respect, a team of researchers at the Visual Geometry Group, the University of Oxford, have proposed a new dataset called PASS for self-supervised learning (SSL) model pretraining, specifically to address privacy and fairness issues. This article is a summary of the paper published by the team.
It is often said that one of the best ways to keep up to date with the latest happenings in the field of machine learning is by reading research papers. However, this is easier said than done. While many find them intimidating, others find it impossible to keep up with the daily dose of published papers. Hence, I have compiled a few tools that I use to organize, manage, and read my favorite research papers and get notified of the latest ones.
Interviews
In an endeavor to bring some of the notable work in the field of machine learning to the forefront, I started an interview series last year. During the first season, I presented stories from established data scientists and Kaggle Grandmasters, who shared their journeys, inspirations, and accomplishments. For the second season, I'm interviewing book authors. This edition of the interviews will bring to light the stories of some of the well-known authors in the data science field.
To kick off the series, I interviewed Alexey Grigorev, author of the book Machine Learning Bookcamp, a principal data scientist at OLX, and founder of DataTalks.Club, a community for data enthusiasts.
Tools
GitHub Octo
The GitHub Octo project is a way to auto-generate a bird's-eye view of a codebase and understand how the code is structured. The figure below is a visualization of the H2O-3 repository. You can try it out for yourself!
Resource of the Month
Introduction to Probability for Data Science is an undergraduate textbook on probability for data science by Stanley H. Chan, who has made the PDF version free. The book provides broad coverage from classical probability theory to modern data analytic techniques and includes code in MATLAB, Python, Julia, and R.
Source: https://probability4datascience.com/index.html
That is all for this edition. See you with another roundup next month. You can subscribe to receive the newsletter directly in your mailbox every month or share it with someone who could find it helpful.
Until next month,
Parul
Read the previous editions below:
Data Scientist | Technical Writer | Editor
2y: I am a fan boy and I am proud to say it.
Product Engineer @ SuperKalam | ML and GenAI | Kaggle 1x Expert
3y: Hey Parul Pandey, this newsletter is really amazing.
Application Developer at Fujitsu, Bangalore
3y: Informative.
Infosys | Ex TSE at Seabird Logisolutions Limited
3y: Congratulations and thank you.
Machine Learning Engineer
3y: Thanks Parul Pandey for introducing tools for managing research papers. I really love arxiv-vanity. It greatly reduces the distance between content and references.