Data Science Resources, ETL Practices, Beginner’s guide to Seaborn

1. Most Active Data Scientists, Free Books, Notebooks & Tutorials on GitHub

In this article, I’ve listed the most active data scientists on GitHub so that you can follow them and see what they are up to (especially their projects). Before moving forward, check out this roughly 2-minute video on students using GitHub!


  • Open Source Data Science – This repository encourages you to leverage open-source education and become a self-taught data scientist. If you prefer learning from books over any other method, you have a lot to take away from this repository.
  • Python Projects – Keen to do interesting Python projects but don’t know where to start? Check out some interesting projects done in Python, understand them, and they may inspire you to start one of your own.

If you check their profiles, you’ll see that they have avidly contributed knowledge in the form of books, projects, and tutorials for the benefit of the worldwide ML community. You can check out the original article here.

Level: Beginner

2. Good ETL Practices with Apache Airflow

  • In an Extract, Transform, Load (ETL) process, data is pulled (extracted) from a source system, transformed into a format that can be analyzed, and stored (loaded) in a warehouse or other system.
  • Extract, Load, Transform (ELT) is an alternative, albeit related, approach designed to push processing to the database to improve performance.
  • In this guide, we will cover good practices for ETL implementation, using a data stream implemented on the Apache Airflow platform.
  • If there are no errors, access the Apache Airflow user interface at its address (wait about 5 minutes before opening the terminal).
  • To get a full picture of their assets and operations, organizations move data from that large number of sources into a data warehouse or data lake and run analyses against it.
  • Connectors: data sources and destinations. In a digital technology ecosystem, many devices hold a great diversity of data and objects, kept in object storage that can be defined as a Data Lake; a set of these constitutes Big Data.
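
The extract, transform, and load steps described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Airflow pipeline from the original guide: the CSV content, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read from a file, an API, or a database.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,3
3,carol,not_a_number
"""


def extract(text):
    """Extract: pull raw rows out of the source system (here, a CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise customer names, cast amounts, and skip unparseable rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is not a number
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(rows, conn):
    """Load: store the cleaned rows in the target system (here, SQLite)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run the three steps in order and return (row count, total amount)."""
    load(transform(extract(RAW_CSV)), conn)
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM orders").fetchone()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(connection))
    connection.close()
```

Keeping each step in its own function mirrors how an orchestrator such as Airflow would schedule them as separate tasks. In an ELT variant, the raw rows would be loaded first and the cleaning would be pushed into SQL run by the database.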

Check out this amazing introductory video on ETL by codebasics. You can also check out the entire article here.


Level: Beginner

3. A Beginner’s Guide To Seaborn: The Simplest Way to Learn



  • Seaborn allows us to make complicated plots even in a single line of code!
  • In this tutorial, we will be using three libraries to get the job done — Matplotlib, Seaborn, and Pandas.
  • A box plot is used for depicting groups of numerical data through their quartiles.
  • Factor plots make it easy to separate plots by categorical classes. You can make more visualizations like these simply by changing the variable names and running the same lines of code.
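
As a rough sketch of the points above, the snippet below draws a box plot and a factor plot from a small hand-made DataFrame. The data and the column names (`species`, `sex`, `length`) are hypothetical, and the `Agg` backend is an assumption so the script also runs without a display. Since Seaborn 0.9 the factor plot is exposed as `catplot` (it was renamed from `factorplot`), so that is the call used here.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data: 12 measurements across two categorical columns.
df = pd.DataFrame(
    {
        "species": ["a", "a", "a", "b", "b", "b"] * 2,
        "sex": ["f"] * 6 + ["m"] * 6,
        "length": [4.1, 4.5, 5.0, 6.2, 6.8, 7.1, 4.8, 5.2, 5.9, 7.0, 7.4, 8.1],
    }
)


def make_plots(data):
    """Draw a box plot and a faceted categorical plot; return both objects."""
    fig, ax = plt.subplots()
    # One line of Seaborn: quartiles of `length` for each species.
    sns.boxplot(data=data, x="species", y="length", ax=ax)
    # catplot splits the same plot into one panel per value of `sex`.
    grid = sns.catplot(data=data, x="species", y="length", col="sex", kind="box")
    return ax, grid


if __name__ == "__main__":
    box_ax, cat_grid = make_plots(df)
    box_ax.figure.savefig("boxplot.png")
    cat_grid.savefig("catplot.png")
```

Swapping the column names passed to `x`, `y`, and `col` is all it takes to produce the same plots for other variables.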

Level: Beginner

You can check out the entire article here.

Conclusion

I hope you found this blog post insightful. Please share it with your friends and family. You can reach out to me on LinkedIn. I am quite active there and will be happy to have a conversation with you. Please feel free to drop your feedback in the comments; it helps me improve the quality of my work. I will keep sharing more content as I grow and mature as a Data Scientist. Until next time, Keep Hustling & Keep Up with Data Science. Happy Learning!
