
Data Analysis Project

Heerthi Raja H

Computer Vision | CV/Robotics Enthusiast | Sharing my lessons | Learning and building in public!

Published: October 22, 2023

Analysing Placement dataset and taking insights from the data.

Importing the Libraries.

Dataset

First, we check whether the dataset contains any null or NaN values.
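A minimal sketch of that check, assuming the data is held in a pandas DataFrame named `dataset` (the name used later in this article). The rows below are invented stand-ins, not the real placement data.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the placement data; values are invented for illustration.
dataset = pd.DataFrame({
    "status": ["Placed", "Not Placed", "Placed", "Not Placed"],
    "mba_p": [58.8, 59.4, 66.3, 52.7],
    "salary": [270000.0, np.nan, 250000.0, np.nan],
})

# Count missing values per column; any non-zero entry needs attention.
null_counts = dataset.isna().sum()
print(null_counts)
```

On the real dataset, the same call is what would reveal which columns contain missing values.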

1. Replace the NaN values with the correct ones and justify your choice.

In the placement dataset, the salary column has 67 NaN values.

Students who were not placed do not receive a salary, so we replaced their missing salary values with 0 using fillna.
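The replacement can be sketched as follows. The `status` column name and its "Placed" / "Not Placed" labels are assumptions about the dataset, and the rows are invented. The sanity check before `fillna` confirms that every missing salary really belongs to a not-placed student, which is what justifies using 0.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the placement data; values are invented for illustration.
dataset = pd.DataFrame({
    "status": ["Placed", "Not Placed", "Placed", "Not Placed"],
    "salary": [270000.0, np.nan, 250000.0, np.nan],
})

# Sanity check: every missing salary belongs to a student who was not placed.
missing = dataset["salary"].isna()
all_missing_unplaced = bool((dataset.loc[missing, "status"] == "Not Placed").all())
print("All missing salaries are for not-placed students:", all_missing_unplaced)

# Replace the missing salaries with 0.
dataset["salary"] = dataset["salary"].fillna(0)
print(dataset)
```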

2. How many of them are not Placed?

The number of Not Placed students is 67.

3. Find the reason for non-placement from the dataset.

- We find the median scores of not-placed students and placed students and then compare them to find the reason.

The reason for non-placement:

We computed the median of each score column for not-placed and placed students and compared them. The median gives the typical (middle) score of each group at every stage.

From this, we found that the median score of not-placed students is below 68% in every column from ssc_p to mba_p, while the median score of placed students is above 68% from ssc_p to etest_p.

This suggests that students who scored below 68% tended not to be placed, while students who scored above 68% tended to be placed.
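The comparison of medians could look like the sketch below. The `status` and `degree_p`/`hsc_p` column names are assumptions about the dataset, and the scores are invented.

```python
import pandas as pd

# Toy stand-in for the placement data; scores are invented for illustration.
dataset = pd.DataFrame({
    "status": ["Placed", "Placed", "Placed",
               "Not Placed", "Not Placed", "Not Placed"],
    "ssc_p": [67.0, 79.3, 73.6, 56.0, 55.0, 62.0],
    "hsc_p": [91.0, 78.3, 73.6, 52.0, 49.8, 47.0],
    "degree_p": [58.0, 77.5, 73.3, 52.0, 67.3, 50.0],
    "etest_p": [55.0, 86.5, 96.8, 66.0, 55.0, 76.0],
    "mba_p": [58.8, 66.3, 55.5, 59.4, 51.6, 54.9],
})

score_cols = ["ssc_p", "hsc_p", "degree_p", "etest_p", "mba_p"]

# One row per placement status, one column per score: the group medians.
medians = dataset.groupby("status")[score_cols].median()
print(medians)
```

Reading the two rows side by side shows at which stages the groups differ most.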

4. What kind of relation is there between salary and mba_p?

Using correlation, we measured the relationship between the two columns, salary and mba_p. The correlation coefficient is about 0.13 (13%), so the two move in the same direction. It is a positive correlation, though a weak one.
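A small sketch of the correlation step, assuming a pandas DataFrame with `mba_p` and `salary` columns. The numbers are invented, so the coefficient printed here will not match the 13% reported above.

```python
import pandas as pd

# Toy stand-in for the placement data; values are invented for illustration.
dataset = pd.DataFrame({
    "mba_p": [58.8, 66.3, 55.5, 62.1, 60.0],
    "salary": [270000.0, 200000.0, 250000.0, 425000.0, 252000.0],
})

# Series.corr computes the Pearson correlation coefficient by default.
r = dataset["mba_p"].corr(dataset["salary"])
print(f"Correlation between mba_p and salary: {r:.3f}")
```

A value near +1 or -1 indicates a strong linear relationship, and a value near 0 a weak one.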

5. Which specialization gets the minimum salary?

Both the Mkt&HR and Mkt&Fin specializations get the minimum salary, which is 2,00,000.

6. How many of them get a salary above 500000?
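One way to get the minimum salary per specialization is sketched below. It assumes columns named `specialization`, `status`, and `salary`, and restricts the calculation to placed students so that the 0 used for not-placed students does not become the minimum. The rows are invented.

```python
import pandas as pd

# Toy stand-in for the placement data; values are invented for illustration.
dataset = pd.DataFrame({
    "specialization": ["Mkt&HR", "Mkt&Fin", "Mkt&Fin", "Mkt&HR", "Mkt&HR"],
    "status": ["Placed", "Placed", "Placed", "Placed", "Not Placed"],
    "salary": [200000.0, 200000.0, 425000.0, 252000.0, 0.0],
})

# Keep only placed students, then take the lowest salary in each specialization.
placed = dataset[dataset["status"] == "Placed"]
min_salary = placed.groupby("specialization")["salary"].min()
print(min_salary)
```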


Ans: 3 students in the dataset get a salary above 500000. Of these, 2 are male and 1 is female.
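A sketch of how such a count can be produced, assuming `gender` and `salary` columns. The rows are invented, so the counts here differ from the answer above.

```python
import pandas as pd

# Toy stand-in for the placement data; values are invented for illustration.
dataset = pd.DataFrame({
    "gender": ["M", "F", "M", "F", "M"],
    "salary": [940000.0, 650000.0, 300000.0, 250000.0, 0.0],
})

# Filter the rows above the threshold, then count them overall and by gender.
high = dataset[dataset["salary"] > 500000]
count_high = len(high)
by_gender = high["gender"].value_counts()

print("Students above 500000:", count_high)
print(by_gender)
```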

7. Test the analysis of variance between etest_p and mba_p at a significance level of 5% (make the decision using hypothesis testing).

ANOVA: Analysis of Variance

H0: There is no significant difference between these columns. H1: There is a significant difference between these columns.

The p-value is greater than 5%, so we accept H0 (that is, we fail to reject it) and reject H1.
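A one-way ANOVA of this kind can be sketched with `scipy.stats.f_oneway`. Using SciPy is an assumption, since the article does not show which library was used, and the scores below are invented.

```python
from scipy import stats

# Toy scores, invented for illustration.
etest_p = [55.0, 86.5, 75.0, 66.0, 96.8, 55.0]
mba_p = [58.8, 66.3, 57.8, 59.4, 55.5, 51.6]

# One-way ANOVA: H0 says the group means are equal.
f_stat, p_value = stats.f_oneway(etest_p, mba_p)

alpha = 0.05  # 5% significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"F = {f_stat:.3f}, p = {p_value:.4f} -> {decision}")
```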

8. Test the similarity between degree_t (Sci&Tech) and specialization (Mkt&HR) with respect to salary at a significance level of 5% (make the decision using hypothesis testing).

To test similarity, we use a t-test.

Independent samples, unpaired t-test: different groups (degree_t, specialization) but the same condition (salary).

The p-value is less than 5%, so we accept the alternative hypothesis and reject the null hypothesis.
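The unpaired test can be sketched with `scipy.stats.ttest_ind`. SciPy and the choice of Welch's variant (`equal_var=False`, which does not assume equal variances) are assumptions, not necessarily what was used for the result above. The salaries are invented.

```python
from scipy import stats

# Toy salaries, invented for illustration.
# Group 1: students whose degree_t is Sci&Tech.
sci_tech_salary = [270000.0, 425000.0, 250000.0, 360000.0, 300000.0]
# Group 2: students whose specialization is Mkt&HR.
mkt_hr_salary = [200000.0, 252000.0, 231000.0, 260000.0, 216000.0]

# Independent (unpaired) t-test: H0 says the two group means are equal.
t_stat, p_value = stats.ttest_ind(sci_tech_salary, mkt_hr_salary, equal_var=False)

alpha = 0.05  # 5% significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.4f} -> {decision}")
```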

9. Convert the normal distribution to the standard normal distribution for the salary column.

stdNBgraph(dataset["salary"])
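`stdNBgraph` is a helper whose definition is not shown in this article. As a rough sketch of the underlying idea only, standardizing a column means subtracting its mean and dividing by its standard deviation (the z-score). The salaries below are invented.

```python
import pandas as pd

# Toy salaries, invented for illustration.
salary = pd.Series([270000.0, 200000.0, 250000.0, 425000.0, 252000.0])

# z-score: after this transform the values have mean 0 and standard deviation 1.
z = (salary - salary.mean()) / salary.std()

print(z)
print("mean:", round(z.mean(), 6), "std:", round(z.std(), 6))
```

Plotting a histogram of `z` would then show the distribution on the standard scale.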

10. What is the probability density function of the salary range from 700000 to 900000?

get_pdf_probability(dataset["salary"],700000,900000)

The probability density function value for the salary range from 700000 to 900000 = 0.0005
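`get_pdf_probability` is also a helper whose definition is not shown here, so the sketch below is only one plausible reading of it: fit a normal distribution to the salaries and take the area under its density between the two bounds, which equals the difference of the cumulative distribution at those bounds. It uses the standard library's `statistics.NormalDist`, and the salaries are invented, so the result will not match the figure above.

```python
from statistics import NormalDist, mean, stdev

# Toy salaries, invented for illustration.
salaries = [270000.0, 200000.0, 250000.0, 425000.0, 252000.0, 940000.0]

# Fit a normal distribution using the sample mean and standard deviation.
dist = NormalDist(mu=mean(salaries), sigma=stdev(salaries))

# P(700000 <= salary <= 900000) = CDF(upper) - CDF(lower).
lower, upper = 700000, 900000
prob = dist.cdf(upper) - dist.cdf(lower)
print(f"P({lower} <= salary <= {upper}) = {prob:.4f}")
```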

11. Test the similarity between etest_p and mba_p for degree_t (Sci&Tech) at a significance level of 5% (make the decision using hypothesis testing).

Dependent samples, paired t-test: same group (degree_t) but different conditions (etest_p, mba_p).

Ans: We accept the null hypothesis and reject the alternative hypothesis. There is no significant difference between the etest_p and mba_p marks of the degree_t (Sci&Tech) group.
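The paired test can be sketched with `scipy.stats.ttest_rel`, which compares two measurements taken on the same students. SciPy is an assumption here, and the scores are invented.

```python
from scipy import stats

# Toy scores for the same five Sci&Tech students, invented for illustration.
# Position i in both lists refers to the same student.
etest_p = [55.0, 86.5, 75.0, 66.0, 96.8]
mba_p = [58.8, 66.3, 57.8, 59.4, 55.5]

# Paired (dependent) t-test: H0 says the mean of the per-student differences is 0.
t_stat, p_value = stats.ttest_rel(etest_p, mba_p)

alpha = 0.05  # 5% significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.3f}, p = {p_value:.4f} -> {decision}")
```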

12. Which parameter is most highly correlated with salary?

ssc_p and salary have the strongest relationship, with a correlation of 0.538090. The other columns have smaller values.

Ans: ssc_p is the most highly correlated with salary.
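Ranking every numeric column by its correlation with salary could be done as in this sketch. The column names follow the ones mentioned in this article, and the values are invented, so the ranking printed here says nothing about the real data.

```python
import pandas as pd

# Toy stand-in for the placement data; values are invented for illustration.
dataset = pd.DataFrame({
    "gender": ["M", "F", "M", "F", "M"],
    "ssc_p": [67.0, 79.3, 65.0, 56.0, 85.8],
    "etest_p": [55.0, 86.5, 75.0, 66.0, 96.8],
    "mba_p": [58.8, 66.3, 57.8, 59.4, 55.5],
    "salary": [270000.0, 200000.0, 250000.0, 0.0, 425000.0],
})

# Correlate the numeric columns only, take the salary column of the matrix,
# drop salary's correlation with itself, and sort from strongest positive down.
corr_with_salary = (
    dataset.select_dtypes("number")
    .corr()["salary"]
    .drop("salary")
    .sort_values(ascending=False)
)
print(corr_with_salary)
```

The first entry of the sorted result is the column with the highest positive correlation with salary.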

13. Plot any useful graph and explain it.

Thank You!

That's about it for this article.

I am always interested and eager to connect with like-minded people and explore new opportunities. Feel free to follow, connect, and interact with me on LinkedIn, Twitter, and YouTube. You can also reach out to me on my social media handles. I am here to help you. Ask me any doubts regarding AI and your career.

Wishing you good health and a prosperous journey into the world of AI!

Best regards,

Heerthi Raja H
