Quick question from data science and machine learning interview | Part 5

Onurdesk

Spring, Java, Node.js tutorials at Onurdesk

Published: November 1, 2024

1. Explain Gradient Descent algorithm.

Ans. Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient. In machine learning, we use gradient descent to update the parameters of our model. Parameters refer to coefficients in Linear Regression and weights in neural networks.
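As a minimal sketch of the update rule described above, the following fits a one-variable linear regression by gradient descent on the mean squared error. The function name `gradient_descent`, the learning rate `lr`, the step count, and the toy data are illustrative assumptions, not part of the original answer.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by minimizing mean squared error (illustrative sketch)."""
    w, b = 0.0, 0.0  # parameters: the coefficients of the linear model
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Move against the gradient, i.e. in the direction of steepest descent.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (assumed for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = gradient_descent(xs, ys)
print(w, b)
```

On this noise-free data the parameters should settle close to w = 2 and b = 1. A learning rate that is too large can make the updates diverge, so `lr` usually needs tuning for each problem.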

2. Why is logistic regression used for classification instead of linear regression?

Ans. With linear regression, we could threshold the output: predictions >= 0.5 are treated as 1 and predictions < 0.5 as 0. So why can't classification be performed this way? Suppose we are classifying a mail as spam or not spam, and our output y can be 0 (spam) or 1 (not spam). In linear regression, hθ(x) is unbounded, so it can be > 1 or < 0, even though a prediction that is read as a probability should lie between 0 and 1. Logistic regression passes the linear output through the sigmoid function, which maps any real value into the range (0, 1). That is why logistic (sigmoid) regression is used for classification tasks.
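To make the range problem concrete, here is a small sketch comparing a raw linear output with the same output passed through the sigmoid. The helper names `linear_output` and `sigmoid`, and the weights `w = 0.5`, `b = 0.0`, are hypothetical values chosen only for illustration.

```python
import math


def linear_output(x, w=0.5, b=0.0):
    """Raw linear model output; it is unbounded."""
    return w * x + b


def sigmoid(z):
    """Map a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


for x in (-4.0, 0.0, 4.0):
    z = linear_output(x)
    # z can fall below 0 or above 1, while sigmoid(z) always stays in (0, 1).
    print(f"x={x:5.1f}  linear={z:5.2f}  sigmoid={sigmoid(z):.3f}")
```

The sigmoid output can then be thresholded at 0.5 to obtain a class label.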

3. What is the Gini Index?

Ans. The Gini index (Gini impurity) is a score that measures how mixed the classes are within the groups produced by a split. It is 0 when all observations in a group belong to one class, and it increases as the classes become more evenly mixed. For k classes the maximum is 1 - 1/k (0.5 for two classes), so the score approaches 1 only when there are many classes. We therefore want the Gini index to be as low as possible. In a decision tree, the Gini index is the criterion used to evaluate candidate splits.
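As a rough sketch, the Gini index of a group can be computed as 1 minus the sum of squared class proportions, and a split can be scored by the size-weighted average over its groups. The function names `gini_impurity` and `weighted_gini` and the toy labels are assumptions made for this example.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of one group: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini(left, right):
    """Score a binary split as the size-weighted average impurity of its two groups."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


pure_split = weighted_gini(["spam", "spam"], ["ham", "ham"])
mixed_split = weighted_gini(["spam", "ham"], ["spam", "ham"])
print(pure_split, mixed_split)
```

A split that separates the classes perfectly scores 0, while one that leaves both groups evenly mixed between two classes scores 0.5, so the tree would prefer the first split.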

4. Why is DBSCAN used over K-means and other clustering methods?

Ans. Partitioning methods (K-means, PAM clustering) and hierarchical clustering work for finding spherical-shaped clusters or convex clusters. In other words, they are suitable only for compact and well-separated clusters. Moreover, they are also severely affected by the presence of noise and outliers in the data.

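Real-life data may contain irregularities, such as:

Clusters can be of arbitrary shape, such as non-convex clusters.
Data may contain noise.

Given such data, the k-means algorithm has difficulty identifying clusters with arbitrary shapes. DBSCAN groups points by density instead, so it can find clusters of arbitrary shape and mark points in low-density regions as noise.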
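The following is a simplified, pure-Python sketch of the density-based idea behind DBSCAN, written only to illustrate the point above; it is not an optimized or library implementation. The helper names `region_query` and `dbscan`, the parameter values `eps=1.5` and `min_pts=3`, and the toy points (an L-shaped cluster, a small blob, and one outlier) are all assumptions chosen for the example.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE. O(n^2) sketch."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be re-labelled later as a border point
            continue
        labels[i] = cluster  # i is a core point: start a new cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
        cluster += 1
    return labels


# Non-convex L-shaped cluster, a compact blob, and one isolated outlier.
l_shape = [(float(x), 0.0) for x in range(6)] + [(0.0, float(y)) for y in range(1, 6)]
blob = [(3.0, 3.0), (3.0, 4.0), (4.0, 3.0), (4.0, 4.0)]
outlier = [(10.0, 10.0)]
labels = dbscan(l_shape + blob + outlier, eps=1.5, min_pts=3)
print(labels)
```

In this sketch the whole L shape ends up in one cluster because neighbouring points chain together by density, the blob forms a second cluster, and the isolated point is labelled as noise. Note that no number of clusters is specified in advance, unlike k-means.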

ENJOY LEARNING
