Statistics for Data Science — Basic Statistics
Statistics is a foundational component of data science, providing powerful tools and techniques for analyzing and interpreting data. Data scientists rely on statistical techniques to extract meaningful insights from large and complex data sets and identify patterns and trends that can contribute to informed business decisions. With solid statistical understanding, a data scientist can better understand the behavior of the data.
In this newsletter series, we will cover everything from foundational theories to advanced analytical techniques and explore their real-world application. This series helps you to build a strong statistical understanding for data science.
What is Statistics?
Statistics is the branch of applied mathematics that deals with the collection, organization, analysis, interpretation, and presentation of data.
Example:
Some Key Definitions:
Data: Data is any information or set of facts that has been recorded. Example: age, weight, etc.
Population: A population is the collection of all items or individuals of interest to our study. Example: All students in a class.
Types of populations: The population can be classified according to the number of individuals that make it up:
Sample: A sample is a subset of the population used to draw conclusions about the population. Example: Some students in a class.
Parameter: A parameter is a number that describes a property of an entire population.
Statistic: A statistic is a number that describes a property of a sample.
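To make the difference between a parameter and a statistic concrete, here is a minimal Python sketch. The class ages, the sample size of 4, and the random seed are all made-up values for illustration; only the standard library is used.

```python
import random
import statistics

# Hypothetical population: ages of all 10 students in a small class.
population = [18, 19, 19, 20, 20, 21, 21, 22, 23, 24]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Sample: a random subset of the population (seeded so the run is repeatable).
rng = random.Random(0)
sample = rng.sample(population, k=4)

# Statistic: computed from the sample only; it estimates the parameter.
sample_mean = statistics.mean(sample)

print("Population mean (parameter):", population_mean)
print("Sample:", sample)
print("Sample mean (statistic):", sample_mean)
```

The sample mean will usually be close to, but not exactly equal to, the population mean. That gap is the reason inferential methods exist.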
Variable: In statistics, a variable is a characteristic that can be counted or measured and whose value can change from one individual to another.
Example: age, length, height, etc.
Types of Variable: According to whether a variable takes numerical or non-numerical values, it can be classified into two categories:
Qualitative Variable: Qualitative variables, also known as categorical variables, describe qualities or characteristics.
Example: Color of a car, gender of a patient, size of an industry, etc.
Quantitative Variable: Quantitative variables, also known as numerical variables, represent quantities or amounts.
Example: Number of children in a family, weight of a man, etc.
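The two kinds of variable are usually summarized in different ways. The sketch below assumes a tiny, made-up list of car records (the field names `color` and `weight_kg` are hypothetical): the qualitative variable is summarized by counting categories, and the quantitative variable by averaging.

```python
from collections import Counter
import statistics

# Hypothetical records; field names and values are illustrative only.
cars = [
    {"color": "red", "weight_kg": 1200},
    {"color": "blue", "weight_kg": 1350},
    {"color": "red", "weight_kg": 1100},
]

# Qualitative variable: summarize by counting how often each category occurs.
color_counts = Counter(car["color"] for car in cars)

# Quantitative variable: arithmetic such as the mean is meaningful.
mean_weight = statistics.mean(car["weight_kg"] for car in cars)

print("Color counts:", dict(color_counts))
print("Mean weight (kg):", round(mean_weight, 1))
```

Note that a value stored as a number is not automatically quantitative. A postal code, for instance, is a label rather than an amount.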
Scale of Measurement: There are four types of scale as follows:
Nominal Scale: The nominal scale is the simplest form of measurement. It classifies and identifies a qualitative variable by assigning each observation to a category or group, with no order implied among the categories.
Examples:
Ordinal Scale: The ordinal scale is a type of measurement where data is organized into a specific order or ranking. However, while you can tell which item is higher or lower in the order, the exact difference between the ranks isn’t consistent or precisely measurable.
Examples:
Interval Scale: The interval scale not only allows for ordering of data but also provides meaningful and equal intervals between data points.
Examples:
Interval data allows for addition and subtraction, but since there is no absolute zero, multiplication and division do not apply. For instance, 20°C is not “twice as warm” as 10°C.
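The temperature example can be checked with a few lines of Python. This is only a sketch: the conversion to Kelvin is added here for illustration, because Kelvin has a true zero and so ratios of Kelvin temperatures are meaningful.

```python
# Celsius is an interval scale: differences are meaningful, ratios are not.
temp_a_c = 10.0
temp_b_c = 20.0

difference_c = temp_b_c - temp_a_c   # valid: 20°C is 10 degrees warmer than 10°C
naive_ratio = temp_b_c / temp_a_c    # 2.0, but physically misleading


def celsius_to_kelvin(celsius):
    """Convert to Kelvin, a scale with an absolute zero."""
    return celsius + 273.15


# On a scale with a true zero, the ratio is meaningful.
true_ratio = celsius_to_kelvin(temp_b_c) / celsius_to_kelvin(temp_a_c)

print("Difference in °C:", difference_c)
print("Naive Celsius ratio:", naive_ratio)
print("Ratio in Kelvin:", round(true_ratio, 3))
```

The Kelvin ratio comes out at roughly 1.035, so 20°C is only about 3.5% "warmer" than 10°C in absolute terms, not twice as warm.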
Ratio Scale: The ratio scale is the most informative and robust scale of measurement. It has all the properties of the interval scale, but it also includes an absolute zero point, which allows for the calculation of ratios.
Examples:
Types of statistics: There are two types of Statistics as follows:
Descriptive Statistics: Descriptive statistics are methods for describing and summarizing data. They present the data in a meaningful and manageable form, helping you understand what the data shows at a glance.
Key Components of Descriptive Statistics:
Measures of Central Tendency: These are the values that represent the center or typical value of the data set.
Measures of Dispersion (Variability): These metrics show how spread out the data is.
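As a quick illustration, the following sketch computes some commonly used measures of each kind with Python's standard `statistics` module. The exam scores are made up, and the choice of measures (mean, median, mode, range, variance, standard deviation) is an assumption about which ones are meant here.

```python
import statistics

# Hypothetical data set: exam scores.
scores = [70, 85, 85, 90, 95]

# Measures of central tendency
mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value when sorted
mode_score = statistics.mode(scores)      # most frequent value

# Measures of dispersion
score_range = max(scores) - min(scores)        # largest minus smallest
sample_variance = statistics.variance(scores)  # sample variance (divides by n - 1)
sample_std_dev = statistics.stdev(scores)      # square root of the sample variance

print("Mean:", mean_score, "Median:", median_score, "Mode:", mode_score)
print("Range:", score_range)
print("Sample variance:", sample_variance)
print("Sample standard deviation:", round(sample_std_dev, 2))
```

`statistics.variance` and `statistics.stdev` treat the data as a sample. If the data is the whole population, `statistics.pvariance` and `statistics.pstdev` are the matching functions.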
Inferential Statistics: Inferential statistics are methods for drawing conclusions and making predictions about a population based on a sample of data.
Key Components of Inferential Statistics:
Null Hypothesis (H0): The hypothesis that there is no effect or difference.
Alternative Hypothesis (H1): The hypothesis that there is an effect or difference.
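To show how the two hypotheses are used, here is a minimal sketch of a two-sided one-sample z-test. Everything specific in it is an assumption for illustration: the hypothesized mean of 100, the known population standard deviation of 15, the sample values, and the 0.05 significance level. A z-test is only one of several possible tests.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical setup: H0 says the population mean is 100; H1 says it is not.
# The population standard deviation (15) is assumed known for this sketch.
mu_0 = 100.0
sigma = 15.0
sample = [112, 98, 105, 120, 101, 109, 117, 95, 108, 115]

n = len(sample)
sample_mean = statistics.mean(sample)

# z statistic: how many standard errors the sample mean lies from mu_0.
z = (sample_mean - mu_0) / (sigma / math.sqrt(n))

# Two-sided p-value under the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # significance level, chosen here by convention
reject_h0 = p_value < alpha

print("Sample mean:", sample_mean)
print("z statistic:", round(z, 3))
print("p-value:", round(p_value, 3))
print("Reject H0:", reject_h0)
```

With these made-up numbers the p-value is above 0.05, so the sample does not give enough evidence to reject H0. That is not the same as proving H0 true.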
Thanks for reading.
“Your Network is your Networth” - Tim Sanders
Connect on LinkedIn : https://www.dhirubhai.net/in/md-sawrab/
Github: https://github.com/md-sawrab
Data Scientist | Bridging the Gap Between Data & Business Strategy | Experienced in Python, R, & SQL