Understanding key concepts of Statistics
The key question for Artificial Intelligence is 'Can machines think?'
To understand machine learning algorithms, the basic concepts required come from data analytics.
R and MATLAB are the programming languages one needs to know for data analytics.
Descriptive Statistics and Exploratory Data Analysis
- How we describe data
- Concerned with visualizing data graphically
- Different ways of representing data
1. Measures of central tendency
- Summarizing the data
Understanding of Mean, Median and Mode.
2. Measures of dispersion
- Inherent variability
Understanding of Variance or standard deviation
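The two groups of measures above can be computed in a few lines. This is a minimal sketch in Python using only the standard `statistics` module; the height values are made up for illustration and do not come from the lecture.

```python
import statistics

# Illustrative data only (not from the source): heights in cm.
heights = [160, 165, 165, 170, 172, 175, 180]

# Measures of central tendency: summarize the data with one typical value.
mean_height = statistics.mean(heights)      # arithmetic average
median_height = statistics.median(heights)  # middle value of the sorted data
mode_height = statistics.mode(heights)      # most frequent value

# Measures of dispersion: how much the data varies around the mean.
# pvariance/pstdev treat the list as the whole population;
# variance/stdev (dividing by n - 1) would treat it as a sample.
variance = statistics.pvariance(heights)
std_dev = statistics.pstdev(heights)

print("mean:", mean_height)
print("median:", median_height)
print("mode:", mode_height)
print("variance:", variance)
print("standard deviation:", std_dev)
```

The standard deviation is the square root of the variance, so it is expressed in the same unit as the data (cm here), which usually makes it easier to interpret.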
A probability distribution is the richest way to express the data: it gives a complete characterization of the data set.
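As a rough illustration of that idea, the sketch below builds an empirical distribution (the relative frequency of each observed value) from a small data set. The data and the variable names are assumptions made for this example, not part of the lecture.

```python
from collections import Counter

# Illustrative data only: number of courses taken by 10 hypothetical students.
courses = [3, 4, 4, 5, 3, 4, 5, 4, 3, 4]

counts = Counter(courses)
n = len(courses)

# Empirical probability of each value = its relative frequency in the data.
empirical_pmf = {value: count / n for value, count in sorted(counts.items())}

for value, probability in empirical_pmf.items():
    print(f"P(X = {value}) = {probability:.2f}")

# Summary measures such as the mean can be recovered from the distribution,
# but the distribution cannot be recovered from the mean alone.
mean_from_pmf = sum(value * p for value, p in empirical_pmf.items())
print("mean from distribution:", mean_from_pmf)
```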
Inferential Statistics
Deals with populations and with samples drawn from that broader universe.
CASE: We have a finite population and define its state space.
Population: the height of every single student of a university
Sample: for instance, 30 people chosen at random from the population
We compute some output from the sample (mean/median) and compare it with a hypothesized (probable) value.
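The population and sample example can be sketched as follows. Everything numeric here is an assumption for illustration: the population is simulated rather than measured, and 170 cm is an arbitrary hypothesized value. The comparison uses a one-sample t statistic, which is one common way to do it; the lecture notes do not name a specific test.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical finite population: simulated heights (cm) of 5,000 students.
population = [random.gauss(170, 8) for _ in range(5000)]

# Sample: 30 students chosen at random, without replacement.
sample = random.sample(population, 30)

sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1)

# Compare the sample mean with a hypothesized value of the population mean.
hypothesized_mean = 170
standard_error = sample_sd / math.sqrt(len(sample))
t_stat = (sample_mean - hypothesized_mean) / standard_error

print("population mean:", statistics.mean(population))
print("sample mean:", sample_mean)
print("t statistic:", t_stat)
```

A t statistic far from zero suggests the hypothesized value is not a plausible population mean; how far counts as "far" depends on the chosen significance level and the t distribution with n - 1 degrees of freedom.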
ANOVA (analysis of variance) is used to compare the means of several groups by analyzing their variance.
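A minimal sketch of a one-way ANOVA, computed by hand, may make this concrete. The three groups and their values are hypothetical; the code only computes the F statistic and leaves out the p-value, which would need the F distribution.

```python
import statistics

# Illustrative data only: one measurement series per hypothetical group.
groups = {
    "A": [4, 5, 6],
    "B": [6, 7, 8],
    "C": [8, 9, 10],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = statistics.mean(all_values)
k = len(groups)       # number of groups
n = len(all_values)   # total number of observations

# Variation between groups: how far each group mean is from the grand mean.
ss_between = sum(
    len(values) * (statistics.mean(values) - grand_mean) ** 2
    for values in groups.values()
)

# Variation within groups: how far each observation is from its group mean.
ss_within = sum(
    (x - statistics.mean(values)) ** 2
    for values in groups.values()
    for x in values
)

ms_between = ss_between / (k - 1)  # mean square between groups
ms_within = ss_within / (n - k)    # mean square within groups
f_stat = ms_between / ms_within

print("F statistic:", f_stat)
```

A large F statistic means the group means differ by more than the variation inside the groups would explain on its own.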
Regression
Regression analysis draws on inferential statistics and many other modules, and it is at the essence of machine learning.
The ordinary least squares (OLS) technique is used for regression.
It is about establishing a relationship between the dependent variable (output variable) and the independent variable (input variable).
Independent variable: rainfall in a particular rural region, measured annually
Dependent variable: crop yield
We have a dataset of inputs and outputs. The core idea is to establish a relationship between these two factors. We can fit a line through the data to represent the relationship.
Finally, prediction is the key idea. For a given value of the independent variable, we need to predict the dependent variable.
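The rainfall and crop yield example can be sketched with a simple ordinary least squares fit. The numbers below are invented for illustration, and `fit_line` and `predict` are hypothetical helper names introduced for this sketch. The slope is the covariance of input and output divided by the variance of the input, and the intercept makes the line pass through the point of means.

```python
import statistics

# Illustrative data only: annual rainfall (mm) and crop yield (tonnes/hectare).
rainfall = [600, 700, 800, 900, 1000]
crop_yield = [2.0, 2.4, 2.9, 3.1, 3.6]


def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_mean = statistics.mean(x)
    y_mean = statistics.mean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(x_new, slope, intercept):
    """Predict the dependent variable for a new value of the input."""
    return intercept + slope * x_new


slope, intercept = fit_line(rainfall, crop_yield)
predicted = predict(850, slope, intercept)

print("slope:", slope)
print("intercept:", intercept)
print("predicted yield at 850 mm:", predicted)
```

The fitted line is only trustworthy inside the range of rainfall values seen in the data; predicting far outside that range assumes the linear relationship keeps holding.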
______________________________________________________________
Reference: nptel.iitm.ac.in