Difference Between Statistics and Machine Learning


Statistics and machine learning share a goal: learning from data. Both use data to improve decision making, and the two terms are sometimes even used interchangeably. Broadly, machine learning looks for generalizable predictive patterns, while statistics draws inferences about a population from a sample. The boundary between the two is subject to debate. Some methods are associated mainly with one field, while many are used in both. For example, time series analysis has traditionally been studied mostly in statistics, while linear regression is used in both. The fields share a lot today, although they started out quite differently.

Statistics developed long before machine learning. It was already a well-established discipline by the 1920s, and one of the most vital contributions came from R. A. Fisher, who established maximum likelihood estimation (MLE) as a standard tool for statistical inference. Machine learning is built upon statistics and borrows heavily from it: machine learning works with data, and data has to be described in statistical terms. Much of machine learning rests on statistical learning theory.
Classical statistics and machine learning differ in computational tractability as the number of data points or variables per subject rises. Machine learning refers to algorithms that learn from data rather than following explicitly hand-coded rules. The term dates from 1959, while the roots of statistics go back to the 17th century. Machine learning models can often capture more detailed patterns while making fewer assumptions than classical statistical models. Statistics formalizes the relationships between variables as mathematical equations.
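
To make the maximum likelihood estimation mentioned above a little more concrete, here is a minimal sketch that assumes the data come from a normal distribution. The function names and the synthetic sample are illustrative assumptions, not something taken from Fisher's work or any particular library. For a normal model the MLE has a closed form: the sample mean, and the variance computed by dividing by n.

```python
import math
import random

def normal_log_likelihood(data, mu, sigma2):
    """Log-likelihood of i.i.d. normal data with mean mu and variance sigma2."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)

def normal_mle(data):
    """Closed-form maximum likelihood estimates for a normal model."""
    n = len(data)
    mu_hat = sum(data) / n
    # The MLE of the variance divides by n, not n - 1.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, sigma2_hat

# Synthetic sample (an assumption for illustration): true mean 5, true sd 2.
random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(500)]

mu_hat, sigma2_hat = normal_mle(sample)
best = normal_log_likelihood(sample, mu_hat, sigma2_hat)

# Moving either estimate away from the MLE should not raise the likelihood.
shifted_mean = normal_log_likelihood(sample, mu_hat + 0.1, sigma2_hat)
inflated_var = normal_log_likelihood(sample, mu_hat, sigma2_hat * 1.1)
print(mu_hat, sigma2_hat, best >= shifted_mean, best >= inflated_var)
```

The estimates should land near the true mean and variance used to generate the sample, and any nearby parameter values give a lower (or equal) log-likelihood.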

The Actual Difference Between Statistics and Machine Learning

Machine learning tends to favor a black-box approach, where the goal is to reproduce input/output pairs from past observations, while the statistical approach opens the black box and models the relationship between the variables explicitly. Statistics is concerned with finite-sample analysis, model misspecification, and computational considerations; machine learning, for its part, inherits probabilistic modeling. Between the two approaches lie the data and statistical learning theory, which both share. Loosely speaking, statistical modeling is often described as the generative approach to statistical learning, and machine learning as the discriminative one.
Perhaps the biggest difference between these two fields is their inclination, i.e. they emphasize different things. Although they use similar techniques and tools, they differ philosophically on how and when those techniques should be used. Machine learning is more focused on building software systems that can make predictions, so it gives more emphasis to software engineering. It is often said that machine learning was developed by computer scientists because they needed a way to build computer systems that learn from data and make predictions. Statistics is more of a mathematical discipline. Methods like linear regression were adopted by machine learning from statistical modeling.
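
As a rough illustration of this difference in emphasis, the sketch below fits one linear regression and then asks two different questions of it. All names and the synthetic data are assumptions made for the example. The confidence interval uses a large-sample normal approximation (z = 1.96) instead of the exact t quantile, to keep the code within the standard library.

```python
import math
import random

def fit_simple_ols(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def slope_confidence_interval(xs, ys, intercept, slope, z=1.96):
    """Approximate 95% interval for the slope (normal approximation)."""
    n = len(xs)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope - z * std_error, slope + z * std_error

def mean_squared_error(xs, ys, intercept, slope):
    """Average squared prediction error on the given points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (an assumption for illustration): true slope 2, noise sd 1.
random.seed(1)
xs = [random.uniform(0, 10) for _ in range(400)]
ys = [3.0 + 2.0 * x + random.gauss(0, 1.0) for x in xs]

train_x, test_x = xs[:300], xs[300:]
train_y, test_y = ys[:300], ys[300:]
intercept, slope = fit_simple_ols(train_x, train_y)

# Statistical question: how uncertain is the estimated slope?
ci_low, ci_high = slope_confidence_interval(train_x, train_y, intercept, slope)

# Machine learning question: how well does the fit predict unseen data?
test_mse = mean_squared_error(test_x, test_y, intercept, slope)
print(slope, (ci_low, ci_high), test_mse)
```

The fitted line is the same in both cases; what differs is whether the result of interest is the interval around the coefficient or the error on held-out data.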

In a blog post, Larry Wasserman discussed this very topic (statistics versus machine learning). If you don't know him, he is a professor in both the Department of Statistics and the Machine Learning Department at Carnegie Mellon, one of the premier universities with a department devoted to machine learning. He has written many blog posts on machine learning and statistics. I will try to summarize that post. He notes that statistics emphasizes formal statistical inference (optimal estimators, confidence intervals, hypothesis tests) in low-dimensional problems, while machine learning leans toward high-dimensional prediction problems. But this is just a simplification. Some topics that receive more attention in one field than in the other are:

Statistics: spatial analysis, minimax theory, semiparametric inference, time series, survival analysis, multiple testing, deconvolution, bootstrapping, etc.

Machine Learning: active learning, boosting, online learning, semi-supervised learning, manifold learning, etc.

There is a lot of overlap between the two. For example, reproducing kernel Hilbert space (RKHS) methods are popular in machine learning but originated in statistics. Similarly, concentration of measure, convex optimization, and sparsity are all highly active in both disciplines. Online learning also has its roots in statistics.

Prof. Rob Tibshirani, one of the authors of the excellent book An Introduction to Statistical Learning, compiled a list of terms with similar meanings in the two fields in his paper Several major terms in machine learning vs Statistics.

Some naming conventions with similar meaning in both fields are:

Figure: Terminologies with similar meaning in Statistics and Machine Learning
These words can be used interchangeably. For example, if you take Andrew Ng's courses, you will find that he usually uses the term parameter, which is more of a statistical term, instead of weight, even though he is renowned for his contributions to machine learning.
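
One way to see that "parameters" and "weights" name the same quantities is to fit the same line twice, once in each vocabulary. The sketch below is only an illustration under assumed names and synthetic data: a closed-form least-squares estimate, as a statistician might describe it, next to gradient descent on the mean squared error, as a machine learning course might. The learning rate and step count are arbitrary choices that happen to converge for this data.

```python
import random

def estimate_parameters(xs, ys):
    """Statistics vocabulary: closed-form least-squares parameter estimates."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

def learn_weights(xs, ys, learning_rate=0.1, steps=2000):
    """Machine learning vocabulary: gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [weight * x + bias - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return bias, weight

# Synthetic data (an assumption for illustration).
random.seed(2)
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [0.5 + 1.5 * x + random.gauss(0, 0.3) for x in xs]

intercept, slope = estimate_parameters(xs, ys)
bias, weight = learn_weights(xs, ys)
print(intercept, slope)
print(bias, weight)
```

Both procedures minimize the same squared-error criterion, so the "weight" and "bias" should agree with the "slope" and "intercept" up to numerical precision.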

Now, let's talk about the tools these fields use. You are more likely to see Python and MATLAB in machine learning, along with other languages such as Java and C++. R is more popular among statisticians, who also tend to lean more heavily on mathematical methodology.

Some of the differences between Statistics and Machine Learning are:

  • Machine learning was developed largely by computer scientists, while statistics was developed largely by mathematicians.
  • Machine learning is built upon statistical frameworks.
  • Statistics dates back to the 17th century; the term machine learning dates from 1959.
  • Machine learning is a subfield of artificial intelligence. Statistics is usually regarded as a branch of mathematics.
  • Machine learning finds generalizable predictive patterns, while statistics draws population inferences from a sample.
  • Machine learning tends toward a black-box approach. Statistics opens the black box.
  • Machine learning typically needs large amounts of data and many attributes, while statistics can work with less.
  • Statistics requires mathematical knowledge. Machine learning requires knowledge of both mathematics and algorithms.
  • Statistics studies relationships such as correlation between data points, while machine learning is used to learn a hypothesis (a predictive function) from data.
  • Machine learning generally makes fewer assumptions than statistics.
  • Machine learning often has more predictive power.
  • Machine learning often requires less human effort than statistics.
  • Machine learning emphasizes algorithms. Statistics emphasizes equations.
  • They use different tools.

To conclude, machine learning and statistics are essentially equivalent but are practiced by different communities. These communities now collaborate and share ideas, with machine learning giving more weight to the computational side. The fields are interrelated: the rise of machine learning has been very beneficial to statistics, just as machine learning has benefited a great deal from statistics.


  1. I praise your exercise in trying to draw a clear line between statistics and machine learning. But I believe the effort is useless. Non-parametric statistics (black box approaches) existed much earlier than anyone ever talked about machine learning. The issue is really that more data and more (cheap) computational power is available now than in the past. Because of that, a lot of new (non-parametric) approaches popped up during the last 3 decades to benefit from that. There are now more approaches for data analysis and prediction. Selecting the most promising approach requires a lot of experience with available techniques and computational tools. And this experience is not mastered easily by only attending a couple of online courses called Statistics and/or Machine Learning and/or Data Science at Coursera, Udemy or similar platforms. Some might underestimate the role of domain specific knowledge in solving a problem, and that is, imho, the source of important failures in prediction or inference in many cases.

  2. Thank you Sir!
    Pleased to hear your opinion.
    I wish I could agree with you.
