Tutorial 01: Getting Started With Statistics
An intuitive introduction to statistics
Overview
-
Statistics is often hailed as one of the most useful areas of mathematics.
-
It helps us make educated guesses about the unknown and find useful information in an ocean of data. But despite its usefulness, many people struggle with statistics: What is it? How does it work? Is it useful?
-
For many, statistics feels like an unending collection of rules and formulae. If it is misapplied, statistics can lead to false conclusions, causing people to develop mistrust in the subject.
-
In this tutorial, we will get to know what statistics is, why we need it, and how it can be useful.
What is Statistics?
Defining The Term
Statistics is the science of collecting, organising, analysing, interpreting and presenting data.
-
Let's now break down the definition to understand its meaning.
- The most important thing is: it's all about data. Without data, statistics cannot exist.
-
So the first step is to collect data. There are many ways to collect data, such as conducting a survey or performing experiments and recording the results.
-
Once we have the data, the next step is to organise it, i.e. to put the data in some structured order, such as a table.
-
Now that we have our data in an organised format, we analyse it. Here we compute summary measures such as the mean, median and mode, or apply other, more advanced techniques, to get a sense of what the data is all about.
-
Once the data is analysed, we interpret it, i.e. we ask what the data tells us. One very intuitive way to interpret the data is to visualise it, presenting it with graphs or charts.
-
So, in layman's terms, statistics is the science of learning from data.
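The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the survey ages are made-up values, and using Python's standard `statistics` module is a choice made for this example, not something the tutorial prescribes.

```python
import statistics

# Collect: hypothetical ages recorded in a small survey (made-up values).
ages = [21, 19, 25, 19, 29, 22, 19]

# Organise: put the raw data in a structured order.
sorted_ages = sorted(ages)

# Analyse: compute simple summary measures.
summary = {
    "mean": statistics.mean(sorted_ages),
    "median": statistics.median(sorted_ages),
    "mode": statistics.mode(sorted_ages),
}

# Interpret and present: report what the numbers say about the group.
for name, value in summary.items():
    print(f"{name}: {value}")
```

Each summary measure answers a slightly different question about the same data: the mean gives the typical value by averaging, the median gives the middle value, and the mode gives the most common value.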
Statistical & Non-statistical Questions
-
In layman's terms, questions where we need statistics to reach a conclusion are known as statistical questions, and questions where we don't are known as non-statistical questions.
-
Let's first look at some examples, and then we will form a general definition.
-
Example 01: How old are you?
-
Here, we are talking about how old a particular person is. We don't need any statistical tools to answer this question. We can just ask the person their age.
-
So this is a non-statistical question.
-
-
Example 02: How old are the people who watched a particular YouTube video in 2020?
-
Here, we assume that many people watched the video in 2020 and that they are not all the same age. There is going to be variability in their ages. One viewer might be 10 years old, another might be 20, and so on.
-
What answer do we give here? We want to get a sense of how old the viewers are in general. This is where statistics becomes valuable: we might want to find, let's say, the average age of the viewers.
-
So this is a statistical question.
-
-
Example 03: Do wolves weigh more than dogs?
-
Once again, there is variability in the weights of dogs and wolves. Some dogs are light and some are heavy, and the same goes for wolves.
-
Since there is variability within each category, we might want to use statistics to answer this question, for example by finding the average weight of each group and then comparing the two averages.
-
So, this is a statistical question.
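As a rough sketch of the "find the averages, then compare them" idea from this example, the snippet below uses Python's standard `statistics` module. The weights are hypothetical, made-up numbers chosen only for illustration, and the helper name `average_weight` is an assumption of this sketch.

```python
import statistics

# Hypothetical weights in kilograms (made-up values for illustration only).
dog_weights_kg = [8.0, 22.5, 30.0, 12.5, 27.0]
wolf_weights_kg = [36.0, 45.5, 40.0, 52.0, 38.5]


def average_weight(weights):
    """Return the arithmetic mean of a list of weights."""
    return statistics.mean(weights)


dog_average = average_weight(dog_weights_kg)
wolf_average = average_weight(wolf_weights_kg)

# Individual animals vary, so we compare the group averages instead.
wolves_heavier_on_average = wolf_average > dog_average

print(f"Average dog weight:  {dog_average:.1f} kg")
print(f"Average wolf weight: {wolf_average:.1f} kg")
print(f"Wolves heavier on average: {wolves_heavier_on_average}")
```

Note that a particular dog can still outweigh a particular wolf; the averages only describe each group as a whole.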
-
-
Example 04: What was the difference in rainfall between Singapore and Seattle in 2020?
-
Now, these two numbers are known and can be measured. The rainfall in Singapore can be measured, as well as in Seattle, and then we can find the difference.
-
So, we don't need statistics here; hence, it's a non-statistical question.
-
-
-
Here is the pattern: in non-statistical questions, we are asking about a particular individual or a single known quantity. There is only one answer, so there is no variability in it.
-
In statistical questions, we are asking about a group of individuals, and there is variability in the answers. So, we need statistics to summarise the data set and draw conclusions from it.
A statistical question is one that can be answered by collecting data and where there will be variability in that data.
Where Do We Use Statistics?
-
Let's start by seeing where statistics is involved in our everyday lives, and then we will see how it plays a significant role in many different fields.
-
One familiar example is how we manage our yearly or monthly budget. We ask how much money we spent last year, what our average spending was, which area we spent the most on, and how much we saved. Statistics and data are what let us answer these questions and test out different ways to save a little more money.
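The budget questions above come down to simple arithmetic on recorded spending. Here is a minimal sketch in Python; the categories, amounts, and the fixed `monthly_income` are all hypothetical values assumed for this example.

```python
import statistics

# Hypothetical monthly spending by category (made-up numbers).
monthly_spending = {
    "January": {"rent": 800, "food": 300, "transport": 100},
    "February": {"rent": 800, "food": 260, "transport": 90},
    "March": {"rent": 800, "food": 340, "transport": 110},
}
monthly_income = 1500  # assumed fixed income per month

# How much did we spend each month, and on average?
totals = {month: sum(items.values()) for month, items in monthly_spending.items()}
average_spent = statistics.mean(list(totals.values()))

# In which area did we spend the most?
category_totals = {}
for items in monthly_spending.values():
    for category, amount in items.items():
        category_totals[category] = category_totals.get(category, 0) + amount
top_category = max(category_totals, key=category_totals.get)

# How much money did we save over the whole period?
total_saved = monthly_income * len(monthly_spending) - sum(totals.values())

print(f"Average monthly spending: {average_spent}")
print(f"Largest spending category: {top_category}")
print(f"Total saved: {total_saved}")
```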
-
Other examples in our daily lives can be:
-
We can use statistics to see how we are performing in our exams.
-
It can help us see whether we are living a healthy life (by looking at what we eat, how much we exercise, and what our junk-food intake is).
-
We use statistics when we look at the weather forecast and decide what to wear.
-
-
Other areas that use statistics heavily are:
-
Netflix uses it to predict what show we might want to watch next.
-
Amazon uses it to recommend different products.
-
Governments use it to decide whether to invest more in child education or in another sector.
-
Statistics plays a very significant role in medical studies (from drug development to cancer treatment).
-
Stock market analysts use statistical techniques for stock analysis.
-
-
So, in one way or another, statistics plays a vital role in our lives, from solving our family budget problems to the diagnosis and treatment of deadly diseases to helping businesses increase their profits. It helps us understand the world a little better through numbers and other quantitative information.
Important: Statistics is a building block of data science. It is a necessary skill for any data scientist or analyst. Data scientists and analysts use statistics to find meaningful trends in data, which is only possible if one knows statistics and how to use it.
How Is Statistics Useful?
-
We are living in a world full of data, and the amount is increasing day by day. Some form of data is present in almost every field, and to understand the meaning behind the data, you need statistics.
-
Statistical knowledge helps you use the proper methods to collect the data, employ correct analysis, and effectively present the results.
-
It is a crucial process behind how we make discoveries, make decisions based on the data and make predictions.
-
It also allows us to understand a subject (be it medical science or sports) much more deeply.
-
So, to conclude, no matter what field we are in, our work will involve data in some form or other, and knowing statistics will help us understand our problems a little better.
Questionnaire
Ques 01. Classify the following questions as statistical or non-statistical. Also, explain your reasoning.
1.1. Do dogs run faster than cats?
1.2. Does your dog weigh more than that wolf?
1.3. Does it rain more in Seattle than Singapore?
1.4. In general, will I use less gas driving at 40 km/h than at 50 km/h?
1.5. Do English professors get paid less than math professors?
1.6. Does the most highly paid English professor at Harvard get paid more than the most highly paid math professor at MIT in 2020?
Ques 02. Suppose you are given data on the heights of the students in a mathematics class. Write down a statistical question you can answer using that data.
Ques 03. Can you think of any other way (besides the examples listed in the notes above) you can use statistics in your daily life?
Ques 04. Your college football (or any sport you like) coach wants the team to perform better in the upcoming tournament and asks for your help. Is there any way you can use statistics to help the coach (given that you have data on the team's past performances)? If yes, then how?
Notes are compiled from MySecretMathTutor, Khan Academy, Anywhere Math, CrashCourse and results from Google.