Remember that an outlier of a data set is a value much higher or much lower than the other values.
When a data set is represented as a dot plot, it's easy to pick out the outliers: we look for values that lie much farther to the left or right than the main cluster of values.
To demonstrate, consider the dot plot below:
Notice that most values are concentrated in one main cluster, but one value is separated from that group.
Since this value is separated from the main group, we conclude that it is an outlier of the distribution.
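The visual check above can also be approximated numerically. The sketch below uses the common 1.5 × IQR rule, which is a swapped-in technique and not the method this lesson teaches; the function name `find_outliers` and the data values are hypothetical, chosen only for illustration.

```python
import statistics

def find_outliers(values):
    """Return the values lying outside the 1.5 * IQR fences.

    Assumption: the 1.5 * IQR rule stands in for "much further left or
    right than the main cluster"; the lesson itself judges this by eye.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles Q1, Q2, Q3
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data: a main cluster from 2 to 6 and one distant value.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 12]
print(find_outliers(data))
```

For this made-up data set the function reports the single value 12, which matches what we would pick out by eye on a dot plot.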
How many outliers are there for the data set whose dot plot is shown above?
Notice that most values are concentrated in one main cluster, but one value is separated from that group.
That value is the outlier of the distribution. Therefore, there is one outlier.
How many outliers are there for the data set whose dot plot is shown above?
a) 3
b) 4
c) 1
d) 2
e) 5
What are the outliers for the data set whose dot plot is shown above?
a) 95 and 110 only
b) 110 only
c) 75 and 80 only
d) 80 only
e) 95 only
A tail of a distribution is a part that extends away from the main cluster.
If the left tail of the distribution is longer than the right tail, then we say that the distribution is left-skewed (or negatively skewed). An example of a left-skewed distribution is shown below.
Similarly, if the right tail of the distribution is longer than the left tail, then we say that the distribution is right-skewed (or positively skewed). An example of a right-skewed distribution is shown below.
If the tails of the distribution are the same, we say that the distribution is symmetric. An example of a symmetric distribution is shown below.
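As a rough illustration of these three definitions, the sketch below compares the lengths of the two tails. It assumes that the median can stand in for the centre of the main cluster, which is a simplification introduced here; the function name `describe_shape` and the data sets are hypothetical.

```python
import statistics

def describe_shape(values):
    """Classify a data set by comparing its left and right tail lengths.

    Assumption: the median marks the centre of the main cluster. Equal
    tail lengths are reported as "symmetric", although this check does
    not verify that the dots mirror each other exactly.
    """
    centre = statistics.median(values)
    left_tail = centre - min(values)
    right_tail = max(values) - centre
    if left_tail > right_tail:
        return "left-skewed"
    if right_tail > left_tail:
        return "right-skewed"
    return "symmetric"

# Hypothetical data sets, one per shape.
print(describe_shape([1, 4, 6, 7, 7, 8, 8, 8]))     # long tail on the left
print(describe_shape([2, 2, 2, 3, 3, 4, 6, 9]))     # long tail on the right
print(describe_shape([1, 2, 2, 3, 3, 3, 4, 4, 5]))  # tails of equal length
```

This only measures how far each tail reaches, so it is best treated as a sanity check alongside the dot plot and not as a replacement for looking at it.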
The dot plot above shows the working hours of a small group of employees. Each dot represents a single employee. What is the shape of the distribution?
From the dot plot, we see that the distribution is left-skewed. It has a long tail on the left-hand side.
The dot plot above shows the daily high temperatures of several US cities on a specific day. Each dot represents a single city. What is the shape of the distribution?
a) the distribution is left-skewed
b) the distribution is symmetric
c) the distribution is right-skewed
d) the distribution is left-skewed and symmetric
e) the distribution is right-skewed and symmetric
The dot plot above shows the heights of a small group of children, where each child's height has been rounded to the nearest ten centimeters. Each dot represents a different child. What is the shape of the distribution?
a) the distribution is both left-skewed and symmetric
b) the distribution is right-skewed
c) none of the options describes the shape
d) the distribution is symmetric
e) the distribution is left-skewed
When a distribution is perfectly symmetric, the mean lies at the midpoint of the distribution.
To demonstrate, consider the symmetric distribution below:
To calculate this distribution's mean (midpoint), we find the mean of the two most extreme values, that is, the average of the smallest value and the largest value.
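A small numerical check of this idea is sketched below. The data set is hypothetical (the values from the original dot plot are not available here), and `midpoint` is a helper name introduced only for illustration.

```python
import statistics

def midpoint(values):
    """Mean of the two most extreme values (smallest and largest)."""
    return (min(values) + max(values)) / 2

# Hypothetical, perfectly symmetric data set centred on 3.
symmetric_data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

print(midpoint(symmetric_data))         # midpoint of 1 and 5
print(statistics.mean(symmetric_data))  # mean of all nine values
```

Both lines give 3 for this data set, so the mean and the midpoint coincide, as expected for a perfectly symmetric distribution.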
Let's now consider the right-skewed distribution below.
Notice that the midpoint (i.e., the mean of the two most extreme values) is the same as in the symmetric distribution above.
Since the distribution is right-skewed, most of the data points lie to the left of the midpoint. Consequently, the mean also lies to the left of the midpoint.
The situation is reversed when the distribution is left-skewed.
In this case, most of the data points lie to the right of the midpoint. Consequently, the mean also lies to the right of the midpoint.
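The comparison between mean and midpoint for skewed data can be sketched in the same way. The two data sets below are hypothetical, and `mean_vs_midpoint` is an illustrative helper, not part of the lesson.

```python
import statistics

def mean_vs_midpoint(values):
    """Report where the mean sits relative to the midpoint of the extremes."""
    midpoint = (min(values) + max(values)) / 2
    mean = statistics.mean(values)
    if mean < midpoint:
        return "mean is left of the midpoint"
    if mean > midpoint:
        return "mean is right of the midpoint"
    return "mean is at the midpoint"

# Hypothetical right-skewed data: cluster near 2-3, long tail up to 9.
right_skewed = [2, 2, 2, 3, 3, 4, 6, 9]
# Hypothetical left-skewed data: cluster near 7-8, long tail down to 1.
left_skewed = [1, 4, 6, 7, 7, 8, 8, 8]

print(mean_vs_midpoint(right_skewed))  # mean 3.875 vs midpoint 5.5
print(mean_vs_midpoint(left_skewed))   # mean 6.125 vs midpoint 4.5
```

For these examples the right-skewed mean falls to the left of the midpoint and the left-skewed mean falls to the right, in line with the reasoning above.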
The dot plot above shows how much time each student in Mr. Johnson's class spent studying math on a particular day. Each dot represents a single student.
Which of the following statements are true?
- I. The distribution is approximately symmetric.
- II. The mean of the distribution equals
- III. The mean of the distribution exceeds
- IV. The mean of the distribution is less than
Let's examine the statements one by one.
- Statement I is false. The distribution is left-skewed (since we have a longer tail on the left). It is not symmetric.
- Statement III is true, while statements II and IV are false. If the data were symmetric, then the mean would be given by the midpoint of the smallest and largest values.
However, since the distribution is left-skewed, most of the data points are concentrated on the right-hand side. This indicates that the mean of our distribution lies to the right of the midpoint.
So, the mean is greater than the midpoint.
Therefore, the correct answer is "III only."
The dot plot above shows the height distribution for a particular group of people, where each person's height has been rounded to the nearest centimeter. Each dot represents a different person. Which of the following statements are true?
- I. The distribution is approximately symmetric.
- II. The mean of the distribution is less than 155 cm.
- III. The mean of the distribution equals 155 cm.
a) I and III only
b) III only
c) I only
d) I and II only
e) II only
The dot plot above shows the number of riders in each team that completed the Tour de France final stage in a particular year. Each dot represents a different team. Which of the following statements are true?
- I. The distribution is symmetric.
- II. The mean of the distribution equals 5.
- III. The mean of the distribution exceeds 5.
- IV. The mean of the distribution is less than 5.
a) III only
b) I and II only
c) IV only
d) I and III only
e) II only