Tutorial 08: Qualitative Data Graphs
Introduction to Pie Charts and Bar Graphs
## required packages/modules
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import rcParams
from IPython.display import display, HTML
## default font style
rcParams["font.family"] = "serif"
## format output
CSS = """
.output {
margin-left: 20px;
}
"""
HTML('<style>{}</style>'.format(CSS))
- In contrast to quantitative data graphs that are plotted along a numerical scale, we plot qualitative graphs using non-numerical categories.
Pie Charts
Defining The Term
- A pie chart (or circle chart) is a circular statistical graphic divided into slices to illustrate numerical proportion.
- The whole circle represents 100% of the data, and each slice represents the percentage share of one category.
- A typical pie chart looks like this:
# Pie chart, where the slices will be ordered and plotted counter-clockwise:
labels = ("Birds", "Cats", "Dogs", "Cows")
sizes = [15, 30, 45, 10]
# only "explode" the 2nd slice (i.e. 'Cats')
explode = (0, 0.1, 0, 0)
# create subplot
fig, ax = plt.subplots(figsize=(12,8), facecolor="#222222", dpi=600)
ax.set_facecolor("#222222")
# add pie chart
ax.pie(
sizes, explode=explode, labels=labels,
autopct='%1.1f%%', startangle=90,
textprops={"color": "#F2F2F2", "size": 15}
)
# Equal aspect ratio ensures that pie is drawn as a circle.
ax.axis('equal')
fig.text(
0.75, 0.15, "graphic: @slothfulwave612",
color="#F2F2F2", size=10, fontstyle="italic"
)
plt.show()
- They are widely used in business to depict budget categories, market share and time/resource allocations.
- Since a pie chart can lead to less accurate judgement, its use in science and technology is limited.
  - It is more difficult to interpret the relative size of angles in a pie chart than to judge the length of rectangles (or bars).
- A few things to note here:
  - A pie chart is circular, so the total angle formed at the centre is 360°.
  - The angle at the centre corresponding to a particular component is given by: $\frac{\text{Value of Component}}{\text{Total Value}} \times 360°$
  - If the values of the components are expressed as percentages, then the central angle corresponding to a particular component is given by: $\frac{\text{Percentage Value of Component}}{100} \times 360°$
- In the example above, Cats have a value of 30%. Since it is a percentage value, we can use the second formula to calculate the angle of the Cats sector:
  - $\frac{30}{100} \times 360° = 108°$
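The two formulas above can be sketched in a few lines of Python. This is only an illustration: `central_angle` and `central_angle_from_percent` are hypothetical helper names, not part of any library, and the shares are the ones used in the example chart.

```python
def central_angle(value, total):
    """Central angle in degrees for a component given as a raw value."""
    if total <= 0:
        raise ValueError("total must be positive")
    return value * 360 / total


def central_angle_from_percent(percent):
    """Central angle in degrees for a component given as a percentage."""
    return percent * 360 / 100


# Shares from the example chart above (they sum to 100).
shares = {"Birds": 15, "Cats": 30, "Dogs": 45, "Cows": 10}
angles = {name: central_angle_from_percent(p) for name, p in shares.items()}

for name, angle in angles.items():
    print(f"{name}: {angle:.0f} degrees")
```

Because the shares sum to 100%, the computed angles should sum to 360°, which is a quick sanity check before drawing.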
Steps to Draw a Pie Chart
- Follow these steps to construct a pie chart:
  - Find the central angle for each component using the formula given above.
  - Draw a circle of any radius.
  - Draw a horizontal radius.
  - Starting from the horizontal radius, draw a second radius so that the angle between the two equals the central angle of the first component.
  - Repeat the process for all the components of the given data, each time measuring from the radius drawn last.
  - These radii divide the whole circle into various sectors.
  - Now, shade the sectors with different colors to denote the various components.
  - Thus, we obtain the required pie chart.
- Let's see an example: The following table shows the number of hours spent by a child on different activities on a working day. Represent this information in a pie chart.
Activity | No. of Hours |
---|---|
School | 6 |
Sleep | 8 |
Playing | 2 |
Study | 4 |
T.V. | 1 |
Others | 3 |
- The first step is to calculate the central angle for each activity. Since the values are not percentages, we use the following formula: $\frac{\text{Value of Component}}{\text{Total Value}} \times 360°$
- Table after adding the central angles:
Activity | No. of Hours | Measure of Central Angle |
---|---|---|
School | 6 | $\frac{6}{24} \times 360° = 90°$ |
Sleep | 8 | $\frac{8}{24} \times 360° = 120°$ |
Playing | 2 | $\frac{2}{24} \times 360° = 30°$ |
Study | 4 | $\frac{4}{24} \times 360° = 60°$ |
T.V. | 1 | $\frac{1}{24} \times 360° = 15°$ |
Others | 3 | $\frac{3}{24} \times 360° = 45°$ |
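As a rough check of the table, the sketch below recomputes the central angles and also the direction of each radius from the drawing steps. It assumes the sectors are laid out counter-clockwise from the horizontal radius at 0°, in table order; the variable names are illustrative only.

```python
from itertools import accumulate

hours = {"School": 6, "Sleep": 8, "Playing": 2, "Study": 4, "T.V.": 1, "Others": 3}
total = sum(hours.values())  # 24 hours in a day

# Central angle of each sector, in degrees.
angles = {activity: h * 360 / total for activity, h in hours.items()}

# Running totals give the direction of each radius, measured
# counter-clockwise from the horizontal starting radius (assumed at 0 degrees).
ends = list(accumulate(angles.values()))
starts = [0.0] + ends[:-1]

for (activity, angle), start, end in zip(angles.items(), starts, ends):
    print(f"{activity:<8} angle={angle:5.1f}  from {start:5.1f} to {end:5.1f}")
```

The last boundary should land back on 360°, which confirms that the sectors fill the whole circle.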
- Now we can represent these angles as sectors of the circle and draw the pie chart:
# Pie chart, where the slices will be ordered and plotted counter-clockwise:
labels = ("School", "Sleep", "Playing", "Study", "TV", "Others")
sizes = [90, 120, 30, 60, 15, 45]
# create subplot
fig, ax = plt.subplots(figsize=(12,8), facecolor="#222222", dpi=600)
ax.set_facecolor("#222222")
# add pie chart
ax.pie(
sizes, labels=labels,
autopct='%1.1f%%', startangle=0,
textprops={"color": "#F2F2F2", "size": 15}
)
# Equal aspect ratio ensures that pie is drawn as a circle.
ax.axis('equal')
fig.text(
0.75, 0.15, "graphic: @slothfulwave612",
color="#F2F2F2", size=10, fontstyle="italic"
)
plt.show()
- To convert a central angle back to a percentage, one can use the following formula: $\text{Percentage Value} = \frac{\text{Angle} \times 100}{360°}$
- For example, the School sector has an angle of 90°, so its percentage value is: $\frac{90° \times 100}{360°} = 25\%$
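The conversion can be checked with a short sketch. `angle_to_percent` is a hypothetical helper name; the data are the hours and angles from the table above. The comparison also suggests why passing the angles to `ax.pie` gives the same percentage labels as passing the hours would: the pie is drawn from each value's share of the total.

```python
import math


def angle_to_percent(angle_deg):
    """Percentage of the whole circle covered by a central angle in degrees."""
    return angle_deg * 100 / 360


hours = [6, 8, 2, 4, 1, 3]
angles = [90, 120, 30, 60, 15, 45]

# Percentages derived from the angles ...
from_angles = [angle_to_percent(a) for a in angles]
# ... and directly from each activity's share of the 24 hours.
from_hours = [h * 100 / sum(hours) for h in hours]

same = all(math.isclose(a, b) for a, b in zip(from_angles, from_hours))
print(same)
```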
Bar Graphs
Defining The Term
- Another widely used qualitative data graphing technique is the bar graph or bar chart.
- A bar graph contains two or more categories on one axis and a series of bars, one for each category, along the other axis.
- The length of each bar represents the magnitude of its category.
- The bar graph is qualitative because the categories are non-numerical, and it can be either horizontal or vertical.
- A bar graph is built from the same type of data as a pie chart. Its advantage is that when categories are close in value, the difference is easier to see between bars than between pie slices.
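As a small illustration of the horizontal variant, here is a sketch using matplotlib's `Axes.barh`. The helper name `horizontal_bar_graph`, the figure size and the colour are assumptions made for this example only; the data are the shares from the pie chart example earlier.

```python
import matplotlib.pyplot as plt


def horizontal_bar_graph(labels, values, xlabel="Frequency", ylabel="Categories"):
    """Draw one horizontal bar per category and return (fig, ax)."""
    fig, ax = plt.subplots(figsize=(8, 5))
    positions = list(range(len(labels)))

    # barh takes the bar positions on the y-axis and the bar lengths as widths.
    ax.barh(positions, values, color="skyblue")

    # Label each bar with its category name.
    ax.set_yticks(positions)
    ax.set_yticklabels(labels)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    return fig, ax


# Same shares as the pie chart example above.
fig, ax = horizontal_bar_graph(
    labels=["Birds", "Cats", "Dogs", "Cows"],
    values=[15, 30, 45, 10],
    xlabel="Share (%)",
    ylabel="Animal",
)
```

Here the categories run along the y-axis and the bar lengths are read off the x-axis, so the roles of the two axes are swapped relative to a vertical bar graph.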
Steps to Draw a Bar Graph
- Follow these steps to construct a bar graph:
  - Draw the x-axis and the y-axis.
  - Along the x-axis, choose a uniform width for the bars and a uniform gap between them, and write the data items whose values are to be marked.
  - Along the y-axis, choose a suitable scale to determine the heights of the bars for the given values. (We take frequency along the y-axis.)
  - Draw the bars for each category (or data item).
  - The obtained result is a bar graph.
- Let's see an example: The following table shows the number of hours spent by a child on different activities on a working day. Represent this information in a bar graph.
Activity | No. of Hours |
---|---|
School | 6 |
Sleep | 8 |
Playing | 2 |
Study | 4 |
T.V. | 1 |
Others | 3 |
- So, our first step is to draw the axes and annotate them.
def create_axis(
    xticks, yticks, xlim, ylim, category,
    xlabel="Categories",
    ylabel="Frequency"
):
    """
    Function to create axis.

    Args:
        xticks (numpy.array): xtick values.
        yticks (numpy.array): ytick values.
        xlim (tuple): x-limit.
        ylim (tuple): y-limit.
        category (tuple): x-tick labels, one per xtick value.
        xlabel (str, optional): x label value.
        ylabel (str, optional): y label value.

    Returns:
        figure.Figure: figure object.
        axes.Axes: axes object.
    """
    # create subplot
    fig, ax = plt.subplots(facecolor="#121212", figsize=(12,8), dpi=600)
    ax.set_facecolor("#121212")

    # hide the top and right spines
    ax.spines["right"].set_visible(False)
    ax.spines["top"].set_visible(False)

    # change color of the remaining spines
    ax.spines['bottom'].set_color("#F2F2F2")
    ax.spines['left'].set_color("#F2F2F2")

    # change color of tick params
    ax.tick_params(axis='x', colors="#F2F2F2")
    ax.tick_params(axis='y', colors="#F2F2F2")

    # set ticks
    ax.set_xticks(np.round(xticks, 2))
    ax.set_yticks(np.round(yticks, 2))

    # set x-tick-labels
    ax.set_xticklabels(category)

    # set labels
    ax.set_xlabel(xlabel, color="#F2F2F2", size=20)
    ax.set_ylabel(ylabel, color="#F2F2F2", size=20)

    # setting the limit
    ax.set(xlim=xlim, ylim=ylim)

    # credits
    fig.text(
        0.9, 0.02, "graphic: @slothfulwave612",
        fontsize=10, fontstyle="italic", color="#F2F2F2",
        ha="right", va="center"
    )

    return fig, ax
fig, ax = create_axis(
xticks=np.arange(1,13,2), yticks=np.linspace(1,9,9),
category=("School", "Sleep", "Playing", "Study", "T.V.", "Others"),
xlim=(0,12), ylim=(0,9), xlabel="Activity", ylabel="No. of Hours"
)
- Now, we will draw the bars. For School, the No. of Hours (our frequency) is 6, so we add a bar of height 6.
# draw bar for School
ax.bar(
x=1, height=6, hatch=5*'/', color="skyblue"
)
fig
- Similarly, we draw the bar for Sleep, whose frequency is 8.
# draw bar for Sleep
ax.bar(
x=3, height=8, hatch=5*'/', color="skyblue"
)
fig
- We then add a bar for each of the remaining categories, and the final plot looks like this:
# create axis
fig, ax = create_axis(
xticks=np.arange(1,13,2), yticks=np.linspace(1,9,9),
category=("School", "Sleep", "Playing", "Study", "T.V.", "Others"),
xlim=(0,12), ylim=(0,9), xlabel="Activity", ylabel="No. of Hours"
)
# add bars
ax.bar(
x=np.arange(1,13,2), height=[6, 8, 2, 4, 1, 3],
hatch=5*'/', color="skyblue"
)
plt.show()
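The layout choices from the drawing steps (uniform spacing along the x-axis and a y-scale that covers the largest value) can also be computed rather than typed in by hand. The sketch below is only one possible way to do it: `bar_layout` is a hypothetical helper, and its defaults (first bar centred at 1, a step of 2 between centres, one unit of headroom) are assumptions chosen to reproduce the numbers used in the example above.

```python
def bar_layout(values, first_center=1.0, step=2.0, headroom=1.0):
    """Return bar centres, x-limits and y-limits for a vertical bar graph."""
    if not values:
        raise ValueError("values must not be empty")

    # Uniform spacing: each centre is `step` to the right of the previous one.
    centers = [first_center + i * step for i in range(len(values))]

    # Leave the same margin on the right as on the left of the first bar.
    xlim = (0.0, centers[-1] + first_center)

    # The y-scale must cover the tallest bar, plus some headroom.
    ylim = (0.0, max(values) + headroom)
    return centers, xlim, ylim


hours = [6, 8, 2, 4, 1, 3]
centers, xlim, ylim = bar_layout(hours)
print(centers, xlim, ylim)
```

With the hours from the table, this yields the same bar positions and axis limits that were passed to `create_axis` and `ax.bar` in the example.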
Questionnaire
Ques 01: According to the National Retail Federation and Center for Retailing Education at the University of Florida, the four main sources of inventory shrinkage are employee theft, shoplifting, administrative error, and vendor fraud. The estimated annual dollar amount in shrinkage ($ millions) associated with each of these sources follows:
Sources of Inventory Shrinkage | Amount (USD millions) |
---|---|
Employee Theft | 17,918.6 |
Shoplifting | 15,191.9 |
Administrative Error | 7,617.6 |
Vendor Fraud | 2,553.6 |
Construct a pie chart and a bar chart to depict these data.
Ques 02: According to T-100 Domestic Market, the top seven airlines in the United States by domestic boardings in a recent year were Southwest Airlines with 81.1 million, Delta Airlines with 79.4 million, American Airlines with 72.6 million, United Airlines with 56.3 million, Northwest Airlines with 43.3 million, US Airways with 37.8 million, and Continental Airlines with 31.5 million. Construct a pie chart and a bar graph to depict this information.
1. Notes are compiled from Business Statistics by Ken Black and Math Only Math.
2. If you face any problem or have any feedback/suggestions, feel free to comment.