## required packages/modules
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.path import Path
from matplotlib import rcParams

## default fontstyle
rcParams["font.family"] = "Ubuntu"

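def get_patch(verts):
    """Return an unfilled curve patch defined by the control points in verts."""
    ## first vertex starts the path, the remaining ones are cubic Bezier control/end points
    codes = [Path.MOVETO] + [Path.CURVE4] * (len(verts) - 1)
    path = Path(verts, codes)
    patch = patches.PathPatch(path, facecolor='none', lw=1.5, edgecolor="#F2F2F2", alpha=0.7)

    return patch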

## create subplot
fig, ax = plt.subplots(facecolor="#121212", figsize=(12,8))
ax.set_facecolor("#121212")


## props for text
props = dict(facecolor="none", edgecolor="#D3D3D3", boxstyle="round,pad=0.6", zorder=3)

## coordinates to plot line
line_coords = [
    [(5, 9.5), (5, 8.5), (4.5, 8.5), (1, 8)],      ## line 01
    [(5, 9.5), (5, 8.5), (5.5, 8.5), (9, 8)],      ## line 02
    
    [(1, 6.5), (1, 5.9), (0.5, 5.9), (-1, 5.4)],   ## line 03
    [(1, 6.5), (1, 5.9), (1.5, 5.9), (3, 5.4)],    ## line 04
    
    [(9, 6.5), (9, 5.9), (8.5, 5.9), (7, 5.4)],    ## line 05
    [(9, 6.5), (9, 5.9), (9.5, 5.9), (11, 5.4)],   ## line 06
    
    [(11, 4.4), (11, 3.8), (10.5, 3.8), (9, 3.3)],  ## line 07
    [(11, 4.4), (11, 3.8), (11.5, 3.8), (13, 3.3)],  ## line 08
]

## add lines
for verts in line_coords:
    ax.add_patch(get_patch(verts))

## text coordinates
text_coord = [
    (5, 10.1), 
    (1, 7.23), (9, 7.23),
    (-1, 4.9), (3, 4.9), (7, 4.9), (11, 4.9),
    (9, 2.85), (13, 2.85)
]

## text label and size
text_label = [
    ("Data", 18), ("Qualitative\nData", 16.5), ("Quantitative\nData", 16.5),
    ("Nominal", 14.5), ("Ordinal", 14.5), ("Discrete", 14.5), ("Continuous", 14.5),
    ("Interval", 13.5), ("Ratio", 13.5)
    
]

## add text
for (x, y), (label, size) in zip(text_coord, text_label):
    ax.text(
        x, y, label, color="#F2F2F2", size=size,
        bbox=dict(facecolor="none", edgecolor="#D3D3D3", boxstyle="round,pad=1"), zorder=2,
        ha="center", va="center"
    )

## credit
ax.text(
    13.7, 1.7, "graphic: @slothfulwave612", fontstyle="italic", 
    color="#F2F2F2", size=10, ha="right", va="center", alpha=0.8
)
    
## set axis
ax.set(xlim=(-2.25,13.75), ylim=(1.5,11))

## tidy axis
ax.axis("off")

plt.show()

Levels of Measurement

Defining The Term

  • The levels of measurement (or scales of measurement) describe how each variable is measured and how precise that measurement is.

Why Is It Important?

  • Businesses gather millions of numerical data points every day, representing many kinds of items.

  • For example, numbers can represent dollar amounts (e.g. costs of items produced), geographical locations of retail outlets, weights of shipments etc.

  • Not all such data should be analysed the same way statistically, because the entities represented by these numbers are different.

  • For this reason, business researchers need to know the level of data measurement represented by the numbers being analyzed.

  • To conclude: The researcher needs to understand the different levels of measurement, as these, together with how the research question is phrased, dictate what statistical analysis is appropriate.

Different Types of Data

  • Data is divided into two different types:

    1. Qualitative Data

    2. Quantitative Data

Qualitative Data

  • We define qualitative data as the data that characterizes something.

  • Also known as categorical data, this data type isn’t necessarily measured using numbers but rather categorized based on properties, attributes, labels, and other identifiers.

  • For example:

    • Name of an individual, gender, state of origin, citizenship.

    • A teacher sets the whole class an essay and assesses it with comments on spelling, grammar, and punctuation rather than a score.

  • Qualitative data is further divided into two types:

    • Nominal

    • Ordinal

Nominal Data

  • We define nominal data as data that have no order, i.e. we cannot arrange them in a series such as from high to low or from fast to slow.

  • Some of the examples:

    • Hair colour (e.g. Blond, Brunette, Black etc.).

    • Sex, religion, ethnicity.

    • Zip codes, place of birth.
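
A minimal sketch of what nominal data allows in practice: because the categories have no order, counting them and finding the most frequent one (the mode) is meaningful, while a mean or a "highest" value is not. The `hair_colours` list is made-up sample data used only for illustration.

```python
from collections import Counter

# Made-up sample of a nominal variable (hair colour).
hair_colours = ["Black", "Blond", "Black", "Brunette", "Black", "Blond"]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(hair_colours)

# The mode (most frequent category) is the only sensible "average" here.
mode, mode_count = counts.most_common(1)[0]

print(counts["Black"], counts["Blond"], counts["Brunette"])
print(mode, mode_count)
```

Sorting these labels would only give alphabetical order, which says nothing about the hair colours themselves.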

Ordinal Data

  • We define ordinal data as data that have a defined order, i.e. we can arrange them in a series such as from high to low or from fast to slow.

  • Some of the examples:

    • Satisfaction data points in a survey, where one = happy, two = neutral, and three = unhappy.

    • Results of a race, where first place, second place or third place shows the order in which the runners finished.

    • Order of class in a college (e.g. freshman, sophomore, junior and senior).

      Tip: One way to remember these two terms: Ordinal starts with O, as in order, and Nominal starts with No, as in no order. So data with a defined order are ordinal, and data without any order are nominal.
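
A small sketch of how the defined order of ordinal data can be used, assuming the class-year example above. The order has to be stated explicitly (here in the hypothetical `class_order` list), and the `students` sample is made up. With the order in place, sorting and taking a median become meaningful.

```python
from statistics import median_low

# The order of the categories must be stated explicitly; it is not in the labels.
class_order = ["freshman", "sophomore", "junior", "senior"]
rank = {label: position for position, label in enumerate(class_order)}

# Made-up sample of an ordinal variable (class year of five students).
students = ["junior", "freshman", "senior", "sophomore", "junior"]

# Sorting by rank is meaningful because the categories have a defined order.
ordered = sorted(students, key=rank.__getitem__)

# The median is meaningful too: take the middle rank and map it back to a label.
median_rank = median_low([rank[s] for s in students])
median_label = class_order[median_rank]

print(ordered)
print(median_label)
```

The ranks only encode position. The gap between freshman and sophomore is not claimed to equal the gap between junior and senior, so averaging the ranks would not be justified.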

Quantitative Data

  • We define quantitative data as the data whose value is measured in the form of numbers or counts.

  • It is also known as numerical data and is used to answer questions such as How many?, How often? and How much?

  • For example:

    • Height of a person, Weight of a person, Temperature, Cost of something etc.

    • High school grade points, number of pets, number of cousins you have etc.

  • Quantitative Data is further divided into two types:

    • Discrete

    • Continuous

Discrete Data

  • Discrete data consists of whole-number values that cannot be broken down into fractions or decimals.

  • Examples of discrete data:

    • The number of siblings you have. You can have two siblings or three siblings but not two-and-a-half siblings.

    • The number of wins your favourite team gets is also a form of discrete data because a team can’t have a half win – it’s either a win, a loss, or a draw.

Continuous Data

  • Continuous data describes values that can be broken down into different parts, units, fractions and decimals.

  • Examples of continuous data:

    • Height and weight

    • Time, which can be broken down into half a second or half an hour

    • Temperature is another example

  • Continuous data is further divided into two parts:

    • Interval

    • Ratio

      Tip: Data is considered discrete if it can be counted and considered continuous if it can be measured.

Interval

  • Interval means space in between.

  • Interval scales (or interval data) are numeric scales in which we know both the order and the exact difference between the values.

  • The classic example of an interval scale is Celsius temperature because the difference between each value is the same. For example, the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference between 80 and 70 degrees.

  • The limitation with interval scales is they don't have a true zero.

    • Absolute/true zero means that the zero point represents the absence of the property being measured (e.g., €0 means no money, 0 kg means no weight).

  • There is no such thing as "no temperature", at least not on the Celsius scale. On an interval scale, zero doesn't mean the absence of the property; it is just another value on the scale, like 0 degrees Celsius.

  • Without a true zero, it is impossible to compute ratios.

  • With interval data, we can add and subtract, but cannot multiply or divide.

    • Consider this: 10°C + 10°C = 20°C or 60°C - 20°C = 40°C, that's easy.

    • But 20°C is not twice as hot as 10°C: converted to Fahrenheit, 10°C is 50°F and 20°C is 68°F, and 68 is not twice 50.

  • So, to conclude, interval scales are great, but we cannot calculate ratios.
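
The arithmetic above can be checked with a short sketch. The helper name `celsius_to_fahrenheit` is introduced here only for illustration. It shows that the ratio of two temperatures changes with the unit, while equal differences stay equal.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

cold_c, warm_c = 10, 20
cold_f = celsius_to_fahrenheit(cold_c)  # 50.0
warm_f = celsius_to_fahrenheit(warm_c)  # 68.0

# The ratio depends on the unit, so "twice as hot" carries no meaning here.
ratio_c = warm_c / cold_c
ratio_f = warm_f / cold_f

# Differences stay comparable: every 10 degree Celsius gap is the same Fahrenheit gap.
gap_low_f = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
gap_high_f = celsius_to_fahrenheit(80) - celsius_to_fahrenheit(70)

print(ratio_c, ratio_f)
print(gap_low_f, gap_high_f)
```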

Ratio

  • Ratio scales provide rankings, assure equal differences between scale values, and have a true-zero point.

  • For example, money is measured on a ratio scale. An individual with €0 has an absence of money. With a true-zero point, it would be correct to say that someone with €100 has twice as much money as someone with €50.

  • Other examples of ratio variables include height, weight, and duration.

  • Some variables, such as temperature, can be measured on different scales. While Celsius and Fahrenheit are interval scales, Kelvin is a ratio scale.

    • In all 3 scales, there are equal intervals between neighbouring points. However, unlike the Celsius and Fahrenheit scales where zero is just another temperature value, the Kelvin scale has a true zero (0 K) where nothing can be colder.

    • That means that ratios of temperatures are only meaningful on the Kelvin scale. Although 40° is twice as many degrees as 20°, it isn’t twice as hot on the Celsius or Fahrenheit scales. However, on the Kelvin scale, 40 K is twice as hot as 20 K because there is a true zero at the starting point of this scale.
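
As a rough illustration of the point about Kelvin, the sketch below converts two Celsius readings to kelvins before taking their ratio. The helper name `celsius_to_kelvin` is an assumption made for this example.

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvins."""
    return celsius + 273.15

# 40 looks like "twice" 20, but only because of where Celsius puts its zero.
naive_ratio = 40 / 20

# Measured from the true zero (0 K), 40 degrees C is only slightly hotter than 20 degrees C.
true_ratio = celsius_to_kelvin(40) / celsius_to_kelvin(20)

print(naive_ratio)
print(round(true_ratio, 3))
```

The second ratio comes out only a little above 1, which is why a ratio should be taken only after moving to a scale with a true zero.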

Questionnaire

Ques 01: Why should we classify data into different levels of measurement?

Ques 02: Many changes continue to occur in the healthcare industry. Because of increased competition for patients among providers and the need to determine how providers can better serve their clientele, hospital administrators sometimes administer a quality satisfaction survey to their patients after the patients are released. They ask the following types of questions on such a survey. These questions will result in what level of data measurement?

  2.1. How long ago were you released from the hospital?

  2.2. Which type of unit were you in for most of your stay?

    □ Coronary care

    □ Intensive care

    □ Maternity care

    □ Medical unit

    □ Pediatric/children's unit

    □ Surgical unit

  2.3. In choosing a hospital, how important was the hospital's location?

    □ Very Important, □ Somewhat Important, □ Not Very Important, □ Not At All Important

  2.4. How serious was your condition when you were first admitted to the hospital?

    □ Critical, □ Serious, □ Moderate, □ Minor

  2.5. Rate the skill of your doctor:

    □ Excellent, □ Very Good, □ Good, □ Fair, □ Poor

Ques 03: Classify each of the following as nominal, ordinal, interval or ratio data.

  3.1. The time required to produce each tire on an assembly line.

  3.2. The number of quarts of milk a family drinks in a month.

  3.3. The ranking of four machines in your plant after they have been designated as excellent, good, satisfactory, and poor.

  3.4. The telephone area code of clients in the United States.

  3.5. The age of each of your employees.

  3.6. The dollar sales at the local pizza shop each month.

  3.7. An employee's identification number.

  3.8. The response time of an emergency unit.

  3.9. The ranking of a company by Fortune 500.

If you face any problem or have any feedback/suggestions, feel free to comment.