Frequency Distributions

 

📊 What is a Frequency Distribution?

A frequency distribution is a summary that shows how often each value (or range of values) appears in a dataset.

In simple terms, it counts how many times each data point or group of data points occurs.


🔍 Why is it Important?

Frequency distributions help in:

  • Understanding patterns in the data.

  • Identifying outliers, peaks, and gaps.

  • Summarizing large datasets in a meaningful way.

  • Preparing for statistical analysis or visualizations like histograms and bar charts.


🧮 Example: Simple Frequency Distribution

Suppose we survey 10 students about the number of books they read last month:


Data: [2, 3, 2, 5, 4, 2, 3, 4, 3, 5]

➡ Frequency Table:

Number of Books    Frequency
2    3
3    3
4    2
5    2

📦 Types of Frequency Distributions

1. Ungrouped Frequency Distribution

  • Used for small datasets with individual values.

  • Example: Counting how often each score appears in a list of test scores.

2. Grouped Frequency Distribution

  • Used for larger datasets, where data is grouped into intervals.

  • Example:

Suppose these are ages of 20 people:


[18, 19, 21, 20, 22, 23, 25, 26, 30, 35, 31, 28, 27, 24, 25, 29, 30, 33, 34, 36]

➡ Grouped Frequency Table (age intervals of 5 years):

Age Group    Frequency
18 - 22    5
23 - 27    6
28 - 32    5
33 - 37    4
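
The grouped table above can be reproduced with a few lines of Python. This is a minimal sketch, assuming integer ages and inclusive classes of width 5 starting at 18; the names start, width and grouped are illustrative choices, not part of the original example.

```python
from collections import Counter

ages = [18, 19, 21, 20, 22, 23, 25, 26, 30, 35,
        31, 28, 27, 24, 25, 29, 30, 33, 34, 36]

start, width = 18, 5  # assumed: first class starts at 18, each class spans 5 ages

# Map every age to the lower limit of its class, then count per class.
grouped = Counter(start + ((age - start) // width) * width for age in ages)

print("Age Group | Frequency")
for lower in sorted(grouped):
    upper = lower + width - 1  # inclusive upper limit, e.g. 18 - 22
    print(f"{lower} - {upper}   | {grouped[lower]}")
```

Integer division does the grouping here, which only works because the classes are equally wide; unequal classes would need an explicit list of limits.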

🧪 Python Code Example (Frequency Table)


from collections import Counter

data = [2, 3, 2, 5, 4, 2, 3, 4, 3, 5]
freq = Counter(data)

# Displaying frequency table
print("Value | Frequency")
for value, count in sorted(freq.items()):
    print(f"{value:^5} | {count:^9}")

📈 Visualization with Histogram


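import matplotlib.pyplot as plt

# Same books-read data as in the frequency table example
data = [2, 3, 2, 5, 4, 2, 3, 4, 3, 5]

# Bin edges at the half-integers centre one bar on each whole number of books
plt.hist(data, bins=[1.5, 2.5, 3.5, 4.5, 5.5], edgecolor='black')
plt.title("Histogram of Books Read")
plt.xlabel("Number of Books")
plt.ylabel("Frequency")
plt.show()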

📌 Significance in Statistics

Purpose                             Explanation
Data Summarization                  Makes large, raw data easy to interpret
Detecting Patterns                  Helps visualize trends, e.g., most common range of values
Basis for Charts                    Used to create histograms, bar charts, pie charts, etc.
Foundation for Descriptive Stats    Helps in calculating measures like mean, median, mode, variance, etc.
Decision Making                     Assists researchers in drawing conclusions or planning strategies
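
As a small illustration of the descriptive-statistics row above, the mean and mode can be computed directly from a frequency table instead of from the raw list. This is only a sketch that reuses the books-read data from the earlier example; the variable names are illustrative.

```python
from collections import Counter

data = [2, 3, 2, 5, 4, 2, 3, 4, 3, 5]  # books-read data from the earlier example
freq = Counter(data)

n = sum(freq.values())  # total number of observations
mean = sum(value * count for value, count in freq.items()) / n

# The mode is the value (or values) with the highest frequency.
top = max(freq.values())
modes = sorted(value for value, count in freq.items() if count == top)

print(f"n = {n}, mean = {mean}, mode(s) = {modes}")
```

For this data the mean is 3.3, and both 2 and 3 qualify as modes because each occurs three times.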

🎓 Summary

  • A frequency distribution shows how often values occur in a dataset.

  • It's a common first step in statistical analysis, because it shows the shape of the data before any further calculation.

  • It helps in creating visuals and understanding data behavior.

📘 What is a Continuous Frequency Distribution?

A continuous frequency distribution is used when the data values are from a continuous variable, meaning they can take any value within a given range, not just specific, separate numbers.

🧠 In Simple Terms:

Instead of counting how many times a specific number occurs (like 10, 20, 30), we count how many values fall into a range, like:

  • 150–155 cm

  • 155–160 cm

  • 160–165 cm

These ranges are called class intervals.


🔢 Example:

Suppose we have the following heights of 20 students in cm:


[152, 155, 158, 160, 162, 165, 168, 169, 170, 172, 173, 174, 175, 176, 177, 179, 180, 182, 183, 185]

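We can create a continuous frequency distribution like this, placing each value in the class whose lower limit it reaches and whose upper limit it stays below (so 155 is counted in 155 – 160, and 185 needs a final class of its own):

Height (cm)    Frequency
150 – 155    1
155 – 160    2
160 – 165    2
165 – 170    3
170 – 175    4
175 – 180    4
180 – 185    3
185 – 190    1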

📘 What are Exclusive Classes?

In a continuous frequency distribution, we often use exclusive class intervals.

Exclusive Class Intervals:

  • The lower boundary is included, but the upper boundary is excluded.

  • Written as: [a, b) → include a, exclude b

Example: [150, 155) includes every value from 150 up to, but not including, 155 (for instance 150, 152.4 and 154.999), so 155 itself falls in the next class.

This avoids overlapping between class intervals and ensures each value belongs to only one class.
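
The rule above (include the lower boundary, exclude the upper one) can be sketched with the standard-library bisect module. The class limits in edges and the helper name class_of are assumptions made for illustration only.

```python
from bisect import bisect_right

edges = [150, 155, 160, 165, 170, 175, 180, 185, 190]  # assumed class limits

def class_of(value):
    """Return the (lower, upper) limits of the class [lower, upper) holding value."""
    if not edges[0] <= value < edges[-1]:
        raise ValueError(f"{value} is outside the range covered by the classes")
    i = bisect_right(edges, value) - 1  # index of the last edge <= value
    return edges[i], edges[i + 1]

print(class_of(154.999))  # falls in [150, 155)
print(class_of(155))      # falls in [155, 160), not [150, 155)
```

Because every value maps to exactly one index, no observation can land in two classes.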

🔍 Why Use Exclusive Classes?

  • To avoid ambiguity.

  • Especially useful in continuous data, like time or height.

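  • Many textbooks use exclusive classes for continuous data. Software defaults vary: numpy.histogram uses bins closed on the left (except the last), while pandas.cut closes bins on the right unless right=False is passed.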


📊 Comparison: Inclusive vs Exclusive Classes

Feature                   Inclusive                       Exclusive
Interval Example          150–155 (includes 150 & 155)    150–155 (includes 150, excludes 155)
Used in                   Discrete data (age, scores)     Continuous data (height, weight)
Overlapping Possible?     Yes                             No
Common in School Data?    Yes                             No
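
To see the overlap row in practice, the sketch below counts the height data from the earlier example under both conventions. It assumes adjacent classes share their limits (150–155, 155–160, and so on) and adds a last limit of 190 so that 185 has a class; both choices are illustrative.

```python
heights = [152, 155, 158, 160, 162, 165, 168, 169, 170, 172,
           173, 174, 175, 176, 177, 179, 180, 182, 183, 185]
edges = [150, 155, 160, 165, 170, 175, 180, 185, 190]  # assumed shared class limits
classes = list(zip(edges[:-1], edges[1:]))  # (150, 155), (155, 160), ...

# Exclusive: lower limit included, upper limit excluded.
exclusive = [sum(lo <= h < hi for h in heights) for lo, hi in classes]
# Inclusive with shared limits: both limits included, so boundary values repeat.
inclusive = [sum(lo <= h <= hi for h in heights) for lo, hi in classes]

print("exclusive:", exclusive, "total =", sum(exclusive))
print("inclusive:", inclusive, "total =", sum(inclusive))
```

The exclusive counts add up to the 20 observations, whereas the inclusive version totals 27 because each of the seven values sitting on a shared limit is counted twice. Inclusive classes avoid this in practice by using limits that do not touch, such as 150–154 and 155–159.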

🧪 Python Example for Frequency Table (Using Exclusive Classes)


import pandas as pd

# Sample height data
data = [152, 155, 158, 160, 162, 165, 168, 169, 170, 172, 173, 174, 175, 176, 177, 179, 180, 182, 183, 185]

# Define class intervals (bins); the last edge is 190 so that 185 falls in [185, 190)
bins = [150, 155, 160, 165, 170, 175, 180, 185, 190]

# Create frequency table; right=False includes the lower bound and excludes the upper
freq_table = pd.cut(data, bins=bins, right=False).value_counts().sort_index()

# Display result
print("Class Interval | Frequency")
for interval, count in freq_table.items():
    print(f"{interval} | {count}")

This will generate exclusive intervals like [150, 155), [155, 160), etc.


📝 Summary

Term                                 Meaning
Continuous Frequency Distribution    Groups continuous data into class intervals
Exclusive Classes                    Includes lower bound, excludes upper bound; avoids overlap
Why it's important                   Essential for accurate analysis of real-world data like heights, time


