Pandas Series.value_counts()
Introduction to pandas Series.value_counts()
The value_counts() method of a pandas Series counts how often each unique value occurs. Whether you're conducting exploratory data analysis or preparing data for machine learning, it gives a quick view of how your data is distributed.
What is Series.value_counts()?
The value_counts() method returns a new Series whose index holds the unique values of the original Series and whose values are their counts. By default, the result is sorted by count in descending order, so the most frequent values come first. This makes it particularly useful for understanding how categorical data is distributed.
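As a minimal sketch of what the returned Series looks like, the snippet below uses made-up sample data (the colour names are purely illustrative) and reads the result back by position and by label:

```python
import pandas as pd

# Hypothetical sample data, chosen so there are no ties between counts.
sr = pd.Series(["red", "blue", "red", "green", "red", "blue"])

counts = sr.value_counts()

# The unique values become the index; the counts become the values.
top_value = counts.index[0]   # most frequent value
top_count = counts.iloc[0]    # how many times it occurs
blue_count = counts["blue"]   # look up one value by its label

print(top_value, top_count, blue_count)  # red 3 2
```

Because the result is an ordinary Series, the usual tools such as head(), indexing and plotting apply to it.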
Syntax
Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
normalize: If True, returns the relative frequencies of the unique values.
sort: If True, sorts the values by frequency.
ascending: If True, sorts in ascending order.
bins: For numeric data, groups the values into half-open bins.
dropna: If True, excludes NA/null values.
Example 1: Counting Unique String Values
import pandas as pd
sr = pd.Series(['New York', 'Chicago', 'Toronto', 'Lisbon', 'Rio', 'Chicago', 'Lisbon'])
print(sr.value_counts())
Output (the order of values with equal counts may differ between pandas versions):
Chicago 2
Lisbon 2
New York 1
Rio 1
Toronto 1
Example 2: Counting Numeric Values with NaN
import pandas as pd
sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])
print(sr.value_counts())
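Output (the values print as floats because the None entries are stored as NaN, which makes the Series float64):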
325.0 3
100.0 2
214.0 1
88.0 1
Handling Missing Values
By default, value_counts() excludes NaN values. To include them, set dropna=False:
sr.value_counts(dropna=False)
This will count the occurrences of NaN values as well.
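A short sketch of the effect, reusing the Series from Example 2. The variable names counts and nan_count are illustrative only:

```python
import pandas as pd

# Same data as Example 2; None is stored as NaN in a float64 Series.
sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])

counts = sr.value_counts(dropna=False)
print(counts)

# NaN gets its own index label. Select it with isna(), since NaN != NaN.
nan_count = counts[counts.index.isna()].iloc[0]
print(nan_count)  # 2
```

With dropna=False the counts add up to the full length of the Series (9 here), whereas the default result only covers the 7 non-missing entries.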
Normalizing the Counts
To get the relative frequencies instead of absolute counts, set normalize=True:
sr.value_counts(normalize=True)
This will return the proportion of each unique value in the Series.
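The following sketch, again based on the Example 2 data, shows one way to turn the proportions into percentages. The names props and percent are illustrative:

```python
import pandas as pd

sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])

# Proportions are computed over the 7 non-missing values,
# because dropna=True is still in effect.
props = sr.value_counts(normalize=True)
print(props)

# Multiply by 100 and round to report percentages.
percent = (props * 100).round(1)
print(percent)
```

Here 325 accounts for 3 of the 7 non-missing values, so its proportion is 3/7. Passing dropna=False as well would make NaN count towards the total, giving 3/9 instead.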
Sorting the Counts
By default, value_counts() sorts the counts in descending order. To sort in ascending order, set ascending=True:
sr.value_counts(ascending=True)
This will display the least frequent values first.
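As a small illustration on the Example 2 data, the sketch below compares sorting by count with sorting by the values themselves. The second uses sort_index() on the result, which is a general Series method rather than a value_counts() option:

```python
import pandas as pd

sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])

# Least frequent values first; the most frequent value ends up last.
asc = sr.value_counts(ascending=True)
print(asc)

# Order the result by the values themselves instead of by their counts.
by_value = sr.value_counts().sort_index()
print(by_value)
```

Sorting by index is often the more readable choice when the values have a natural order, such as ratings or ages.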
Binning Numeric Data
For numeric data, you can group values into bins using the bins parameter:
sr.value_counts(bins=3)
This will categorize the data into 3 equal-width bins and count the occurrences in each bin. The index of the result then holds intervals rather than the original values.
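A brief sketch of binning on the Example 2 data. The exact lower edge of the first interval is chosen by pandas and sits slightly below the minimum value, so it is not hard-coded here:

```python
import pandas as pd

sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])

# Three equal-width intervals spanning the observed range (88 to 325).
binned = sr.value_counts(bins=3)
print(binned)

# Sort by interval so the bins read from low to high, like a histogram.
ordered = binned.sort_index()
print(ordered.tolist())  # [3, 1, 3]
```

The NaN entries fall into no bin and are left out, so the bin counts add up to the 7 non-missing values.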
Conclusion
The value_counts() method in pandas Series is an essential function for summarizing and understanding the distribution of data. Its versatility in handling different data types and its various parameters make it a valuable tool for data analysis tasks.