Pandas Series.value_counts()

Introduction to pandas Series.value_counts()

The value_counts() method of a pandas Series counts how often each unique value occurs. It is a common first step in exploratory data analysis, and it is also useful for checking how values are distributed before preparing data for machine learning.

What is Series.value_counts()?

The value_counts() method returns a new Series whose index holds the unique values of the original Series and whose values are their counts. By default, the result is sorted in descending order, so the most frequent values come first. This makes it particularly useful for inspecting the distribution of categorical data.

Syntax

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
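  • normalize: If True, returns relative frequencies (proportions) instead of absolute counts. Default is False.
  • sort: If True (the default), sorts the result by frequency. If False, the result is not ordered by frequency.
  • ascending: If True, sorts from least frequent to most frequent. The default, False, puts the most frequent values first.
  • bins: For numeric data only. Instead of counting each distinct value, groups the values into half-open bins and counts the values in each bin.
  • dropna: If True (the default), excludes NaN (missing) values from the counts.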
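
As a rough illustration of what the sort parameter changes, the sketch below counts the same data twice. The colors Series is made-up sample data and is not part of the examples in this article.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
colors = pd.Series(["red", "blue", "red", "green", "blue", "red"])

# Default behaviour: most frequent value first.
by_frequency = colors.value_counts()

# sort=False: same counts, but the rows are not ordered by frequency.
unsorted = colors.value_counts(sort=False)

print(by_frequency)
print(unsorted)
```

Both calls produce the same counts; only the row order differs. The exact row order with sort=False may depend on the pandas version, so it is safer not to rely on it.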

Example 1: Counting Unique String Values

import pandas as pd
sr = pd.Series(['New York', 'Chicago', 'Toronto', 'Lisbon', 'Rio', 'Chicago', 'Lisbon'])
print(sr.value_counts())

Output:
Chicago     2
Lisbon      2
New York    1
Rio         1
Toronto     1

The order of values that share the same count may differ between pandas versions.

Example 2: Counting Numeric Values with NaN

import pandas as pd
sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])
print(sr.value_counts())

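Output:
325.0    3
100.0    2
214.0    1
88.0     1

The integers are displayed as floats because None is stored as NaN, which turns the Series into a float64 Series. The two missing values do not appear in the result.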

Handling Missing Values

By default, value_counts() excludes NaN values. To include them, set dropna=False:

sr.value_counts(dropna=False)

This will count the occurrences of NaN values as well.
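
A minimal sketch of the difference, reusing the data from Example 2. The variable names are chosen here for illustration only.

```python
import pandas as pd

sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])

# Default: the two missing values are left out of the result.
without_missing = sr.value_counts()

# dropna=False: NaN gets its own row in the result.
with_missing = sr.value_counts(dropna=False)

# Pick out the NaN row by testing the index labels for missing values.
missing_count = with_missing[with_missing.index.isna()].iloc[0]

print(with_missing)
print("Missing values:", missing_count)
```

With dropna=False the counts add up to the full length of the Series, which makes it a quick way to see how much of a column is missing.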

Normalizing the Counts

To get the relative frequencies instead of absolute counts, set normalize=True:

sr.value_counts(normalize=True)

This will return the proportion of each unique value in the Series.
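
As a short sketch, the snippet below applies normalize=True to the city data from Example 1 and converts the proportions to percentages. The rounding to one decimal place is an arbitrary choice made for readability.

```python
import pandas as pd

cities = pd.Series(['New York', 'Chicago', 'Toronto', 'Lisbon', 'Rio', 'Chicago', 'Lisbon'])

# Proportion of rows taken up by each unique value (sums to 1).
shares = cities.value_counts(normalize=True)

# The same information expressed as percentages.
percentages = (shares * 100).round(1)

print(shares)
print(percentages)
```

Here 'Chicago' appears in 2 of 7 rows, so its share is 2/7, or roughly 28.6 percent.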

Sorting the Counts

By default, value_counts() sorts the counts in descending order. To sort in ascending order, set ascending=True:

sr.value_counts(ascending=True)

This will display the least frequent values first.
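
For instance, with the numeric data from Example 2 the rarest values move to the top. This is a small sketch, and the variable name least_common is only illustrative.

```python
import pandas as pd

sr = pd.Series([100, 214, 325, 88, None, 325, None, 325, 100])

# Least frequent values first; missing values are still excluded by default.
least_common = sr.value_counts(ascending=True)

print(least_common)

# The last row is now the most frequent value.
print("Most frequent:", least_common.index[-1])
```

Sorting in ascending order is handy when the rare values are the interesting ones, such as typos in a categorical column.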

Binning Numeric Data

For numeric data, you can group values into bins using the bins parameter:

sr.value_counts(bins=3)

This splits the range of the data into 3 equal-width bins and counts the values that fall into each one. The result is indexed by interval rather than by individual value.
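
The sketch below shows binning on a small set of made-up ages. The data is hypothetical, and sort_index() is used here only to list the bins from lowest to highest instead of by frequency.

```python
import pandas as pd

# Hypothetical ages, for illustration only.
ages = pd.Series([22, 25, 31, 37, 45, 47, 52, 60, 61, 70])

# Split the range from 22 to 70 into 3 equal-width bins and count each bin.
binned = ages.value_counts(bins=3)

# Order the result by bin position rather than by count.
binned_in_order = binned.sort_index()

print(binned_in_order)
```

Each index label is an interval, and every age falls into exactly one of them, so the bin counts add up to the number of values.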

Conclusion

The value_counts() method is a quick way to summarize how the values in a Series are distributed. It works with strings, numbers, and missing values, and its parameters let you switch between counts and proportions, change the sort order, and bin numeric data.


