Python count unique values in column

Harsh Pandey

Software Developer

Published on Wed Mar 27 2024

Introduction

Counting unique values in a column is one of the most common steps in data analysis: it tells you how many distinct categories a field holds and, with a little more work, how often each one occurs. In this article, we'll walk through several ways to count unique values in a column using pandas in Python. We'll cover Series.unique(), Series.nunique(), value_counts(), and drop_duplicates().

We'll also look at how to count unique combinations across multiple columns and how to count unique values in each row.

Python count unique values in column using `Series.unique()`

The Series.unique() method returns an array of the unique values in a specific column of our DataFrame, in order of first appearance. Let's assume we have a DataFrame named df with a column called "Courses" that contains various course names. To get the count of unique courses, we call unique() and then read the size attribute of the result, which gives the number of elements in that array. Here's how to do it:

# Example DataFrame
import pandas as pd

data = {'Courses': ['Math', 'Science', 'History', 'Math', 'Geography']}
df = pd.DataFrame(data)

# Get Unique Count using Series.unique()
count = df['Courses'].unique().size
print("Number of unique courses:", count)

In this example, the output will be Number of unique courses: 4, as we have four distinct courses in the "Courses" column.
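As a small sketch of what unique() hands back, the snippet below rebuilds the same example DataFrame and inspects the result. The variable name unique_courses is only illustrative.

```python
import pandas as pd

# Same example data as above
df = pd.DataFrame({'Courses': ['Math', 'Science', 'History', 'Math', 'Geography']})

unique_courses = df['Courses'].unique()

# Values come back in order of first appearance, not sorted
print(list(unique_courses))   # ['Math', 'Science', 'History', 'Geography']

# len() gives the same count as .size for this one-dimensional result
print(len(unique_courses))    # 4
```

Because the result keeps first-appearance order, wrap it in sorted() if you need the values alphabetically.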

Python count unique values in column using `Series.nunique()`

The Series.nunique() method is another handy way to count unique values in a column. Instead of calling the unique() method and then calculating the size, we can directly use nunique() to get the count of unique elements in the column. Here's how to do it:

# Example DataFrame (continuation from the previous example)
count = df['Courses'].nunique()
print("Number of unique courses:", count)

The output will be the same as before, Number of unique courses: 4. The nunique() method makes it more convenient to get the count of unique values in a column.
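One difference worth knowing: the two approaches can disagree when the column contains missing values. By default nunique() skips NaN, while unique() keeps it as a value. The sketch below uses a made-up Series with one missing entry to show this; the data is illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value
courses = pd.Series(['Math', 'Science', np.nan, 'Math'], name='Courses')

via_unique = courses.unique().size               # NaN counts as a value
via_nunique = courses.nunique()                  # dropna=True by default, NaN skipped
via_nunique_all = courses.nunique(dropna=False)  # include NaN in the count

print(via_unique)        # 3
print(via_nunique)       # 2
print(via_nunique_all)   # 3
```

If missing values should count as their own category, pass dropna=False; otherwise the default is usually what you want.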

Python count unique values in column and frequency of each value

Sometimes, we may need to know how many times each unique value appears in a column. The value_counts() method comes in handy for this task. Let's take our previous DataFrame and find the frequency of each course:

# Example DataFrame (continuation from the previous example)
frequency = df['Courses'].value_counts()
print("Frequency of each course:\n", frequency)

Output:

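Frequency of each course:
Math         2
Science      1
History      1
Geography    1
Name: Courses, dtype: int64

In pandas 2.0 and later, the same call prints an extra Courses header line above the values and labels the result Name: count instead of Name: Courses.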
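The result of value_counts() has one entry per unique value, so its length is also the unique count. As a brief sketch, the snippet below shows that and uses the normalize=True option to get relative frequencies instead of raw counts. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ['Math', 'Science', 'History', 'Math', 'Geography']})

counts = df['Courses'].value_counts()

# One entry per unique value, so the length is the unique count
n_unique = len(counts)
print(n_unique)          # 4

# normalize=True returns each value's share of the rows instead of a count
shares = df['Courses'].value_counts(normalize=True)
print(shares['Math'])    # 0.4
```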

Using `drop_duplicates()` to remove duplicate rows from dataframe

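The drop_duplicates() method removes duplicates and returns a new object, leaving the original unchanged. Called on a DataFrame, it drops duplicate rows; called on a single column (a Series), as in the example below, it drops repeated values and keeps the first occurrence of each. We can then count the remaining elements using the size attribute. Let's see how to do it: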

# Example DataFrame (continuation from the previous example)
count = df['Courses'].drop_duplicates().size
print("Number of unique courses:", count)

The output will be Number of unique courses: 4, the same as before, because the repeated "Math" value was dropped before counting.
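The keep parameter controls which occurrence survives, and it matters for counting. A quick sketch, using the same example data, shows the difference. Note that keep=False discards every value that is repeated, so it counts values that appear exactly once rather than all unique values.

```python
import pandas as pd

df = pd.DataFrame({'Courses': ['Math', 'Science', 'History', 'Math', 'Geography']})

# Default keep='first': the first 'Math' (index 0) survives
first_kept = df['Courses'].drop_duplicates()
print(list(first_kept.index))    # [0, 1, 2, 4]

# keep='last': the second 'Math' (index 3) survives instead
last_kept = df['Courses'].drop_duplicates(keep='last')
print(list(last_kept.index))     # [1, 2, 3, 4]

# keep=False: every repeated value is removed entirely
appear_once = df['Courses'].drop_duplicates(keep=False)
print(appear_once.size)          # 3
```

The result keeps the original index labels, which is handy when you need to trace a unique value back to its row.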

Python count unique values from multiple columns

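Now, let's explore how to count unique values when considering multiple columns. In this example, we will use two columns: "Courses" and "Fee". Our example DataFrame so far has only a "Courses" column, so we first build a copy with an illustrative "Fee" column. We then select both columns, drop duplicate rows, and count the rows that remain:

# Example DataFrame (continuation from the previous example)
# The 'Fee' values are illustrative; assign() returns a copy, so df is unchanged
df_fees = df.assign(Fee=[100, 150, 100, 100, 130])
df_multi = df_fees[['Courses', 'Fee']].drop_duplicates()
count = df_multi.shape[0]
print("Number of unique rows with 'Courses' and 'Fee':", count)

The output will be Number of unique rows with 'Courses' and 'Fee': 4, because the two "Math" rows share the same fee and collapse into one when we drop the duplicates. A row counts as a duplicate only when it matches another row in both columns, which is why "History" and "Math" stay separate even though their fees are equal.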
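Counting unique combinations is different from counting unique values per column, and the two are easy to mix up. The sketch below shows both side by side. The "Fee" values are made up for illustration, and duplicated() is used here as an alternative to drop_duplicates() for the combination count.

```python
import pandas as pd

# Illustrative data: the 'Fee' values are assumptions, not from a real dataset
df = pd.DataFrame({
    'Courses': ['Math', 'Science', 'History', 'Math', 'Geography'],
    'Fee': [100, 150, 100, 100, 130],
})

# Unique values counted separately for each column
per_column = df[['Courses', 'Fee']].nunique()
print(per_column['Courses'])   # 4
print(per_column['Fee'])       # 3

# Unique (Courses, Fee) combinations: rows not flagged as repeats of an earlier row
unique_combos = int((~df.duplicated(subset=['Courses', 'Fee'])).sum())
print(unique_combos)           # 4
```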

Python count unique values in each row

In some cases, we might be interested in counting the number of unique values in each row. To achieve this, we can use the nunique() method along with axis=1. Let's see how it works:

# Example DataFrame (continuation from the previous example)
row_unique_counts = df.nunique(axis=1)
print("Number of unique values in each row:\n", row_unique_counts)

Output:

Number of unique values in each row:
0    1
1    1
2    1
3    1
4    1
dtype: int64

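Since our DataFrame has only one column, each row holds exactly one value, so the result is 1 for every row. Row-wise counts become informative once the DataFrame has several columns.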
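To see row-wise counting do something more useful, here is a minimal sketch with a hypothetical three-column DataFrame of course choices. The column names and data are invented for illustration.

```python
import pandas as pd

# Hypothetical data: each row lists one student's three course choices
choices = pd.DataFrame({
    'first_choice':  ['Math', 'Science', 'History'],
    'second_choice': ['Math', 'History', 'History'],
    'third_choice':  ['Science', 'Geography', 'History'],
})

# axis=1 counts distinct values across the columns of each row
row_unique_counts = choices.nunique(axis=1)
print(list(row_unique_counts))   # [2, 3, 1]
```

Here the first student picked two distinct courses, the second picked three, and the third picked the same course every time.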

Conclusion

In this article, we have explored different methods to count unique values in a column using pandas in Python. We covered Series.unique(), Series.nunique(), value_counts(), and drop_duplicates(), and we looked at how to count unique combinations across multiple columns and unique values in each row. For a plain count of distinct values, nunique() is the most direct choice; reach for value_counts() when you also need the frequency of each value.
