
Python count unique values in column

Harsh Pandey

Software Developer

Published on Wed Mar 27 2024

Introduction

Counting unique values in a column is a common data analysis task: it tells you how many distinct categories a column holds and how often each one occurs. This article walks through several ways to do it in Python with the pandas library, covering Series.unique(), Series.nunique(), value_counts(), and drop_duplicates().

We'll also look at counting unique combinations across multiple columns and counting unique values in each row.

Python count unique values in column using Series.unique()

The Series.unique() method returns an array of the unique values in a column (a NumPy array for most column types), in order of first appearance. Let's assume we have a DataFrame named df with a column called "Courses" that contains various course names. To get the count of unique courses, we call unique() and read the size attribute of the resulting array. Here's how to do it:

# Example DataFrame
import pandas as pd

data = {'Courses': ['Math', 'Science', 'History', 'Math', 'Geography']}
df = pd.DataFrame(data)

# Get Unique Count using Series.unique()
count = df['Courses'].unique().size
print("Number of unique courses:", count)

In this example, the output will be Number of unique courses: 4, as we have four distinct courses in the "Courses" column.
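One detail worth noting is that the result of unique() is an array rather than a Python list, and that a missing value is kept as one of the unique entries. The following is a minimal sketch under stated assumptions: the column with a missing entry is hypothetical, and the dtype is set to object explicitly so that the result is a NumPy array regardless of pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical data: course names with one missing entry.
# dtype=object is set explicitly (an assumption for this sketch).
courses = pd.Series(
    ['Math', 'Science', None, 'Math', 'Geography'],
    name='Courses',
    dtype=object,
)

unique_values = courses.unique()

# For an object column, unique() returns a NumPy array
print(type(unique_values).__name__)

# The missing value is kept as one of the unique entries
print("Unique entries including the missing value:", unique_values.size)
```

If missing values should not be counted, calling dropna() on the column before unique() is one way to exclude them.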

Python count unique values in column using Series.nunique()

The Series.nunique() method is another handy way to count unique values in a column. Instead of calling the unique() method and then calculating the size, we can directly use nunique() to get the count of unique elements in the column. Here's how to do it:

# Example DataFrame (continuation from the previous example)
count = df['Courses'].nunique()
print("Number of unique courses:", count)

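The output will be the same as before, Number of unique courses: 4. Note that nunique() ignores missing values (NaN) by default, whereas unique() keeps them as a distinct entry, so the two approaches can differ by one when a column has missing data. Pass dropna=False to nunique() to count missing values as well.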
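The effect of the dropna parameter of nunique() can be sketched on a small column with a missing entry. The data below is hypothetical and only serves to illustrate the difference.

```python
import pandas as pd

# Hypothetical column containing one missing value
courses = pd.Series(['Math', 'Science', None, 'Math'], name='Courses')

# Default behaviour: missing values are ignored
count_default = courses.nunique()

# dropna=False: the missing value is counted as one distinct entry
count_with_na = courses.nunique(dropna=False)

print("Ignoring missing values:", count_default)
print("Counting missing values:", count_with_na)
```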

Python count unique values in column and frequency of each value

Sometimes, we may need to know how many times each unique value appears in a column. The value_counts() method comes in handy for this task. Let's take our previous DataFrame and find the frequency of each course:

# Example DataFrame (continuation from the previous example)
frequency = df['Courses'].value_counts()
print("Frequency of each course:\n", frequency)

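Output:

Frequency of each course:
Math         2
Science      1
History      1
Geography    1
Name: Courses, dtype: int64

The values are sorted by frequency, most common first. In pandas 2.0 and later, the index name (Courses) is printed as a header line and the last line reads Name: count, dtype: int64.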
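Since value_counts() returns one entry per distinct value, its length is another way to get the unique count, and normalize=True turns the counts into proportions. A minimal sketch, rebuilding the same example DataFrame so that it runs on its own:

```python
import pandas as pd

df = pd.DataFrame({'Courses': ['Math', 'Science', 'History', 'Math', 'Geography']})

# Absolute frequency of each course
frequency = df['Courses'].value_counts()

# Relative frequency: each count divided by the number of rows
share = df['Courses'].value_counts(normalize=True)

print("Number of unique courses:", len(frequency))
print("Share of rows that are Math:", share['Math'])
```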

Using drop_duplicates() to remove duplicate rows from dataframe

The drop_duplicates() method removes duplicates and returns a new object without them. Called on a single column, as in the example here, it returns a new Series in which each value appears only once; called on a DataFrame, it removes duplicate rows. We can then count the unique elements using the size attribute. Let's see how to do it:

# Example DataFrame (continuation from the previous example)
count = df['Courses'].drop_duplicates().size
print("Number of unique courses:", count)

The output will be Number of unique courses: 4, the same as before, since the repeated "Math" entry was removed before counting.
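On a whole DataFrame, drop_duplicates() works row by row, and the subset parameter limits the comparison to chosen columns. The sketch below assumes a hypothetical "Student" column, added only to show that the first row for each course is the one kept.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Math', 'Science', 'History', 'Math', 'Geography'],
    'Student': ['Ann', 'Ben', 'Cara', 'Dev', 'Eli'],  # hypothetical column
})

# Compare rows on "Courses" only; the first row for each course is kept
deduped = df.drop_duplicates(subset='Courses')

print("Number of unique courses:", len(deduped))
print("Students kept:", deduped['Student'].tolist())
```

Here the second "Math" row (student Dev) is dropped because its course already appeared in an earlier row.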

Python count unique values from multiple columns

Now, let's count unique values across multiple columns. In this example, we use two columns: "Courses" and "Fee". Since the DataFrame from the previous examples has no "Fee" column, we first add one with illustrative values, then select both columns, drop duplicate rows, and count the rows that remain:

# Example DataFrame (continuation from the previous example)
# Add a "Fee" column; the values are illustrative
df_fee = df.assign(Fee=[100, 200, 150, 100, 200])
df_multi = df_fee[['Courses', 'Fee']].drop_duplicates()
count = df_multi.shape[0]
print("Number of unique rows with 'Courses' and 'Fee':", count)

The output will be Number of unique rows with 'Courses' and 'Fee': 4, as the two "Math" rows share the same fee and collapse into one row when duplicates are dropped.
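Two related questions often come up with multiple columns: how many unique values each column has on its own, and how many distinct combinations occur. The following sketch shows one possible approach to each, using DataFrame.nunique() and the ngroups attribute of a groupby. The fee values are illustrative assumptions.

```python
import pandas as pd

df_fee = pd.DataFrame({
    'Courses': ['Math', 'Science', 'History', 'Math', 'Geography'],
    'Fee': [100, 200, 150, 100, 200],  # illustrative values
})

# Unique count for each column separately (returns a Series)
per_column = df_fee.nunique()

# Number of distinct (Courses, Fee) combinations
combos = df_fee.groupby(['Courses', 'Fee']).ngroups

print("Unique courses:", per_column['Courses'])
print("Unique fees:", per_column['Fee'])
print("Unique combinations:", combos)
```

Note that groupby() skips rows with missing keys by default, so the two approaches can disagree with drop_duplicates() when the columns contain NaN.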

Python count unique values in each row

In some cases, we might be interested in counting the number of unique values in each row. To achieve this, we can use the nunique() method along with axis=1. Let's see how it works:

# Example DataFrame (continuation from the previous example)
row_unique_counts = df.nunique(axis=1)
print("Number of unique values in each row:\n", row_unique_counts)

Output:

Number of unique values in each row:
0    1
1    1
2    1
3    1
4    1
dtype: int64

Since df has a single column, each row contains exactly one value, so the result is 1 for every row. The counts become informative once the DataFrame has several columns.
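To see row-wise counts that differ from 1, the same call can be tried on a DataFrame with several columns. The column names and values below are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical DataFrame: three answers per respondent
answers = pd.DataFrame({
    'Q1': ['A', 'B', 'C'],
    'Q2': ['A', 'C', 'C'],
    'Q3': ['A', 'D', 'E'],
})

# axis=1 counts distinct values across the columns of each row
row_unique_counts = answers.nunique(axis=1)

print(row_unique_counts.tolist())
```

The first row holds the same answer three times, so its count is 1; the second row has three different answers; the third has two.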

Conclusion

In this article, we have explored different methods to count unique values in a column using Python and pandas: Series.unique(), Series.nunique(), value_counts(), and drop_duplicates(). We have also seen how to count unique combinations across multiple columns and unique values in each row. Which method to choose depends on what you need: nunique() for a plain count, value_counts() for frequencies, and drop_duplicates() when you want the deduplicated data itself.
