How to Solve Python AttributeError: ‘DataFrame’ object has no attribute ‘unique’


A DataFrame is a two-dimensional, mutable tabular data structure like an Excel spreadsheet. If you want to find the unique values in a DataFrame using the method unique(), you must call the method on a Series object. If you try to call unique() on a DataFrame object, you will raise the AttributeError: ‘DataFrame’ object has no attribute ‘unique’.

You can also pass the Series to the top-level pandas.unique() function, which the pandas documentation describes as significantly faster than numpy.unique for long enough sequences.

This tutorial will go through how to solve this error with code examples.


AttributeError: ‘DataFrame’ object has no attribute ‘unique’

AttributeError occurs in a Python program when we try to access an attribute (method or property) that does not exist for a particular object. The part ‘DataFrame’ object has no attribute ‘unique’ tells us that the DataFrame object we are handling does not have the unique attribute. unique() is available as a Series method and as a top-level pandas function, but not as a DataFrame method. The next sections describe the syntax of both.

Series.unique()

The syntax of Series.unique() is as follows:

Series.unique()

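The method does not take any parameters and returns the unique values of a Series object as a NumPy array (or as an ExtensionArray for extension dtypes such as Categorical). The method uses a hash table to find the unique values and does not sort them, so the values are returned in order of first appearance.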
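As a minimal sketch of this behavior (the Series `s` below is made up for illustration and is not part of the tutorial's data), the values come back in order of first appearance rather than sorted:

```python
import numpy as np
import pandas as pd

# Hypothetical Series, used only for illustration
s = pd.Series([3, 1, 3, 2, 1])

# unique() takes no arguments and keeps the order of first appearance
uniques = s.unique()

print(uniques)        # [3 1 2]
print(type(uniques))  # <class 'numpy.ndarray'>
```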

pandas.unique()

The syntax of pandas.unique() is as follows:

pandas.unique(values)

Parameters:

  • values: Required. 1D array-like

Returns:

  • numpy.ndarray or ExtensionArray. The return can be:
    • Index: when the input is an index
    • Categorical: when the input is Categorical dtype
    • ndarray: when the input is a Series/ndarray
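The return types listed above can be checked with a short sketch. The inputs are hypothetical and chosen only for illustration. The Index case uses a timezone-aware datetime index, and other index types may behave differently depending on the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs, chosen only to illustrate the return types
from_series = pd.unique(pd.Series([2, 1, 3, 3]))
from_index = pd.unique(
    pd.Index([
        pd.Timestamp("2016-01-01", tz="US/Eastern"),
        pd.Timestamp("2016-01-01", tz="US/Eastern"),
    ])
)
from_categorical = pd.unique(pd.Categorical(["b", "a", "a", "c"]))

# Check each result against the type named in the list above
print(isinstance(from_series, np.ndarray))           # True
print(isinstance(from_index, pd.Index))              # True
print(isinstance(from_categorical, pd.Categorical))  # True
```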

Example

Let’s look at an example where we have a DataFrame containing the players of a game and their scores.

import pandas as pd

df = pd.DataFrame({'player_name':['Jim', 'Bob', 'Chris', 'Gerri', 'Lorraine', 'Azrael', 'Luke'], 'score':[9, 9, 4, 3, 1, 4, 6]})

print(df)
  player_name  score
0         Jim      9
1         Bob      9
2       Chris      4
3       Gerri      3
4    Lorraine      1
5      Azrael      4
6        Luke      6

We will try to get the unique scores by calling the unique() method on the DataFrame object.

# Attempt to get unique values of DataFrame

df = df.unique()

print(df)

Let’s run the code to see what happens:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-3e11b3d46b01> in <module>
----> 1 df = df.unique()
      2 print(df)

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5581         ):
   5582             return self[name]
-> 5583         return object.__getattribute__(self, name)
   5584 
   5585     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'unique'

The error occurs because the unique() method is not a DataFrame attribute.
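One way to confirm this without triggering a full traceback is to inspect the classes with hasattr() and to catch the exception. This is a sketch under the assumption of a small made-up DataFrame named `frame`, not the tutorial's `df`:

```python
import pandas as pd

# The attribute exists on Series but not on DataFrame
print(hasattr(pd.Series, "unique"))     # True
print(hasattr(pd.DataFrame, "unique"))  # False

# Hypothetical DataFrame, used only for illustration
frame = pd.DataFrame({"score": [9, 9, 4]})

message = None
try:
    frame.unique()
except AttributeError as err:
    # Keep the error text so it can be inspected or logged
    message = str(err)

print(message)  # 'DataFrame' object has no attribute 'unique'
```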

Solution #1: Use Series.unique()

To solve this error, we can call unique() on the score column, which is a Series.

# Check type of column

print(type(df['score']))

# Call unique() on Series

unique_scores = df['score'].unique()

# Print result

print(unique_scores)

Let’s run the code to see the result:

<class 'pandas.core.series.Series'>
[9 4 3 1 6]

There are five unique scores in the DataFrame.
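If only the number of distinct values is needed, or the unique values of every column, a sketch along these lines may help. It rebuilds the example DataFrame so that it runs on its own. Series.nunique() and the dictionary comprehension are additions for illustration and are not part of the original walkthrough.

```python
import pandas as pd

df = pd.DataFrame({
    'player_name': ['Jim', 'Bob', 'Chris', 'Gerri', 'Lorraine', 'Azrael', 'Luke'],
    'score': [9, 9, 4, 3, 1, 4, 6],
})

# Count the distinct scores directly
n_unique_scores = df['score'].nunique()
print(n_unique_scores)  # 5

# Collect the unique values of each column by calling unique() per Series
unique_by_column = {col: df[col].unique() for col in df.columns}
print(unique_by_column['score'])  # [9 4 3 1 6]
```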

Solution #2: Use pandas.unique()

We can also pass the Series object to the top-level pandas.unique() function to get the unique scores.

# Pass Series to built-in unique() method

unique_scores = pd.unique(df['score'])

print(unique_scores)

Let’s run the code to get the result:

[9 4 3 1 6]
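For comparison with numpy.unique, the following sketch uses the same scores in a standalone Series (the name `scores` is introduced here for illustration). It shows a difference in ordering only and makes no claim about speed.

```python
import numpy as np
import pandas as pd

scores = pd.Series([9, 9, 4, 3, 1, 4, 6], name='score')

# pandas.unique() keeps the order of first appearance
pandas_result = pd.unique(scores)
print(pandas_result)  # [9 4 3 1 6]

# numpy.unique() returns the unique values sorted
numpy_result = np.unique(scores.to_numpy())
print(numpy_result)   # [1 3 4 6 9]
```

If sorted output is needed from the pandas result, np.sort(pandas_result) is one option.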

Solution #3: Use groupby()

We can use the DataFrame groupby() method to group the rows by the values in the ‘score’ column and then call count() to get the number of rows in each group. Each group corresponds to one unique score. Let’s look at the code:

# Group by score and count the rows in each group

unique_score_count = df.groupby('score').score.count()

print(unique_score_count)

print(type(unique_score_count))

Let’s run the code to get the result:

score
1    1
3    1
4    2
6    1
9    2
Name: score, dtype: int64
<class 'pandas.core.series.Series'>

The groupby() call followed by count() returns a Series whose index holds the unique scores and whose values are the number of times each score occurs. The Series has five entries, so there are five unique scores.
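To pull the unique values themselves out of the grouped result, one option is to read its index. The sketch below rebuilds the example DataFrame so that it runs on its own. The comparison with value_counts() is an addition for illustration, not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame({
    'player_name': ['Jim', 'Bob', 'Chris', 'Gerri', 'Lorraine', 'Azrael', 'Luke'],
    'score': [9, 9, 4, 3, 1, 4, 6],
})

unique_score_count = df.groupby('score').score.count()

# The index of the grouped result holds the unique scores in sorted order
unique_scores = unique_score_count.index.tolist()
print(unique_scores)  # [1, 3, 4, 6, 9]

# value_counts() gives the same counts; sort_index() aligns the ordering
counts = df['score'].value_counts().sort_index()
print(counts.tolist())  # [1, 1, 2, 1, 2]
```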

Summary

Congratulations on reading to the end of this tutorial! The AttributeError: ‘DataFrame’ object has no attribute ‘unique’ occurs when you try to use the unique() method on a DataFrame instead of a Series. unique() is available as a Series method and as a top-level pandas function, so you can either call unique() on the Series or pass the Series to pandas.unique(). You can also use groupby() with count() to get the number of observations for each value in a column, which indirectly gives you the unique values in that column.

To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.

Have fun and happy researching!


Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.
