A DataFrame is a two-dimensional, mutable tabular data structure, similar to an Excel spreadsheet. If you want to find the unique values in a DataFrame column using the unique() method, you must call the method on a Series object. If you call unique() on a DataFrame object, you will raise the AttributeError: ‘DataFrame’ object has no attribute ‘unique’.
You can also pass the Series to the pandas.unique() function, which is significantly faster than numpy.unique for long enough sequences.
This tutorial will go through how to solve this error with code examples.
AttributeError: ‘DataFrame’ object has no attribute ‘unique’
AttributeError occurs in a Python program when we try to access an attribute (method or property) that does not exist for a particular object. The part ‘DataFrame’ object has no attribute ‘unique’ tells us that the DataFrame object we are handling does not have the unique attribute. unique() is available as a Series method and as a top-level pandas function. The next sections describe the syntax of both.
Series.unique()
The syntax of Series.unique() is as follows:
Series.unique()
The method does not take any parameters and returns the unique values of a Series object as a NumPy array (or an ExtensionArray for extension dtypes). The method uses a hash table to find the unique values, so it returns them in order of appearance and does not sort them.
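To illustrate the ordering behaviour described above, here is a minimal sketch. The Series values are hypothetical and are not part of the tutorial's example data.

```python
import pandas as pd

# Hypothetical example data, chosen so that sorted order and
# order of first appearance differ
s = pd.Series([3, 1, 3, 2, 1])

unique_values = s.unique()

# Values come back in order of first appearance, not sorted
print(unique_values)
# For this integer Series the result is a NumPy array
print(type(unique_values))
```

Because the values are not sorted, 3 comes first in the result: it is the first value encountered in the Series.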
pandas.unique()
The syntax of pandas.unique() is as follows:
pandas.unique(values)
Parameters:
- values: Required. 1D array-like
Returns:
- numpy.ndarray or ExtensionArray. The return can be:
  - Index: when the input is an Index
  - Categorical: when the input is Categorical dtype
  - ndarray: when the input is a Series/ndarray
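The following sketch shows two of the return types listed above, using small hypothetical inputs that are not part of the tutorial's data. It covers only an integer Series and a Categorical; the Index case is left out for brevity.

```python
import pandas as pd

# Hypothetical inputs chosen to illustrate the return types
from_series = pd.unique(pd.Series([2, 1, 2, 3]))
from_categorical = pd.unique(pd.Categorical(["a", "b", "a"]))

# An integer Series gives back a NumPy array
print(type(from_series))
print(from_series)

# Categorical input gives back a Categorical
print(type(from_categorical))
print(list(from_categorical))
```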
Example
Let’s look at an example where we have a DataFrame containing the players of a game and their scores.
import pandas as pd

# Create a DataFrame of players and their scores
df = pd.DataFrame({'player_name':['Jim', 'Bob', 'Chris', 'Gerri', 'Lorraine', 'Azrael', 'Luke'],
                   'score':[9, 9, 4, 3, 1, 4, 6]})
print(df)
  player_name  score
0         Jim      9
1         Bob      9
2       Chris      4
3       Gerri      3
4    Lorraine      1
5      Azrael      4
6        Luke      6
We will try to get the unique scores by calling the unique() method on the DataFrame object.
# Attempt to get unique values of DataFrame
df = df.unique()
print(df)
Let’s run the code to see what happens:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-3e11b3d46b01> in <module>
----> 1 df = df.unique()
      2 print(df)

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5581         ):
   5582             return self[name]
-> 5583         return object.__getattribute__(self, name)
   5584 
   5585     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'unique'
The error occurs because the unique() method is not a DataFrame attribute.
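One way to confirm this without triggering the exception is to check for the attribute with Python's built-in hasattr(). This is only a sketch: it rebuilds the players DataFrame so that it runs on its own, and it assumes the DataFrame has no column that is itself named unique.

```python
import pandas as pd

df = pd.DataFrame({'player_name': ['Jim', 'Bob', 'Chris', 'Gerri', 'Lorraine', 'Azrael', 'Luke'],
                   'score': [9, 9, 4, 3, 1, 4, 6]})

# The DataFrame does not expose unique()
df_has_unique = hasattr(df, 'unique')
# A single column is a Series, which does expose unique()
series_has_unique = hasattr(df['score'], 'unique')

print(df_has_unique)      # False
print(series_has_unique)  # True
```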
Solution #1: Use Series.unique()
To solve this error, we can call Series.unique() on the score column.
# Check type of column
print(type(df['score']))
# Call unique() on Series
unique_scores = df['score'].unique()
# Print result
print(unique_scores)
Let’s run the code to see the result:
<class 'pandas.core.series.Series'>
[9 4 3 1 6]
There are five unique scores in the DataFrame.
Solution #2: Use pandas.unique()
We can also pass the Series object to the pandas.unique() function to get the unique scores.
# Pass Series to the pandas unique() function
unique_scores = pd.unique(df['score'])
print(unique_scores)
Let’s run the code to get the result:
[9 4 3 1 6]
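For comparison, the sketch below runs the same scores through both pandas.unique() and numpy.unique. It illustrates only the difference in ordering, not the difference in speed, and the variable names are chosen for illustration.

```python
import numpy as np
import pandas as pd

# The scores from the example DataFrame
scores = pd.Series([9, 9, 4, 3, 1, 4, 6])

# pandas.unique keeps the order of first appearance
pandas_result = pd.unique(scores)
# numpy.unique returns the unique values sorted
numpy_result = np.unique(scores.to_numpy())

print(pandas_result)  # [9 4 3 1 6]
print(numpy_result)   # [1 3 4 6 9]
```

Both calls find the same five values. Which one to use depends on whether you need the original order or a sorted result.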
Solution #3: Use groupby()
We can use the DataFrame groupby() method to group the rows by the values in the ‘score’ column. We can then get the number of rows for each unique score using the count() method. Let’s look at the code:
# Group the DataFrame by score and count the rows in each group
unique_score_count = df.groupby('score').score.count()
print(unique_score_count)
print(type(unique_score_count))
Let’s run the code to get the result:
score
1    1
3    1
4    2
6    1
9    2
Name: score, dtype: int64
<class 'pandas.core.series.Series'>
The groupby() call followed by count() returns a Series object whose index holds the unique scores and whose values are the counts of each score in the score column of the DataFrame. We can see there are five unique scores.
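If only the unique values are needed, they can be read from the index of the grouped result. The sketch below rebuilds the players DataFrame so that it runs on its own; the variable name unique_from_groupby is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'player_name': ['Jim', 'Bob', 'Chris', 'Gerri', 'Lorraine', 'Azrael', 'Luke'],
                   'score': [9, 9, 4, 3, 1, 4, 6]})

# Count the rows for each score, as in the tutorial
unique_score_count = df.groupby('score').score.count()

# The index of the result holds the unique scores, sorted by groupby
unique_from_groupby = unique_score_count.index.to_numpy()

print(unique_from_groupby)      # [1 3 4 6 9]
print(len(unique_score_count))  # 5
```

Unlike unique(), groupby() sorts the group keys by default, so the scores come back in ascending order.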
Summary
Congratulations on reading to the end of this tutorial! The AttributeError: ‘DataFrame’ object has no attribute ‘unique’ occurs when you try to use the unique() method on a DataFrame instead of a Series. unique() is both a Series method and a top-level pandas function, so you can either call unique() on the Series or pass the Series to pandas.unique(). You can also use groupby() to get the number of observations for each value in a column, which indirectly gives you the unique values in that column.
For further reading on errors involving Pandas, go to the articles:
- How to Solve Python AttributeError: ‘str’ object has no attribute ‘contains’
- How to Solve Python AttributeError: ‘Series’ object has no attribute ‘lower’
- How to Solve Python AttributeError: ‘DataFrame’ object has no attribute ‘as_matrix’
- How to Solve Pandas AttributeError: ‘DataFrame’ object has no attribute ‘str’
To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.