A DataFrame is a two-dimensional, mutable, tabular data structure similar to an Excel spreadsheet. To use a string method on a DataFrame, for example str.contains() to check whether it contains a specific string, you have to use the string accessor attribute str on a column of the DataFrame, because DataFrame does not have str as an attribute. If you try to call a string accessor method through .str on the DataFrame itself, you will raise the AttributeError: ‘DataFrame’ object has no attribute ‘str’.
To solve this error, you need to use a Series object with the .str attribute. You can get a Series from a DataFrame by specifying the column name, for example, df['column'], or by using pandas.Series, for example, pd.Series(df.values.flatten()).
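As a minimal sketch of the two ways to obtain a Series, the snippet below builds a small DataFrame inline. The column names and values are hypothetical and are not taken from the dataset used later in this tutorial.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"name": ["apple", "banana"], "color": ["red", "yellow"]})

# Option 1: select a single column, which returns a Series
name_series = df["name"]
print(type(name_series).__name__)

# Option 2: flatten every cell of the DataFrame into one Series
all_values = pd.Series(df.values.flatten())

# Both objects are Series, so the .str accessor is available on each
print(name_series.str.upper().tolist())
print(all_values.str.contains("an").tolist())
```

Selecting a column keeps the original row index, whereas flattening produces a new Series with one entry per cell, so the second option is mainly useful when you only need to know whether a string occurs anywhere in the data.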
This tutorial will go through the error in detail and how to solve it with code examples.
AttributeError: ‘DataFrame’ object has no attribute ‘str’
AttributeError occurs in a Python program when we try to access an attribute (method or property) that does not exist for a particular object. The part ‘DataFrame’ object has no attribute ‘str’ tells us that the DataFrame object we are handling does not have the str attribute. The .str accessor provides vectorized string functions for Series and Index. It is not defined on DataFrame, which means we can only access string functions like str.replace() or str.split() when working with a Series (or Index) object.
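The following sketch checks which objects expose the .str accessor and then calls str.replace() and str.split() on a Series. The DataFrame and its code column are hypothetical, and regex=False is passed explicitly so the hyphen is treated as a literal character.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"code": ["a-1", "b-2"]})

# The accessor exists on Series and Index, but not on DataFrame
print(hasattr(df, "str"))          # False
print(hasattr(df["code"], "str"))  # True
print(hasattr(df.columns, "str"))  # True

# Vectorized string functions applied to a Series
replaced = df["code"].str.replace("-", "_", regex=False)
parts = df["code"].str.split("-")

# The same accessor works on the column labels, which are an Index
upper_cols = df.columns.str.upper()

print(replaced.tolist())
print(parts.tolist())
print(upper_cols.tolist())
```

Note that hasattr(df, "str") would return True if the DataFrame happened to have a column literally named str, because pandas also exposes columns as attributes.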
Example
Let’s look at an example where we want to select the row of a DataFrame that contains a particular product. We will use a dataset containing the names of fruits and their quantities in a supermarket, stored in a CSV file called fruits.csv. Let’s look at the data:
fruit_type,qty
orange,300
strawberry,500
melon,200
Next, we will import pandas and load the data into a DataFrame using read_csv. Then, we will attempt to use .loc to access the rows that contain the string “melon” and print the result to the console. Let’s look at the code:
import pandas as pd

df = pd.read_csv('fruits.csv')

melon_amount = df.loc[df.str.contains("melon")]

print(melon_amount)
Let’s run the code to see what happens:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-ee9be87facac> in <module>
      3 df = pd.read_csv('fruits.csv')
      4 
----> 5 melon_amount = df.loc[df.str.contains("melon")]
      6 
      7 print(melon_amount)

~/opt/anaconda3/lib/python3.8/site-packages/pandas/core/generic.py in __getattr__(self, name)
   5581         ):
   5582             return self[name]
-> 5583         return object.__getattribute__(self, name)
   5584 
   5585     def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'str'
The Python interpreter throws an AttributeError because we attempt to access the .str attribute of the DataFrame object df. The .str accessor is available on Series (and Index) objects, not on DataFrame.
Solution
To solve this error, we need to select a Series so that we can access the str attribute. We can extract the fruit_type column from the DataFrame by passing the column name to the indexing operator []. The resulting column is a Series, on which we can call str.contains("melon"). Let’s look at the revised code:
import pandas as pd

df = pd.read_csv('fruits.csv')

melon_amount = df.loc[df['fruit_type'].str.contains("melon")]

print(melon_amount)
Let’s run the code to get the result:
  fruit_type  qty
2      melon  200
We successfully obtained the row containing the string melon.
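If you do not know in advance which column holds the string, one possible approach is to apply str.contains() to each column in turn and combine the results. The sketch below is an illustration rather than part of the original solution: it rebuilds the fruits.csv data inline so that it runs without the file, and it assumes that converting every column to strings with astype(str) is acceptable for your data.

```python
import pandas as pd

# Same values as fruits.csv, built inline so the example needs no file
df = pd.DataFrame({
    "fruit_type": ["orange", "strawberry", "melon"],
    "qty": [300, 500, 200],
})

# Each column is a Series, so .str is available inside the lambda.
# astype(str) lets the numeric qty column be searched as text too.
matches = df.apply(lambda col: col.astype(str).str.contains("melon"))

# Keep the rows where at least one column matched
mask = matches.any(axis=1)
melon_rows = df.loc[mask]

print(melon_rows)
```

Here DataFrame.apply passes one column at a time to the lambda, which is why the string accessor works even though the starting object is a DataFrame.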
Summary
Congratulations on reading to the end of this tutorial! The AttributeError: ‘DataFrame’ object has no attribute ‘str’ occurs when you try to use the string accessor attribute .str to call string functions on a DataFrame. Because .str is a Series attribute, we need to use a DataFrame column instead of the entire DataFrame, for example: df['column_name'].str.contains(...).
For further reading on errors involving Pandas, go to the articles:
- How to Solve Python AttributeError: ‘str’ object has no attribute ‘contains’
- How to Solve Python AttributeError: ‘Series’ object has no attribute ‘lower’
- How to Solve Python AttributeError: ‘DataFrame’ object has no attribute ‘concat’
- How to Solve Python AttributeError: ‘DataFrame’ object has no attribute ‘unique’
To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.