If you try to call contains() on a string, for example string.contains(value), Python raises AttributeError: ‘str’ object has no attribute ‘contains’. The contains() method is not a string method. It belongs to pandas, where it is available through the str accessor of a Series, so you can call str.contains on a Series object.
To check if a substring exists in a string, you can use the in operator, for example, if value in string: ...
AttributeError: ‘str’ object has no attribute ‘contains’
AttributeError occurs in a Python program when we try to access an attribute (method or property) that does not exist for a particular object. The part “‘str’ object has no attribute ‘contains’” tells us that the string object we are handling does not have the contains attribute. The contains() method is provided by pandas through the str accessor of a Series or Index, and returns a boolean Series or Index based on whether a given pattern or regex exists within each string of the Series or Index.
pandas.Series.str.contains
The syntax of str.contains is as follows:
Series.str.contains(pat, case, flags, na, regex)
Parameters
- pat: Required. Character sequence or regular expression to search for.
- case: Optional. If True, the search is case sensitive. Default: True.
- flags: Optional. Flags to pass through to the re module, e.g. re.IGNORECASE. Default: 0 (no flags).
- na: Optional. Fill value for missing values. The default depends on the dtype of the array. For object-dtype, numpy.nan is used. For StringDtype, pandas.NA is used.
- regex: Optional. If True, assume the pattern is a regular expression. If False, treat the pattern as a literal string. Default: True.
Returns
A Series or Index of boolean values indicating whether the given pattern exists within the string of each element of the provided Series or Index.
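Let’s look at a short sketch of how these parameters interact. The pizza names below are hypothetical sample data, not the file used later in this tutorial, and the exact way a missing entry is reported when na is not set depends on the dtype and the pandas version.

```python
import pandas as pd

# Hypothetical sample data, including one missing value
pizzas = pd.Series(["Margherita", "pepperoni", "four cheeses", None])

# Literal, case-sensitive search (regex=False treats the pattern as plain text).
# How the missing entry is reported here depends on dtype and pandas version.
print(pizzas.str.contains("pepperoni", regex=False))

# Case-insensitive regular expression; na=False counts missing values as "no match"
mask = pizzas.str.contains("^m|cheese", case=False, na=False)
print(mask.tolist())
```

Setting na=False is useful when the result is used as a filter, because a boolean mask cannot contain missing values.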
Python str.__contains__
We can check if a substring is present in a string using the built-in __contains__() method. Note that this is a different method from Series.str.contains(). The syntax of the method is as follows:
value = string.__contains__(substring)
Parameters
- substring: Required. The string to check for membership.
Returns
A boolean value: True if the substring exists in the string, or False if it does not.
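As a minimal sketch of the behaviour described above, the following calls __contains__() directly on a string. The variable names are chosen for illustration only.

```python
string = "four cheeses"

present = string.__contains__("cheese")   # substring exists
absent = string.__contains__("Cheese")    # case differs, so no match
print(present, absent)

# A non-string argument is rejected with a TypeError
try:
    string.__contains__(4)
except TypeError as err:
    type_error = str(err)
    print(type_error)
```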
Python in operator
The in operator invokes the __contains__() method of an object. We can define the __contains__() method on a custom class to control how in behaves for its instances. Let’s look at an example:
class MyClass:
    def __init__(self, name):
        self.name = name

    # Define __contains__ so that the in operator works on instances
    def __contains__(self, substr):
        return substr in self.name

obj = MyClass("python")
print('python' in obj)
print('Python' in obj)
Let’s run the code to see what happens:
True
False
Note that the __contains__ method is case sensitive. Typically, as Python developers, we do not call the underlying __contains__() method directly; instead we use the in operator. We can use the in operator with an if statement to run a block of code depending on whether a substring exists in a string.
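Because the check is case sensitive, one possible way to ignore case is to normalise both strings before comparing. The sketch below uses str.casefold() for this; the strings are hypothetical examples.

```python
pizza = "Pepperoni Deluxe"
substring = "pepperoni"

if substring in pizza:
    result = "exact-case match"
elif substring.casefold() in pizza.casefold():
    # casefold() normalises case on both sides before the membership test
    result = "match ignoring case"
else:
    result = "no match"

print(result)
```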
Example
Let’s look at an example where we want to check if a column in a DataFrame contains a particular substring. First we will look at our data which is a list of pizza names and prices in a .csv file.
pizza,price
margherita,£7.99
pepperoni,£8.99
four cheeses,£10.99
funghi,£8.99
We will call the file pizzas.csv. Next we will load the data into our program using pandas. Let’s look at the code:
import pandas as pd

pizza_data = pd.read_csv('pizzas.csv')
Then we will iterate over the rows of the DataFrame and check if the pizza name contains “pepperoni”, and if it does we print the price of the pizza.
for idx, row in pizza_data.iterrows():
    if(row['pizza'].contains('pepperoni')):
        print(row['price'])
Let’s run the code to see what happens:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-3c42fe7ca23b> in <module>
      1 for idx, row in pizza_data.iterrows():
----> 2     if(row['pizza'].contains('pepperoni')):
      3         print(row['price'])
      4 

AttributeError: 'str' object has no attribute 'contains'
The error occurs because row is a Series object and row['pizza'] is a string object. The contains() method is not an attribute of the built-in string class. We can verify the type of row and row['pizza'] as follows:
for idx, row in pizza_data.iterrows():
    print(type(row))
    print(type(row['pizza']))
<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>
Solution
To solve this error we need to use the in operator to check for membership in the string. Let’s look at the revised code:
for idx, row in pizza_data.iterrows():
    if 'pepperoni' in row['pizza']:
        print(row['price'])
Let’s run the code to see the result:
£8.99
Alternatively, we can call the str.contains method on each row in the DataFrame. As shown above, each row returned by pizza_data.iterrows() is a Series object.
for idx, row in pizza_data.iterrows():
    if any(row.str.contains('pepperoni')):
        print(row['price'])
£8.99
Note that in this implementation we have to pass the return value of contains() to the built-in any() function, because a Series with more than one element cannot be evaluated directly in a Boolean context.
The Boolean evaluation of this Series object is ambiguous because the Series object has more than one element. Python could treat it as True if all elements are True or if any of the elements is True, so pandas raises an error rather than guess. As the pizza name can only exist in the pizza column, we will use any().
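To illustrate the ambiguity, here is a small sketch that builds a single row by hand (a stand-in for one row from iterrows(), not the tutorial's actual DataFrame) and compares any() with all().

```python
import pandas as pd

# Hand-built stand-in for one row of the pizza data
row = pd.Series({"pizza": "pepperoni", "price": "£8.99"})
matches = row.str.contains("pepperoni")  # one boolean per element of the row

# Using the Series directly in an if statement raises ValueError
try:
    if matches:
        pass
except ValueError as err:
    error_text = str(err)
    print(error_text)

# any() is True if at least one element matches; all() requires every element to match
print(any(matches), all(matches))
```

Only the pizza element matches here, so any() and all() give different answers, which is why the choice has to be made explicitly.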
For more information on using any(), go to the article: How to Solve Python ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
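The loops above check one row at a time. As a sketch of another option, str.contains can be called once on the whole pizza column and the resulting boolean Series used to filter the DataFrame. The CSV content is inlined through io.StringIO here purely so that the example is self-contained; this is an assumption for illustration, and reading pizzas.csv as before would work the same way.

```python
import io
import pandas as pd

# Same data as pizzas.csv, inlined so the example runs without the file
csv_text = (
    "pizza,price\n"
    "margherita,£7.99\n"
    "pepperoni,£8.99\n"
    "four cheeses,£10.99\n"
    "funghi,£8.99\n"
)
pizza_data = pd.read_csv(io.StringIO(csv_text))

# str.contains is called on the column (a Series), not on a single string
mask = pizza_data["pizza"].str.contains("pepperoni", regex=False)

# Use the boolean mask to select the matching prices
prices = pizza_data.loc[mask, "price"].tolist()
print(prices)
```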
Summary
Congratulations on reading to the end of this tutorial! The AttributeError ‘str’ object has no attribute ‘contains’ occurs when you try to call the contains() method on a string object as if it were a Series object. To solve this error, you can use the in operator to check for membership in a string, or call the str.contains() method on the Series instead of on a string value in the Series. It is helpful to print the type of the object before calling the contains method.
For further reading on pandas Series, go to the articles:
- How to Solve Python AttributeError: ‘Series’ object has no attribute ‘split’
- How to Solve Python AttributeError: ‘Series’ object has no attribute ‘reshape’
- How to Solve Python AttributeError: ‘str’ object has no attribute ‘str’
To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.