If you try to call contains() on a string, for example string.contains(value), Python raises the AttributeError: ‘str’ object has no attribute ‘contains’.

The contains() method is available through the str accessor of a pandas Series. You can call str.contains() on a Series object, but not on a plain Python string.

To check if a substring exists in a string, you can use the in operator, for example, if value in string: ...
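As a quick illustration, the sketch below puts the failing call next to the in operator. The names string and value are only placeholders taken from the sentences above.

```python
string = "research scientist"
value = "search"

# str has no contains() method, so this call raises AttributeError
try:
    string.contains(value)
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(error_message)

# The in operator performs the substring check instead
found = value in string
print(found)
```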


AttributeError: ‘str’ object has no attribute ‘contains’

An AttributeError occurs in a Python program when we try to access an attribute (method or property) that does not exist for a particular object. The part “‘str’ object has no attribute ‘contains’” tells us that the string object we are handling does not have a contains attribute. The contains() method is available through the str accessor of a pandas Series (Series.str.contains) and returns a boolean Series or Index based on whether a given pattern or regex exists within each string of a Series or Index.

pandas.Series.str.contains

The syntax of str.contains is as follows:

Series.str.contains(pat, case, flags, na, regex)

Parameters

  • pat: Required. Character sequence or regular expression to search for.
  • case: Optional. If True, the search is case sensitive. Default: True.
  • flags: Optional. Flags to pass through to the re module, e.g. re.IGNORECASE. Default: 0 (no flags).
  • na: Optional. Fill value for missing values. The default depends on the dtype of the array. For object-dtype, numpy.nan is used. For StringDtype, pandas.NA is used.
  • regex: Optional. If True, assume the pattern is a regular expression. If False, treat the pattern as a literal string. Default: True.

Returns

A series or index of boolean values indicating whether the given pattern exists within the string of each element of the provided Series or Index.
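The parameters above can be seen in action with a small Series. This is a minimal sketch using made-up pizza names, not data from the article; it assumes pandas is installed.

```python
import pandas as pd

# Illustrative data, including a missing value
pizzas = pd.Series(["margherita", "Pepperoni", "four cheeses", None])

# Case-sensitive literal match; na=False maps the missing value to False
exact = pizzas.str.contains("pepperoni", regex=False, na=False)

# Case-insensitive literal match
any_case = pizzas.str.contains("pepperoni", case=False, regex=False, na=False)

# Regular expression: names that start with "m" or "f"
starts_m_or_f = pizzas.str.contains(r"^[mf]", regex=True, na=False)

print(exact.tolist())          # [False, False, False, False]
print(any_case.tolist())       # [False, True, False, False]
print(starts_m_or_f.tolist())  # [True, False, True, False]
```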

Python str.__contains__

We can check if a substring is present in a string using the string's own __contains__() method. Note that this is a different method from Series.str.contains(). The syntax of the method is as follows:

value = string.__contains__(substring)

Parameters

substring: Required. The string pattern to check for membership.

Returns

A boolean value of True if the substring exists in the string or False if the substring does not exist in the string.
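For instance, calling the method directly gives the same answer as the in operator. The string below is just an example value.

```python
string = "four cheeses"

# Direct call to the special method
direct_call = string.__contains__("cheese")

# Equivalent check with the in operator
with_operator = "cheese" in string

# A substring that is not present
missing = string.__contains__("ham")

print(direct_call, with_operator, missing)  # True True False
```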

Python in operator

The in operator invokes the __contains__() method of an object. We can overload the __contains__() method of a custom class. Let’s look at an example:

class MyClass:

    def __init__(self, name):
        self.name = name

    # Overload __contains__ so that the in operator checks the name attribute
    def __contains__(self, substr):
        return substr in self.name

obj = MyClass("python")

print('python' in obj)
print('Python' in obj)

Let’s run the code to see what happens:

True
False

Note that the __contains__ method is case sensitive. As Python developers, we typically do not call the underlying __contains__() method directly; instead, we use the in operator. We can use the in operator with an if statement to run a code block only when a substring exists in a string.
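The following sketch shows the in operator inside an if statement, along with one common way to make the check case insensitive by lowercasing both strings first. The variable names and values are illustrative only.

```python
pizza_name = "Pepperoni Deluxe"
search_term = "pepperoni"

# Case-sensitive check: "pepperoni" does not match "Pepperoni"
if search_term in pizza_name:
    case_sensitive_result = "found"
else:
    case_sensitive_result = "not found"

# Case-insensitive check: normalize both strings before comparing
if search_term.lower() in pizza_name.lower():
    case_insensitive_result = "found"
else:
    case_insensitive_result = "not found"

print(case_sensitive_result)    # not found
print(case_insensitive_result)  # found
```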

Example

Let’s look at an example where we want to check if a column in a DataFrame contains a particular substring. First we will look at our data which is a list of pizza names and prices in a .csv file.

pizza,price
margherita,£7.99
pepperoni,£8.99
four cheeses,£10.99
funghi,£8.99

We will call the file pizzas.csv. Next we will load the data into our program using pandas. Let’s look at the code:

import pandas as pd

pizza_data = pd.read_csv('pizzas.csv')

Then we will iterate over the rows of the DataFrame and check if the pizza name contains “pepperoni”. If it does, we print the price of the pizza.

for idx, row in pizza_data.iterrows():
   if(row['pizza'].contains('pepperoni')):
       print(row['price'])

Let’s run the code to see what happens:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-3c42fe7ca23b> in <module>
      1 for idx, row in pizza_data.iterrows():
----> 2    if(row['pizza'].contains('pepperoni')):
      3        print(row['price'])
      4 

AttributeError: 'str' object has no attribute 'contains'

The error occurs because row is a Series object and row['pizza'] is a string object. The contains() method is not an attribute of the built-in string class. We can verify the type of row and row['pizza'] as follows.

for idx, row in pizza_data.iterrows():
    print(type(row))
    print(type(row['pizza']))

<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>

Solution

To solve this error we need to use the in operator to check for membership in the string. Let’s look at the revised code:

for idx, row in pizza_data.iterrows():
    if 'pepperoni' in row['pizza']:
        print(row['price'])

Let’s run the code to see the result:

£8.99

Alternatively, we can call the str.contains() method on each row in the DataFrame. As shown above, each row returned by pizza_data.iterrows() is a Series object.

for idx, row in pizza_data.iterrows():
    if any(row.str.contains('pepperoni')):
        print(row['price'])

£8.99

Note that in this implementation we have to pass the return value from contains() to the built-in any() function, because contains() returns a boolean Series with one value per column rather than a single True or False.

The Boolean evaluation of this Series object is ambiguous because the Series has more than one element. The result could be True if all elements in the Series are True or if any of the elements are True, so pandas raises a ValueError instead of guessing. As the pizza name can only exist in the pizza column, we use any().
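As a further illustration, the row loop can be skipped by calling str.contains() on the whole pizza column and using the result as a boolean mask. This is a sketch rather than part of the original solution; it rebuilds the pizzas.csv data from an in-memory string (csv_text is a stand-in for the file) so that it runs on its own.

```python
import io

import pandas as pd

# Stand-in for pizzas.csv so the example is self-contained
csv_text = """pizza,price
margherita,£7.99
pepperoni,£8.99
four cheeses,£10.99
funghi,£8.99
"""
pizza_data = pd.read_csv(io.StringIO(csv_text))

# One boolean per row: does the pizza name contain the literal substring?
mask = pizza_data["pizza"].str.contains("pepperoni", regex=False)

# Using the multi-element mask directly in an if statement is ambiguous
try:
    if mask:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

# Select the prices of the matching rows with the mask
matching_prices = pizza_data.loc[mask, "price"].tolist()

print(ambiguous)        # True
print(mask.any())       # True
print(matching_prices)  # ['£8.99']
```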

For more information on using any() go to the article: How to Solve Python ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

Summary

Congratulations on reading to the end of this tutorial! The AttributeError ‘str’ object has no attribute ‘contains’ occurs when you try to call the contains() method on a string object as if it were a Series object. To solve this error, you can use the in operator to check for membership in a string, or call the str.contains() method on the Series instead of on a string value in the Series. It is helpful to print the type of the object before calling the contains() method.
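The type check mentioned above can also be written into the code itself. The helper below, contains_substring, is a hypothetical function made up for this sketch, not part of pandas or the standard library. It picks the in operator for a string and str.contains() for a Series.

```python
import pandas as pd


def contains_substring(obj, substring):
    """Hypothetical helper: substring check for a str or a pandas Series."""
    if isinstance(obj, str):
        # Plain string: use the in operator
        return substring in obj
    if isinstance(obj, pd.Series):
        # Series: use the vectorized str accessor with a literal match
        return obj.str.contains(substring, regex=False, na=False)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


single = contains_substring("pepperoni", "pep")
column = contains_substring(pd.Series(["margherita", "pepperoni"]), "pep")

print(single)           # True
print(column.tolist())  # [False, True]
```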


Have fun and happy researching!