In Python, a pandas Series is a one-dimensional labelled array capable of holding data of any type. A Series is comparable to a column in an Excel spreadsheet, and the Series class exposes a collection of vectorized string functions under the str accessor.

If you try to use one of these string functions, such as str.replace or str.split, on a string object instead of a Series object, Python will raise the AttributeError: ‘str’ object has no attribute ‘str’.

To use a Python string method on a string, you do not need str. before the method call. For example, string.str.split(",") should be string.split(",").
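As a quick illustration, the following sketch calls split directly on a plain string and then shows that putting .str in between raises the error. The variable names and the sample value are made up for this example:

```python
# A plain Python string (hypothetical sample value)
text = "red,green,blue"

# Correct: call the string method directly on the string
parts = text.split(",")
print(parts)

# Incorrect: a plain string has no .str attribute, so this raises AttributeError
try:
    text.str.split(",")
except AttributeError as err:
    message = str(err)
    print(message)
```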

This tutorial will go through the error and how to solve it with code examples.


AttributeError: ‘str’ object has no attribute ‘str’

AttributeError occurs in a Python program when we try to access an attribute (method or property) that does not exist for a particular object. The part “‘str’ object has no attribute ‘str’” tells us that the string object we are handling does not have the attribute str. The str attribute belongs to the pandas.Series class and provides vectorized string functions for Series and Index objects based on Python’s built-in string methods.
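One way to see the difference is to ask each object whether it has a str attribute using the built-in hasattr function. The snippet below is a minimal sketch with made-up sample values; it assumes a Series that holds strings:

```python
import pandas as pd

# Hypothetical sample values for illustration
plain_string = "hello"
series = pd.Series(["hello", "world"])

# A plain string has no str attribute; a Series of strings does
string_has_accessor = hasattr(plain_string, "str")
series_has_accessor = hasattr(series, "str")
print(string_has_accessor)
print(series_has_accessor)

# The accessor applies the string method to every element of the Series
upper = series.str.upper()
print(upper.tolist())
```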

Example

Let’s look at an example where we want to clean some data in a DataFrame. In the following code, we will define our DataFrame, which will contain employee IDs in the first column and the annual salary for each ID in the second column. We will then access each row in the DataFrame using iterrows. For each row, we will attempt to use the str.replace() function to clean the salaries of dollar signs ($) and commas (,). Lastly, we will attempt to convert the cleaned values to integers using astype(int). Let’s look at the code:

import pandas as pd

df = pd.DataFrame({'EmployeeID': ['12', '13', '15', '21'],
'Salary':['$36,000','$20,000', '$70,000', '$100,000' ]})

for idx, row in df.iterrows():

    row['Salary'] = row['Salary'].str.replace('$','').str.replace(',','').astype(int)

print(df)

Let’s run the code to see what happens:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-52-29cc66f4824b> in <module>
      5 
      6 for idx, row in df.iterrows():
----> 7     row['Salary'] = row['Salary'].str.replace('$','').str.replace(',','').astype(int)
      8 
      9 print(df)

AttributeError: 'str' object has no attribute 'str'

We get the AttributeError because row['Salary'] is a string, not a Series object. We can verify this using the built-in type() function to check the type of the object:

import pandas as pd

df = pd.DataFrame({'EmployeeID': ['12', '13', '15', '21'],
'Salary':['$36,000','$20,000', '$70,000', '$100,000' ]})

for idx, row in df.iterrows():

    print(type(row['Salary']))

Let’s run the code to see the result:

<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>

We cannot access the Pandas string functions under str with a string object.

Solution #1: Use replace without str

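To solve this error, we can use the Python string replace() method by removing the str. accessor. Python strings also do not have an astype() method, so we will convert each cleaned value to an integer by passing it to the int() function. Note that the pandas documentation warns against modifying the rows yielded by iterrows: depending on the data types and the pandas version, a row may be a copy, and writing to it is not guaranteed to change the DataFrame. To be safe, we will collect the cleaned values in a list and assign the list back to the Salary column after the loop. Let’s look at the revised code:

import pandas as pd

df = pd.DataFrame({'EmployeeID': ['12', '13', '15', '21'],
'Salary':['$36,000','$20,000', '$70,000', '$100,000' ]})

cleaned = []

for idx, row in df.iterrows():
    # row['Salary'] is a plain string, so use str methods directly and convert with int()
    cleaned.append(int(row['Salary'].replace('$','').replace(',','')))

# Assign to the DataFrame itself; writing to row may only change a copy
df['Salary'] = cleaned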

print(df)

Let’s run the code to see the result.

  EmployeeID  Salary
0         12   36000
1         13   20000
2         15   70000
3         21  100000
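If you would rather keep the plain Python string methods but avoid the explicit loop, one option is to apply a small function to each value with Series.map. The sketch below assumes the same sample data; clean_salary is a hypothetical helper name introduced for this example:

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeID": ["12", "13", "15", "21"],
    "Salary": ["$36,000", "$20,000", "$70,000", "$100,000"],
})


def clean_salary(value: str) -> int:
    """Strip the dollar sign and commas from one string, then convert to int."""
    return int(value.replace("$", "").replace(",", ""))


# map calls clean_salary once per element, passing each plain string in turn
df["Salary"] = df["Salary"].map(clean_salary)
print(df)
```

Because the function receives one string at a time, it uses replace without the str. prefix, just like the loop version.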

Solution #2: Use str.replace on pandas.Series object

Using str.replace provides a concise way to edit the entire column of a DataFrame without iterating over the rows. When we access a column of a DataFrame by specifying the column name, we get a Series object. The column we want is df['Salary']. We can call str.replace on the object to remove the unwanted characters and call astype(int) to convert each value in the column to an integer. We pass regex=False so that the dollar sign is treated as a literal character rather than as the regular-expression end-of-string anchor. Let’s look at the revised code:

import pandas as pd

df = pd.DataFrame({'EmployeeID': ['12', '13', '15', '21'], 'Salary':['$36,000','$20,000', '$70,000', '$100,000' ]})

print(type(df['Salary']))

df['Salary'] = df['Salary'].str.replace('$','',regex=False).str.replace(',','',regex=False).astype(int)

print(df)

Let’s run the code to see the final result:

<class 'pandas.core.series.Series'>

  EmployeeID  Salary
0         12   36000
1         13   20000
2         15   70000
3         21  100000
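As a variation on this solution, both characters can be removed in a single call by passing a regular expression, and pd.to_numeric can handle the conversion. This is a sketch of one possible alternative, not a replacement for the code above; it assumes every value in the column is a clean number once the symbols are stripped:

```python
import pandas as pd

df = pd.DataFrame({
    "EmployeeID": ["12", "13", "15", "21"],
    "Salary": ["$36,000", "$20,000", "$70,000", "$100,000"],
})

# The character class [$,] matches either a dollar sign or a comma;
# inside square brackets the dollar sign is a literal character
cleaned = df["Salary"].str.replace(r"[$,]", "", regex=True)

# Convert the cleaned strings to a numeric dtype
df["Salary"] = pd.to_numeric(cleaned)
print(df)
```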

Summary

Congratulations on reading to the end of this tutorial! The AttributeError ‘str’ object has no attribute ‘str’ occurs when you try to access the str attribute of a string object as if it were a Series object.

To solve this error, you can use Python string methods without the str. accessor. Alternatively, you can use the pandas.Series.str functions on a Series object, which is likely to be a column in a DataFrame. It is helpful to print the object type before trying to access the pandas string functions.

Have fun and happy researching!