How to Solve Python NameError: name ‘pd’ is not defined

This error typically occurs when you use the name pd to access the pandas library without defining the alias pd when importing the module. You can solve this error by using the as keyword to alias the pandas module, for example:

import pandas as pd

This tutorial will go through how to solve this error with code examples.


NameError: name ‘pd’ is not defined

Python raises a NameError when it cannot find a name used in our program. In other words, the name is not defined in the local, enclosing, global, or built-in scope. A name can refer to a built-in function, a module, or something we define in our programs, like a variable or a function.

The error typically arises when:

  • We misspell a name
  • We do not define a variable or function
  • We do not import a module
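The three causes above can be reproduced in a few lines. This is a minimal sketch: the names totl, price, and math are chosen purely for illustration, and each NameError is caught so the script runs to the end and prints the messages.

```python
messages = []

# 1. Misspelled name: "total" is defined, but "totl" is not.
total = 10
try:
    print(totl)
except NameError as err:
    messages.append(str(err))

# 2. Variable used without ever being defined.
try:
    print(price * 2)
except NameError as err:
    messages.append(str(err))

# 3. Module used without importing it first.
try:
    print(math.sqrt(16))
except NameError as err:
    messages.append(str(err))

# Each message names the identifier Python could not find.
for message in messages:
    print(message)
```

In all three cases the message identifies the missing name, which is usually the quickest clue to which of the causes applies.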

In this tutorial, the error NameError: name ‘pd’ is not defined comes from importing the pandas module without the alias pd, or with a different alias. Let’s look at an example.

Example

Let’s look at an example of creating a DataFrame using the pandas library. First, we must have pandas installed. You can go to the following article to learn how to install pandas for your operating system: How to Solve Python ModuleNotFoundError: no module named ‘pandas’.

Once we have pandas installed, we can create a DataFrame as follows:

import pandas

df = pd.DataFrame(
    {
        "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
        "price":[8.99, 9.99, 10.99, 7.99, 11.99]
    }
)

print(df)

Let’s run the code to see what happens:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [1], in <cell line: 3>()
      1 import pandas
----> 3 df = pd.DataFrame(
      4     {
      5         "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
      6         "price":[8.99, 9.99, 10.99, 7.99, 11.99]
      7     }
      8 )
     10 print(df)

NameError: name 'pd' is not defined

The error occurs because we imported pandas but did not alias the module as pd. The statement import pandas binds only the name pandas, so the name pd is not defined and we cannot use it to access the DataFrame class.
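The same binding rule holds for every module, not only pandas. The sketch below uses the standard-library statistics module as a stand-in for pandas and the alias st as a stand-in for pd (both are assumptions made so the snippet runs without pandas installed). It shows that a plain import defines only the module name, and that the alias exists only after an import with as.

```python
import statistics  # stand-in for "import pandas": binds only the name "statistics"

try:
    st.mean([8, 10])  # "st" plays the role of "pd" and has not been defined yet
    alias_worked = True
except NameError:
    alias_worked = False

print(alias_worked)

import statistics as st  # stand-in for "import pandas as pd": now the alias exists

# Both names refer to the same module object.
print(st is statistics)
print(st.mean([8, 10]))
```

The imports are placed mid-script only to show the before and after; in real code they belong at the top of the file.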

Solution #1: Use the as keyword

The easiest way to solve this error is to use the as keyword to create the alias pd. Let’s look at the updated code:

import pandas as pd

df = pd.DataFrame(
    {
        "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
        "price":[8.99, 9.99, 10.99, 7.99, 11.99]
    }
)

print(df)

Let’s run the code to get the DataFrame:

         pizza  price
0   margherita   8.99
1    pepperoni   9.99
2     hawaiian  10.99
3     marinara   7.99
4  four cheese  11.99

Solution #2: Do not use aliasing

We can also solve this error by removing the alias and using the full name of the module. Let’s look at the revised code:

import pandas 

df = pandas.DataFrame(
    {
        "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
        "price":[8.99, 9.99, 10.99, 7.99, 11.99]
    }
)

print(df)

Let’s run the code to get the DataFrame:

         pizza  price
0   margherita   8.99
1    pepperoni   9.99
2     hawaiian  10.99
3     marinara   7.99
4  four cheese  11.99

Solution #3: Use the from keyword

We can also use the from keyword to import a specific variable, class, or function from a module. In this case, we want to import the DataFrame class from the pandas module. Using the from keyword means we do not have to specify the module in the rest of the program; we only need to call the DataFrame constructor. Let’s look at the revised code:

from pandas import DataFrame 

df = DataFrame(
    {
        "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
        "price":[8.99, 9.99, 10.99, 7.99, 11.99]
    }
)

print(df)

Let’s run the code to get the DataFrame:

         pizza  price
0   margherita   8.99
1    pepperoni   9.99
2     hawaiian  10.99
3     marinara   7.99
4  four cheese  11.99
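One consequence of this import style is that only the name DataFrame is bound. Neither pd nor pandas is defined afterwards, so mixing styles brings the NameError back. The following sketch assumes a fresh session in which pandas has not been imported in any other way; the two-row DataFrame is just illustrative data.

```python
from pandas import DataFrame

df = DataFrame({"pizza": ["margherita", "pepperoni"], "price": [8.99, 9.99]})

undefined_names = []

# The from-import did not bind the alias "pd" ...
try:
    pd.concat([df, df])
except NameError:
    undefined_names.append("pd")

# ... and it did not bind the module name "pandas" either.
try:
    pandas.concat([df, df])
except NameError:
    undefined_names.append("pandas")

print(undefined_names)
```

Any other pandas function or class used in the program has to be added to the from statement or reached through a separate import.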

Using the from keyword can help make programs more concise and readable. To import more than one class or function from the pandas module, separate the names with commas. For example:

from pandas import DataFrame, concat

df = DataFrame(
    {
        "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
        "price":[8.99, 9.99, 10.99, 7.99, 11.99]
    }
)

df2 = DataFrame(
    {
        "pizza": ['parmigiana', 'tartufo', 'funghi'],
        "price":[11.99, 12.99, 9.99]
    }
)

result = concat([df, df2], axis=0)

print(result)

However, the most common convention is to import pandas under the alias pd and access its classes and functions through the pd. prefix when they are needed in the program.
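For comparison, here is the concatenation example written with the pd alias. It is a sketch that reuses the data from the example above; only the import line and the pd. prefixes differ.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "pizza": ['margherita', 'pepperoni', 'hawaiian', 'marinara', 'four cheese'],
        "price": [8.99, 9.99, 10.99, 7.99, 11.99]
    }
)

df2 = pd.DataFrame(
    {
        "pizza": ['parmigiana', 'tartufo', 'funghi'],
        "price": [11.99, 12.99, 9.99]
    }
)

# Stack the two DataFrames row-wise; every pandas name is reached through "pd."
result = pd.concat([df, df2], axis=0)

print(result)
```

Because concat keeps each DataFrame's original index, the row labels restart at 0 for the second block. Passing ignore_index=True to concat renumbers the rows if that is not wanted.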

Summary

Congratulations on reading to the end of this tutorial.

Go to the online courses page on Python to learn more about Python for data science and machine learning.

Have fun and happy researching!

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.