How to Solve Python AttributeError: ‘numpy.ndarray’ object has no attribute ‘columns’

by | Data Science, Python, Tips

Introduction

When working with Python and manipulating data, you will often use libraries like NumPy and Pandas together. However, mixing up their data structures, or using one where the other is expected, can result in errors. One common error is the following:

AttributeError: 'numpy.ndarray' object has no attribute 'columns'

This error arises when you are trying to access the columns attribute of a NumPy array, which does not possess this attribute, unlike a Pandas DataFrame. In this blog post, we will break down the reasons for this error and guide you through how to fix it with an easy-to-follow example.

Understanding the Error

The error message tells us that the code tried to access the columns attribute of a NumPy ndarray. NumPy arrays do not have column labels: they are built for numerical computation and address their elements by integer position. The columns attribute belongs to Pandas DataFrames, which are structured to hold labeled data with row and column labels.

Here’s an example of the error in action:

import numpy as np

# Create a numpy array
array = np.array([[1, 2, 3], [4, 5, 6]])

# Attempt to access 'columns' attribute like a DataFrame
print(array.columns)

Output:

AttributeError: 'numpy.ndarray' object has no attribute 'columns'

In this code snippet, we try to access the columns attribute of a NumPy array, which triggers the AttributeError.
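If you only need the values of a column and not its label, a NumPy array can be sliced by position instead. The short sketch below reuses the same array; the variable name first_column is only for illustration.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

# ndarrays describe their layout with shape and dtype rather than labels
print(array.shape)                 # (2, 3)
print(hasattr(array, "columns"))   # False

# Columns are selected by integer position: all rows, column 0
first_column = array[:, 0]
print(first_column)                # [1 4]
```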

Why the Error Occurs

The primary reason for this error is treating a NumPy array as if it were a Pandas DataFrame. While both are useful for handling data, they serve different purposes:

  • NumPy arrays are mainly used for numerical operations.
  • Pandas DataFrames are used for labeled data, where rows and columns have names (metadata).

In this case, the solution is to convert the NumPy array into a Pandas DataFrame before accessing the columns attribute.
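In practice, the mix-up often happens without an explicit np.array() call: a DataFrame is turned into an array somewhere along the way, and later code still expects a DataFrame. The sketch below shows one such path, DataFrame.to_numpy(), along with a type check that avoids the error. The sample data is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 4], "B": [2, 5], "C": [3, 6]})

# to_numpy() returns the underlying values and drops the labels
values = df.to_numpy()

print(type(df).__name__)      # DataFrame
print(type(values).__name__)  # ndarray

# Checking the type before touching .columns avoids the AttributeError
if isinstance(values, pd.DataFrame):
    print(values.columns)
else:
    print("not a DataFrame, no column labels available")
```

Some library functions also return plain arrays when given a DataFrame, so it can be worth checking the type of a result before assuming it still carries column labels.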

How to Fix the Error

To fix the error, convert the NumPy array into a Pandas DataFrame using the pd.DataFrame() constructor. Once the data is in DataFrame form, you can access the columns attribute without encountering the AttributeError.

Here’s the corrected version of the above code:

import numpy as np
import pandas as pd

# Create a numpy array
array = np.array([[1, 2, 3], [4, 5, 6]])

# Convert NumPy array to a Pandas DataFrame
df = pd.DataFrame(array, columns=['A', 'B', 'C'])

# Now access the 'columns' attribute
print(df.columns)

Output:

Index(['A', 'B', 'C'], dtype='object')

In this updated code, we convert the NumPy array into a DataFrame and assign column labels ('A', 'B', 'C') to it. The columns attribute is now accessible without error.
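When the array originally came from a DataFrame, you may prefer to reuse the original labels rather than retype them. The following is a minimal sketch of that idea; ensure_dataframe is a hypothetical helper written for this post, not part of NumPy or Pandas.

```python
import numpy as np
import pandas as pd

def ensure_dataframe(data, columns=None):
    """Return data as a DataFrame (hypothetical helper for illustration)."""
    if isinstance(data, pd.DataFrame):
        return data
    return pd.DataFrame(data, columns=columns)

original = pd.DataFrame({"A": [1, 4], "B": [2, 5], "C": [3, 6]})
array = original.to_numpy() * 10   # the result is an ndarray, labels are lost

# Reuse the labels from the original DataFrame
restored = ensure_dataframe(array, columns=original.columns)
print(list(restored.columns))      # ['A', 'B', 'C']

# Without explicit labels, Pandas falls back to integer column labels
unlabeled = ensure_dataframe(array)
print(list(unlabeled.columns))     # [0, 1, 2]
```

Note that if you omit the columns argument, the conversion still succeeds, but the labels will be the integers 0, 1, 2 rather than meaningful names.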

Summary

The error AttributeError: 'numpy.ndarray' object has no attribute 'columns' occurs because NumPy arrays do not have a columns attribute. To resolve this error, convert the NumPy array into a Pandas DataFrame and then access the columns attribute.

Go to the online courses page on Python to learn more about coding in Python for data science and machine learning.

Have fun and happy researching!