How to Solve Python ValueError: cannot set a row with mismatched columns


This error occurs when you try to add a new row to a DataFrame but the number of values does not match the number of columns in the existing DataFrame.

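You can solve this error by ensuring the number of values in the new row matches the number of columns in the DataFrame, or by adding the row in a way that fills the missing columns with NaN (the append() method in older pandas versions, pd.concat() in pandas 2.0 and later).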

This tutorial will go through the error in detail and how to solve it with code examples.


Example

Let’s look at an example to reproduce the error. First, we will create a DataFrame containing the grades of nine students for three subjects.

import pandas as pd

# Create DataFrame

df = pd.DataFrame({'student': ['john', 'calogero', 'amina', 'clemence', 'george', 'phil', 'albert', 'lizzy', 'paul'],
                   'biology': [74, 55, 80, 60, 40, 77, 51, 90, 34],
                   'chemistry': [59, 71, 72, 90, 66, 89, 59, 34, 84],
                   'physics': [100, 58, 70, 64, 58, 75, 91, 72, 49]})

# View the DataFrame

print(df)

Let’s run the code to see the DataFrame:

    student  biology  chemistry  physics
0      john       74         59      100
1  calogero       55         71       58
2     amina       80         72       70
3  clemence       60         90       64
4    george       40         66       58
5      phil       77         89       75
6    albert       51         59       91
7     lizzy       90         34       72
8      paul       34         84       49

Next, we will attempt to append a new row to the end of the DataFrame.

# Define new row

new_student = ['carmine', 85]

# Append row to DataFrame

df.loc[len(df)] = new_student

# Print updated DataFrame to console

print(df)

Let’s run the code to see what happens:

ValueError: cannot set a row with mismatched columns

The error occurs because the new row only contains two values whereas the DataFrame has four columns. We can verify the number of values in the list and the number of columns in a DataFrame using the len() function. For example,

print(len(new_student))
print(len(df.columns))

2
4
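The same length comparison can be turned into a guard that runs before the assignment. The following is a minimal sketch, assuming a hypothetical helper named check_row (it is not part of pandas) and a shortened version of the grades DataFrame:

```python
import pandas as pd

# Shortened version of the grades DataFrame used in this tutorial
df = pd.DataFrame({'student': ['john', 'amina'],
                   'biology': [74, 80],
                   'chemistry': [59, 72],
                   'physics': [100, 70]})

new_student = ['carmine', 85]


def check_row(frame, row):
    """Hypothetical helper: True if row has one value per column of frame."""
    return len(row) == len(frame.columns)


if check_row(df, new_student):
    # Safe to assign: the row has one value for every column
    df.loc[len(df)] = new_student
    status = 'appended'
else:
    # Report the mismatch instead of letting pandas raise the ValueError
    status = f'skipped: got {len(new_student)} values, expected {len(df.columns)}'

print(status)
```

Checking first keeps the DataFrame untouched and gives a message that states both counts, which can be easier to act on than the ValueError alone.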

Solution #1

The easiest way to solve the error is to ensure that the number of values in the new row matches the number of columns in the DataFrame. The new row is missing grades for chemistry and physics. Let’s look at the revised code:

new_student = ['carmine', 85, 58, 93]

df.loc[len(df)] = new_student

print(df)

Let’s run the code to see the result:

    student  biology  chemistry  physics
0      john       74         59      100
1  calogero       55         71       58
2     amina       80         72       70
3  clemence       60         90       64
4    george       40         66       58
5      phil       77         89       75
6    albert       51         59       91
7     lizzy       90         34       72
8      paul       34         84       49
9   carmine       85         58       93

We successfully appended the new row to the DataFrame.
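One caveat with df.loc[len(df)] is that it only appends when len(df) is not already a label in the index. The sketch below, which uses a smaller two-column DataFrame and the illustrative names overwritten and appended, shows what may happen after a row has been dropped, and one possible way around it that assumes an integer index:

```python
import pandas as pd

df = pd.DataFrame({'student': ['john', 'amina', 'paul'],
                   'biology': [74, 80, 34]})

# Dropping the first row leaves the index as [1, 2], so len(df) == 2
# is now the label of an existing row
df = df.drop(index=0)

# Assigning to an existing label replaces that row instead of appending
overwritten = df.copy()
overwritten.loc[len(overwritten)] = ['carmine', 85]

# One option: use a label one past the current maximum (assumes an integer index)
appended = df.copy()
appended.loc[appended.index.max() + 1] = ['carmine', 85]

print(overwritten)
print(appended)
```

Calling df.reset_index(drop=True) before the assignment is another way to restore the default index so that len(df) is a fresh label again.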

Solution #2

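We can also solve the error by concatenating the new row onto the DataFrame, which automatically fills in the missing values with NaN. Older tutorials use DataFrame.append() for this, but that method was deprecated in pandas 1.4.0 and removed in pandas 2.0.0, so we will use pd.concat() instead.

Let’s look at the revised code:

# Define new row to append

new_student = ['carmine', 85]

# Pair each value with the leading column names to build a one-row DataFrame

new_row = pd.DataFrame([dict(zip(df.columns, new_student))])

# Concatenate the row onto the end of the DataFrame

df = pd.concat([df, new_row], ignore_index=True)

# Print updated DataFrame to console

print(df)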

Let’s run the code to get the updated DataFrame:

    student  biology  chemistry  physics
0      john       74       59.0    100.0
1  calogero       55       71.0     58.0
2     amina       80       72.0     70.0
3  clemence       60       90.0     64.0
4    george       40       66.0     58.0
5      phil       77       89.0     75.0
6    albert       51       59.0     91.0
7     lizzy       90       34.0     72.0
8      paul       34       84.0     49.0
9   carmine       85        NaN      NaN
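The NaN filling can also be done by hand before the assignment, which keeps the df.loc approach from Solution #1. This is a minimal sketch on a shortened DataFrame; the names missing and padded are illustrative only, and it assumes the values supplied belong to the leading columns:

```python
import numpy as np
import pandas as pd

# Shortened version of the grades DataFrame used in this tutorial
df = pd.DataFrame({'student': ['john', 'amina'],
                   'biology': [74, 80],
                   'chemistry': [59, 72],
                   'physics': [100, 70]})

new_student = ['carmine', 85]

# Number of trailing columns that have no value in the new row
missing = len(df.columns) - len(new_student)

# Pad the row with NaN so its length matches the number of columns
padded = new_student + [np.nan] * missing

# The lengths now match, so the assignment no longer raises the ValueError
df.loc[len(df)] = padded

print(df)
```

Padding by position only makes sense when the missing values are the trailing columns; if values for columns in the middle are absent, a dictionary keyed by column name is a safer starting point.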

Summary

Congratulations on reading to the end of this tutorial!


To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.

Have fun and happy researching!


Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research.