Select Page

How to Solve Python ValueError: Columns overlap but no suffix specified

by | Programming, Python, Tips

If you try to join together two DataFrames that share one or more column names but do not provide a suffix for either the right or left DataFrame to differentiate the between the columns, you will raise the ValueError: Columns overlap but no suffix specified.

To solve this error, you can use the merge function. For example df1.merge(df2, how = 'left'). Or you can use the join() method and provide a suffix for the left and right DataFrames, for example,

df1.merge(df2, how='left', lsuffix='left', rsuffix='right')

This tutorial will go through the error in detail and how to solve it with code examples.

ValueError: Columns overlap but no suffix specified

In Python, a value is a piece of information stored within a particular object. We will encounter a ValueError in Python when using a built-in operation or function that receives an argument that is the right type but an inappropriate value. In this specific error, the data we use during a join operation is the correct type, DataFrame, but the DataFrames have one or more identical columns without a distinguishing suffix specified.


Let’s look at an example of two DataFrames containing information about particles that we want to join together. The first DataFrame contains the particle names and the masses of each particle. The second DataFrame contains the particle names, the charge, and the spin of each particle. Let’s look at the code:

import pandas as pd

df1 = pd.DataFrame({'particle_name':['electron', 'muon', 'tau', 'W-boson', 'proton', 'Higgs boson'],
'mass (MeV)':[0.51, 106.7, 1776.9, 80433.5, 938.3, 125100]})

df2 = pd.DataFrame({'particle_name':['electron', 'muon', 'tau', 'W-boson', 'proton', 'Higgs boson'],
'charge':[-1, -1, -1, -1, 1, 0],
'spin':[0.5, 0.5, 0.5, 1, 0.5, 0]})


Let’s run the first part of the program to see the DataFrames:

  particle_name  mass (MeV)
0      electron        0.51
1          muon      106.70
2           tau     1776.90
3       W-boson    80433.50
4        proton      938.30
5   Higgs boson   125100.0

  particle_name  charge  spin
0      electron      -1   0.5
1          muon      -1   0.5
2           tau      -1   0.5
3       W-boson      -1   1.0
4        proton       1   0.5
5   Higgs boson       0   0.0

Next, we will try to join the DataFrames using the join() method:

df3= df1.join(df2, how='left')

Let’s run the code to see what happens:

ValueError: columns overlap but no suffix specified: Index(['particle_name'], dtype='object')

The error occurs because df1 and df2 share the particle_name column, and there is no suffix to differentiate between the columns in the new DataFrame df3.

Solution #1: Use merge()

We can solve this error by using the merge() function. Let’s look at the revised code:

df3 = df1.merge(df2)

Let’s run the code to see the result:

  particle_name  mass (MeV)  charge  spin
0      electron        0.51      -1   0.5
1          muon      106.70      -1   0.5
2           tau     1776.90      -1   0.5
3       W-boson    80433.50      -1   1.0
4        proton      938.30       1   0.5
5   Higgs boson   125100.00       0   0.0

We successfully merged the DataFrames. The merge() method drops any value in the common column particle_name for the right DataFrame that already exists in the left DataFrame.

Solution #2: Use join() with lsuffix and rsuffix

We can use the join() method and provide a suffix name for the left and/or right DataFrames. The parameters to set are suffix is lsuffix for the left DataFrame and rsuffix for the right DataFrame. Let’s look at the revised code:

df3 = df1.join(df2, how='left', rsuffix='_2')

In the above code, we set the suffix for the right DataFrame to _2. Let’s run the code to get the result:

  particle_name  mass (MeV) particle_name_2  charge  spin
0      electron        0.51        electron      -1   0.5
1          muon      106.70            muon      -1   0.5
2           tau     1776.90             tau      -1   0.5
3       W-boson    80433.50         W-boson      -1   1.0
4        proton      938.30          proton       1   0.5
5   Higgs boson   125100.00     Higgs boson       0   0.0

We successfully joined the DataFrames using _2 as a suffix for particle_name in the right DataFrame.


Congratulations on reading to the end of the tutorial! The ValueError: Columns overlap but no suffix provided occurs when you try to join two DataFrames that share one or more columns with no distinguishing suffix provided. To solve this error you can use the merge() method or the join() method with the lsuffix and/or rsuffix parameter values specified.

For further reading on errors involving Pandas, go to the article:

How to Solve Python ValueError: Trailing data

For further reading on Pandas, go to the article: Introduction to Pandas: A Complete Tutorial for Beginners.

Have fun and happy researching!

Research Scientist at Moogsoft | + posts

Suf is a research scientist at Moogsoft, specializing in Natural Language Processing and Complex Networks. Previously he was a Postdoctoral Research Fellow in Data Science working on adaptations of cutting-edge physics analysis techniques to data-intensive problems in industry. In another life, he was an experimental particle physicist working on the ATLAS Experiment of the Large Hadron Collider. His passion is to share his experience as an academic moving into industry while continuing to pursue research. Find out more about the creator of the Research Scientist Pod here and sign up to the mailing list here!