This tutorial will go through three ways to replace negative values with zero in a Pandas DataFrame.
The simplest way to do it is:
    # Import pandas module
    import pandas as pd

    # Create pandas DataFrame
    df = pd.DataFrame({"X": [4, 5, -3, 4, -5, 6],
                       "Y": [3, -5, -6, 7, 3, -2],
                       "Z": [-4, 5, 6, -7, 5, 4]})

    # Replace all elements < 0 with 0
    df[df < 0] = 0
Example #1
Let’s look at an example of using boolean indexing to replace all negative values in a DataFrame with 0. First, we will import the pandas module and create the DataFrame.
    # Import pandas module
    import pandas as pd

    # Create pandas DataFrame
    df = pd.DataFrame({"X": [4, 5, -3, 4, -5, 6],
                       "Y": [3, -5, -6, 7, 3, -2],
                       "Z": [-4, 5, 6, -7, 5, 4]})

    # View the original DataFrame
    print(df)
       X  Y  Z
    0  4  3 -4
    1  5 -5  5
    2 -3 -6  6
    3  4  7 -7
    4 -5  3  5
    5  6 -2  4
Next, we will define a condition and apply it to all the values in the DataFrame:
    # Replace all elements < 0 with 0
    df[df < 0] = 0

    # View DataFrame
    print(df)
Let’s run the code to get the result:
       X  Y  Z
    0  4  3  0
    1  5  0  5
    2  0  0  6
    3  4  7  0
    4  0  3  5
    5  6  0  4
We can see that the negative values in the DataFrame were replaced by zeros.
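Boolean indexing with assignment changes the DataFrame in place. As a minimal sketch of an alternative, assuming you would rather keep the original DataFrame untouched, the pandas methods clip() and mask() return a new DataFrame instead. The variable names clipped and masked are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"X": [4, 5, -3, 4, -5, 6],
                   "Y": [3, -5, -6, 7, 3, -2],
                   "Z": [-4, 5, 6, -7, 5, 4]})

# clip(lower=0) raises every value below 0 up to 0 and returns a new DataFrame
clipped = df.clip(lower=0)

# mask() replaces values where the condition is True, also returning a new DataFrame
masked = df.mask(df < 0, 0)

print(clipped)
```

Both calls leave df itself unchanged, which can be convenient when the original values are still needed later.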
Example #2
Let’s look at an example where one of the columns is not numeric.
    # Import pandas module
    import pandas as pd

    # Create pandas DataFrame
    df = pd.DataFrame({"X": [4, 5, -3, 4, -5, 6],
                       "Y": [3, -5, -6, 7, 3, -2],
                       "Z": ["do", "re", "mi", "fa", "so", "la"]})

    print(df)
       X  Y   Z
    0  4  3  do
    1  5 -5  re
    2 -3 -6  mi
    3  4  7  fa
    4 -5  3  so
    5  6 -2  la
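In this case, we can use the _get_numeric_data() method to get the numeric columns from the DataFrame. Then, we can apply boolean indexing to those columns. Note that the leading underscore marks _get_numeric_data() as a private pandas method, so its behavior may change between versions.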
    # Get numeric columns from DataFrame
    num_cols = df._get_numeric_data()

    # Replace all elements < 0 with 0
    num_cols[num_cols < 0] = 0

    # View DataFrame
    print(df)
Let’s run the code to get the result:
       X  Y   Z
    0  4  3  do
    1  5  0  re
    2  0  0  mi
    3  4  7  fa
    4  0  3  so
    5  6  0  la
We can see that the negative values in the numeric columns were replaced by zeros, while the string column Z is unchanged.
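The example above relies on a private method and on changes to num_cols propagating back to df, which may not hold in every pandas version (for instance when Copy-on-Write is enabled). As a hedged sketch of an alternative, the public select_dtypes() method can pick out the numeric column names, and the result can be assigned back explicitly. The variable name num_names is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"X": [4, 5, -3, 4, -5, 6],
                   "Y": [3, -5, -6, 7, 3, -2],
                   "Z": ["do", "re", "mi", "fa", "so", "la"]})

# Names of the numeric columns, found through the public select_dtypes() method
num_names = df.select_dtypes(include="number").columns

# Clip the numeric columns at 0 and assign the result back to df explicitly
df[num_names] = df[num_names].clip(lower=0)

print(df)
```

Because the assignment targets df directly, this version does not depend on whether the selected columns are a view or a copy.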
Example #3
Let’s look at an example of a DataFrame with columns of dtype timedelta. We can convert numeric values to timedelta using the pd.to_timedelta() function. In this case, we will express the numeric values as a number of days.
    # Import pandas module
    import pandas as pd

    # Create pandas DataFrame
    df = pd.DataFrame({"X": pd.to_timedelta([4, 5, -3, 4, -5, 6], 'd'),
                       "Y": pd.to_timedelta([3, -5, -6, 7, 3, -2], 'd')})

    print(df)
            X       Y
    0  4 days  3 days
    1  5 days -5 days
    2 -3 days -6 days
    3  4 days  7 days
    4 -5 days  3 days
    5  6 days -2 days
We can use pd.Timedelta in the comparison as follows:
    # Replace all timedelta values < 0 days with 0
    df[df < pd.Timedelta(0)] = 0

    # View DataFrame
    print(df)
Let’s run the code to get the result:
                     X                Y
    0  4 days 00:00:00  3 days 00:00:00
    1  5 days 00:00:00                0
    2                0                0
    3  4 days 00:00:00  7 days 00:00:00
    4                0  3 days 00:00:00
    5  6 days 00:00:00                0
We can see that the values with negative days were replaced by zeros.
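The output above mixes timedelta values with the plain integer 0, which suggests the columns no longer hold a single timedelta dtype. As a minimal sketch of one possible alternative, assuming a zero-length timedelta is an acceptable replacement for the integer 0, we can replace with pd.Timedelta(0) instead. The names zero and fixed are illustrative, and mask() is used here in place of in-place assignment.

```python
import pandas as pd

df = pd.DataFrame({
    "X": pd.to_timedelta([4, 5, -3, 4, -5, 6], unit="D"),
    "Y": pd.to_timedelta([3, -5, -6, 7, 3, -2], unit="D"),
})

# A zero-length timedelta, used both for the comparison and as the replacement
zero = pd.Timedelta(0)

# mask() replaces values where the condition is True and returns a new DataFrame
fixed = df.mask(df < zero, zero)

print(fixed)
print(fixed.dtypes)
```

With this variant the replaced entries are themselves timedelta values, so later timedelta arithmetic on the columns should keep working.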
Summary
Congratulations on reading to the end of this tutorial!
For further reading on converting negative values to zeros, go to the following article:
Python How to Replace Negative Value with Zero in Numpy Array
To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.
Have fun and happy researching!
Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.