If you perform a mathematical operation that calls ufunc.reduce on a NumPy array containing numerical strings, NumPy raises the TypeError: cannot perform reduce with flexible type. To solve this error, you can cast the values in the array to float using astype(float). If you have a multidimensional array that mixes text and numbers, you can put the values inside a DataFrame and perform the operations on the DataFrame columns.

This tutorial will go through the error in detail and how to solve it with code examples.


TypeError: cannot perform reduce with flexible type

Let’s break up the error message to understand what the error means. TypeError occurs whenever you attempt to use an illegal operation for a specific data type. The part “cannot perform reduce” tells us the function we are using is invoking reduce. reduce() is a method of NumPy universal functions (ufuncs). A ufunc is a vectorized wrapper for a function that takes a fixed number of specific inputs and produces a fixed number of specific outputs. The reduce method applies the ufunc along one axis of an array, which reduces the array’s dimension by one. Some NumPy functions call reduce, for example, mean(), which sums the values with add.reduce before dividing by the count. The part “flexible type” refers to NumPy’s flexible data types, such as strings, whose item size is not fixed. Numerical strings like '2' represent numbers, but NumPy stores them as strings, so they have a flexible type. Only numerical values are suitable for the reduce function.
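To make these two ideas concrete, here is a minimal sketch of reduce lowering an array’s dimension and of NumPy classifying a string array as a flexible type. The variable names (grid, text, and so on) are our own and are not part of the tutorial’s examples.

```python
import numpy as np

# np.add is a ufunc; its reduce method applies addition along one axis,
# so the result has one dimension fewer than the input.
grid = np.array([[1, 2, 3], [4, 5, 6]])
column_sums = np.add.reduce(grid, axis=0)  # shape (2, 3) -> shape (3,)
print(column_sums)

# An array of numerical strings has a string dtype, which NumPy
# classifies as a flexible type rather than a numeric one.
text = np.array(['2', '4', '6'])
is_flexible = np.issubdtype(text.dtype, np.flexible)
is_numeric = np.issubdtype(text.dtype, np.number)
print(text.dtype, is_flexible, is_numeric)
```

Checking the dtype with np.issubdtype like this is one way to spot a string array before calling a function that reduces it.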

Example #1: Calculating mean using Numerical Strings

Let’s look at an example of a NumPy array containing numerical strings. We want to call the mean() function on the array to get the average value of the array. Let’s look at the code:

import numpy as np

data = np.array(['2', '4', '6', '8', '10', '12'])

mean = np.mean(data)

print(mean)

Let’s run the code to see what happens:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [30], in <cell line: 5>()
      1 import numpy as np
      3 data = np.array(['2', '4', '6', '8', '10', '12'])
----> 5 mean = np.mean(data)
      7 print(mean)

File <__array_function__ internals>:5, in mean(*args, **kwargs)

File ~/opt/anaconda3/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3440, in mean(a, axis, dtype, out, keepdims, where)
   3437     else:
   3438         return mean(axis=axis, dtype=dtype, out=out, **kwargs)
-> 3440 return _methods._mean(a, axis=axis, dtype=dtype,
   3441                       out=out, **kwargs)

File ~/opt/anaconda3/lib/python3.8/site-packages/numpy/core/_methods.py:179, in _mean(a, axis, dtype, out, keepdims, where)
    176         dtype = mu.dtype('f4')
    177         is_float16_result = True
--> 179 ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
    180 if isinstance(ret, mu.ndarray):
    181     ret = um.true_divide(
    182             ret, rcount, out=ret, casting='unsafe', subok=False)

TypeError: cannot perform reduce with flexible type

Our code throws the TypeError because we attempt to calculate the mean on an array of numerical strings. The traceback shows that mean() calls umr_sum, NumPy’s internal name for add.reduce, which is why the error message refers to reduce.
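As a rough illustration of what the traceback shows, the sketch below inspects the dtype of data and then reproduces the sum-and-divide steps by hand on a float copy. It assumes umr_sum is an alias for np.add.reduce, which is how it is defined in _methods.py in the NumPy versions we are aware of. The names numeric, total, and manual_mean are our own.

```python
import numpy as np

data = np.array(['2', '4', '6', '8', '10', '12'])

# The dtype kind 'U' means Unicode string, which add.reduce cannot sum.
print(data.dtype.kind)

# On numeric data, mean boils down to add.reduce followed by a division.
numeric = data.astype(float)
total = np.add.reduce(numeric)
manual_mean = total / numeric.size
print(manual_mean)
```

Printing the dtype before calling mean() is a quick way to confirm whether strings are the cause of the error.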

Solution

We can cast the array values to float using the astype() method to solve this error. Let’s look at the revised code:

data_float = data.astype(float)

print(data_float)

print(data_float.dtype)

Let’s run the code to see the new array:

[ 2.  4.  6.  8. 10. 12.]
float64

Now that we have an array of floats, we can calculate the mean. Let’s run the code to see the result:

mean = np.mean(data_float)

print(mean)
7.0

We correctly calculated the mean value of the array of floats.
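Casting with astype(float) only works when every string represents a number. The sketch below wraps the cast in a small helper that reports a clearer message otherwise. The function name mean_of_numeric_strings is hypothetical and is not part of NumPy.

```python
import numpy as np

def mean_of_numeric_strings(values):
    """Return the mean of values, casting string arrays to float first."""
    arr = np.asarray(values)
    if arr.dtype.kind == 'U':  # Unicode strings such as '2' or '10'
        try:
            arr = arr.astype(float)
        except ValueError as exc:
            # astype(float) fails on text like 'ten' that is not a number
            raise ValueError(f"non-numeric string in array: {exc}") from exc
    return arr.mean()

print(mean_of_numeric_strings(['2', '4', '6', '8', '10', '12']))
```

If the values are all whole numbers and you want integers, astype(int) works the same way.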

Example #2: Multidimensional Array

We can also encounter this error by creating a multidimensional array from a mix of strings and integers or floats. A NumPy array holds a single data type, so NumPy converts every value, including the numbers, to a string. Let’s look at an example of a two-dimensional array containing the scores of three Quidditch players.

import numpy as np

# create a 2D Array
scores = np.array([['Player', 'Position', 'Score'],
    ['Harry', 'seeker', 5],
    ['Ron', 'keeper', 8],
    ['Severus', 'beater', 3]])

score_vals = scores[1:,2]

print(score_vals)

Let’s run the code to see the third column, which contains the scores:

['5' '8' '3']

We can see that values in the column are strings. Let’s try to calculate the mean score:

mean = score_vals.mean()

print(mean)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [32], in <cell line: 1>()
----> 1 mean = score_vals.mean()
      3 print(mean)

File ~/opt/anaconda3/lib/python3.8/site-packages/numpy/core/_methods.py:179, in _mean(a, axis, dtype, out, keepdims, where)
    176         dtype = mu.dtype('f4')
    177         is_float16_result = True
--> 179 ret = umr_sum(arr, axis, dtype, out, keepdims, where=where)
    180 if isinstance(ret, mu.ndarray):
    181     ret = um.true_divide(
    182             ret, rcount, out=ret, casting='unsafe', subok=False)

TypeError: cannot perform reduce with flexible type

The error occurs because we are trying to calculate the mean on strings instead of floats or integers.
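To see where the strings come from, here is a small sketch that builds a mixed array and inspects what NumPy stored. The array name mixed is our own, and the exact string width NumPy picks may vary, so we only check the dtype kind.

```python
import numpy as np

mixed = np.array([['Harry', 'seeker', 5],
                  ['Ron', 'keeper', 8]])

# NumPy arrays hold one dtype, so the integers 5 and 8 become strings.
print(mixed.dtype.kind)  # prints U, the code for Unicode strings
print(type(mixed[0, 2]))
```

The score is no longer the integer 5 but the string '5', which is why mean() fails on that column.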

Solution

We can use a pandas DataFrame instead of a two-dimensional NumPy array, because each DataFrame column keeps its own data type. Let’s look at the revised code:

import pandas as pd

scores = pd.DataFrame({'Player':['Harry', 'Ron', 'Severus'],
'Position':['seeker', 'keeper', 'beater'],
'Score':[5, 8, 3]
})

print(scores)
print(scores.Score)

Let’s run the code to see the DataFrame and the dtype of the Score column:

    Player Position  Score
0    Harry   seeker      5
1      Ron   keeper      8
2  Severus   beater      3
0    5
1    8
2    3
Name: Score, dtype: int64

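The values in the Score column are integers. Let’s calculate the mean of the scores. We pass numeric_only=True so that pandas skips the Player and Position columns, which contain text; recent pandas versions raise a TypeError otherwise:

print(scores.mean(numeric_only=True))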
Score    5.333333
dtype: float64

We successfully calculated the mean of the Quidditch scores.
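If the data already lives in a two-dimensional NumPy array of strings, you may not want to retype it as a dictionary. The sketch below shows two possible routes under that assumption: casting only the score column in NumPy, or loading the array into a DataFrame and converting the column with pd.to_numeric. The names numpy_mean, df, and pandas_mean are our own.

```python
import numpy as np
import pandas as pd

scores = np.array([['Player', 'Position', 'Score'],
                   ['Harry', 'seeker', 5],
                   ['Ron', 'keeper', 8],
                   ['Severus', 'beater', 3]])

# Option 1: stay in NumPy and cast only the score column to int.
numpy_mean = scores[1:, 2].astype(int).mean()

# Option 2: use the first row as column names, then convert the Score
# column from strings to numbers with pd.to_numeric.
df = pd.DataFrame(scores[1:], columns=scores[0])
df['Score'] = pd.to_numeric(df['Score'])
pandas_mean = df['Score'].mean()

print(numpy_mean, pandas_mean)
```

Both routes should give the same mean as the DataFrame built from a dictionary; which one fits better depends on whether you need the other columns afterwards.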

Summary

Congratulations on reading to the end of this tutorial! The TypeError: cannot perform reduce with flexible type occurs when you try to perform mathematical operations that invoke ufunc.reduce on strings instead of integers or floats. You can solve this error by casting the numerical strings to float or integer. You can use a DataFrame instead of a multidimensional NumPy array if you have tabular data with column headers.

For further reading on TypeErrors involving NumPy, go to the article: How to Solve Python TypeError: ‘numpy.float64’ object cannot be interpreted as an integer.

To learn more about Python for data science and machine learning, go to the online courses page on Python for the most comprehensive courses available.

Have fun and happy researching!