How to Find the Column Name with the Largest Value for Each Row using R


You can find the column name with the largest value for each row of a data frame using the colnames() function together with the apply() function.

For example,

df$largest_col <- colnames(df)[apply(df, 1, which.max)]

This tutorial will go through how to perform this task with code examples.


Example

Let’s look at an example. First, we will define a data frame with three columns and ten rows with random integers between 10 and 1000.

x <- sample(10:1000, size = 10)
y <- sample(10:1000, size = 10)
z <- sample(10:1000, size = 10)

df <- data.frame(x, y, z)

df

Let’s run the code to see the data frame. Because sample() draws at random, your values will differ:

     x   y   z
1  646 787 662
2  263 690 515
3  984 187 153
4   27 106 814
5  672 225 658
6  289 439 458
7  543 611 526
8  899 272 159
9  701 370 882
10 274 885 564
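If you want sample() to return the same values on every run, one option is to seed the random number generator with set.seed() before drawing. The sketch below assumes an arbitrary seed of 42; it is not the seed used for the output in this tutorial, so the values it produces will not match the table above.

```r
# Seed the random number generator so the draws are repeatable.
# The seed 42 is an arbitrary choice and will not reproduce the table above.
set.seed(42)

x <- sample(10:1000, size = 10)
y <- sample(10:1000, size = 10)
z <- sample(10:1000, size = 10)

df <- data.frame(x, y, z)

# Re-seeding and drawing again gives back the same values for x
set.seed(42)
x_again <- sample(10:1000, size = 10)
identical(x, x_again)  # TRUE
```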

We can find the column name with the largest value for each row in the data frame using the colnames() function combined with the apply() function. The colnames() function gets or sets the column names of a matrix-like object. The apply() function applies a function over the rows or columns of an array, matrix, or data frame. The syntax for the apply() function is

# Syntax
apply(X,       # Array, matrix, or data frame
      MARGIN,  # 1: rows, 2: columns, c(1, 2): rows and columns
      FUN,     # Function to apply
      ...)     # Additional arguments passed to FUN

We can apply a function to every row of a data frame by setting 1 for the MARGIN argument.
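To see what the MARGIN argument does, here is a minimal sketch on a small matrix. The matrix m and its values are made up for illustration and are not part of the example data.

```r
# A small matrix with known values (made up for illustration)
m <- matrix(c(1, 5, 3,
              9, 2, 4),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

row_max <- apply(m, 1, max)  # MARGIN = 1: one result per row
col_max <- apply(m, 2, max)  # MARGIN = 2: one result per column

row_max  # 5 9
col_max  # a = 9, b = 5, c = 4
```

With MARGIN = 1 the result has one entry per row, which is the shape we need when adding a new column to a data frame.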

We want to apply the which.max() function to every row to get the index of the column that holds the largest value. If a row contains ties, which.max() returns the index of the first maximum.

We then use that column index to look up the column name in colnames(df).
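The two steps can be run separately to see what each one returns. The sketch below uses a hypothetical data frame called small, holding the first three rows of the example output typed in by hand.

```r
# First three rows of the example data, typed in by hand
small <- data.frame(x = c(646, 263, 984),
                    y = c(787, 690, 187),
                    z = c(662, 515, 153))

# Step 1: position of the largest value in each row
idx <- apply(small, 1, which.max)

# Step 2: translate those positions into column names
largest <- colnames(small)[idx]

unname(idx)  # 2 2 1
largest      # "y" "y" "x"
```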

df$largest_column <- colnames(df)[apply(df, 1, which.max)]

df

Let’s run the code to get the result:

     x   y   z largest_column
1  646 787 662              y
2  263 690 515              y
3  984 187 153              x
4   27 106 814              z
5  672 225 658              x
6  289 439 458              z
7  543 611 526              y
8  899 272 159              x
9  701 370 882              z
10 274 885 564              y

We successfully added a “largest_column” column to the data frame, containing the name of the column with the largest value in each row.
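One thing to keep in mind is that apply() converts a data frame to a matrix first, so a single character column (including the new largest_column, if you run the line a second time) turns every value into a string. A possible way around this, sketched below, is to restrict the calculation to the numeric columns. The data frame df2, its id column, and the helper names num_cols and num are hypothetical and only serve the illustration.

```r
# Hypothetical data frame with a non-numeric id column
df2 <- data.frame(id = c("a", "b", "c"),
                  x = c(646, 263, 984),
                  y = c(787, 690, 187),
                  z = c(662, 515, 153))

# Names of the numeric columns only
num_cols <- names(df2)[sapply(df2, is.numeric)]

# Subset to those columns before calling apply()
num <- df2[num_cols]

df2$largest_column <- num_cols[apply(num, 1, which.max)]
df2$largest_column  # "y" "y" "x"
```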

Summary

Congratulations on reading to the end of this tutorial!

Go to the online courses page on R to learn more about coding in R for data science and machine learning.

Have fun and happy researching!

Suf is a senior advisor in data science with deep expertise in Natural Language Processing, Complex Networks, and Anomaly Detection. Formerly a postdoctoral research fellow, he applied advanced physics techniques to tackle real-world, data-heavy industry challenges. Before that, he was a particle physicist at the ATLAS Experiment of the Large Hadron Collider. Now, he’s focused on bringing more fun and curiosity to the world of science and research online.