# How to Count the Number of NA in R

This tutorial will go through counting the number of missing values or NAs in a data frame in R.

## Example

Let’s look at an example using the built-in `airquality` dataset.

### Get Airquality Data

First, let’s look at the head of the `airquality` dataset.

```
head(airquality)
```

```
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
```

We can see that there are `NA` values in the data frame, but we need to determine how many there are.

### Solution #1: Use summary

The simplest way to see the number of `NA` values in each column of the data frame is to call the `summary` function on it:

```
summary(airquality)
```

```
     Ozone           Solar.R           Wind             Temp           Month            Day
 Min.   :  1.00   Min.   :  7.0   Min.   : 1.700   Min.   :56.00   Min.   :5.000   Min.   : 1.0
 1st Qu.: 18.00   1st Qu.:115.8   1st Qu.: 7.400   1st Qu.:72.00   1st Qu.:6.000   1st Qu.: 8.0
 Median : 31.50   Median :205.0   Median : 9.700   Median :79.00   Median :7.000   Median :16.0
 Mean   : 42.13   Mean   :185.9   Mean   : 9.958   Mean   :77.88   Mean   :6.993   Mean   :15.8
 3rd Qu.: 63.25   3rd Qu.:258.8   3rd Qu.:11.500   3rd Qu.:85.00   3rd Qu.:8.000   3rd Qu.:23.0
 Max.   :168.00   Max.   :334.0   Max.   :20.700   Max.   :97.00   Max.   :9.000   Max.   :31.0
 NA's   :37       NA's   :7
```

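The `summary` function returns descriptive statistics for each column in the data frame and, for any column that contains missing values, an `NA's` row with the count. We can see there are 37 `NA` values in `Ozone` and 7 `NA` values in `Solar.R`. The other columns have no `NA's` row, which means they contain no missing values.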
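If you need the count as a number rather than reading it off the printed table, one option is to call `summary` on a single column and pick out the `NA's` entry. The sketch below is only an illustration; the variable names `ozone_summary`, `ozone_na`, `wind_summary` and `has_na_entry` are made up for this example.

```r
# summary() on a single numeric column returns a named vector.
# The "NA's" entry is only present when the column has missing values.
ozone_summary <- summary(airquality$Ozone)
ozone_na <- ozone_summary[["NA's"]]
print(ozone_na)

# Wind has no missing values, so its summary has no "NA's" entry.
# Check for the name before extracting it to avoid an error.
wind_summary <- summary(airquality$Wind)
has_na_entry <- "NA's" %in% names(wind_summary)
print(has_na_entry)
```

Because the `NA's` entry is absent for complete columns, this approach is best suited to quick inspection; the `sum` and `is.na` approach works uniformly for every column.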

### Solution #2: Use sum and is.na

The second way to get the total number of `NA` values in the data frame is to combine `is.na` and `sum`. The `is.na` function returns `TRUE` or `FALSE` for each value in the data set, and `sum()` counts the `TRUE` values, because each `TRUE` is treated as 1 and each `FALSE` as 0. Let’s look at the code:

```
sum(is.na(airquality))
```

Let’s run the code to see the result:

```
[1] 44
```

There are 44 `NA` values in total in the data frame.
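To see why this works, it can help to look at the intermediate steps. The sketch below uses a small made-up vector `x` (not part of `airquality`) and then repeats the idea on the data frame; all variable names are chosen for illustration only.

```r
# A small hypothetical vector with two missing values
x <- c(4, NA, 7, NA, 1)

# is.na() flags each element: FALSE TRUE FALSE TRUE FALSE
na_flags <- is.na(x)
print(na_flags)

# sum() treats TRUE as 1 and FALSE as 0, so this counts the NAs
na_count <- sum(na_flags)
print(na_count)

# On a data frame, is.na() returns a logical matrix with the same
# number of rows and columns as the data frame
na_matrix <- is.na(airquality)
print(dim(na_matrix))

# Summing the whole matrix gives the total number of NAs
total_na <- sum(na_matrix)
print(total_na)
```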

### Solution #3: Use sum and is.na in a Function

If we want the number of `NA` values per column in a data frame, we can define a function that iterates over the columns and counts the `NA` values in each one using `sum()` and `is.na`. Let’s look at the code:

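```
f <- function(x) {
  # Accumulate one row per column; starts empty
  res <- NULL
  for (i in seq_len(ncol(x))) {
    # Count the missing values in column i
    temp <- sum(is.na(x[, i]))
    temp <- as.data.frame(temp)
    # Record the column name next to its count
    temp$var <- colnames(x)[i]
    res <- rbind(res, temp)
  }
  return(res)
}
```

Let’s call the function to see the result:

```
f(airquality)
```

```
  temp     var
1   37   Ozone
2    7 Solar.R
3    0    Wind
4    0    Temp
5    0   Month
6    0     Day
```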

The `Ozone` column contains 37 `NA` values and `Solar.R` contains 7; the remaining four columns have none.
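As an alternative to writing a loop, the same per-column counts can be obtained with base R helpers. This is a sketch of a different technique from the function above, not a replacement the tutorial prescribes; the names `na_per_column` and `na_per_column_sapply` are made up for this example.

```r
# is.na() gives a logical matrix; colSums() adds up the TRUE
# values in each column, returning a named vector of counts
na_per_column <- colSums(is.na(airquality))
print(na_per_column)

# The same counts, computed column by column with sapply()
na_per_column_sapply <- sapply(airquality, function(col) sum(is.na(col)))
print(na_per_column_sapply)
```

Both calls return a vector named by column, so an individual count can be read with, for example, `na_per_column[["Ozone"]]`.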

## Summary

Congratulations on reading to the end of this tutorial!

Go to the online courses page on R to learn more about coding in R for data science and machine learning.

For further reading on data analysis with R, go to the article: How to Download and Plot Stock Prices with quantmod in R

Have fun and happy researching!