How to filter by data frame row number in R

The RStudio data viewer is a great tool for inspecting data, but sometimes you need to filter a data frame by row number in R. For example, when you import a file, you might get a parsing warning that points to specific row numbers, and you will want to look at those rows more closely.

Imagine that you are importing a text file with the read_delim function from the readr package. Sometimes the import produces warnings like the one below. What exactly happened in the rows that caused the parsing failures?

Warning: 2 parsing failures.
   row col expected actual               file
 14756  X7 a double   NULL 'C:/source/my.txt'
107524  X7 a double   NULL 'C:/source/my.txt'

Here is a data frame that I will use in the examples below.

head(mtcars)

#                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
#Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Filter by data frame row number in R

base

It is quite simple to filter by data frame row number in R if you know how square brackets work. The first index selects rows and the second selects columns. The order is easy to remember if you are an Excel user and know the R1C1 cell reference style: rows first, then columns.

Here is how to get the third row from the data frame.

mtcars[3,]

#            mpg cyl disp hp drat   wt  qsec vs am gear carb
#Datsun 710 22.8   4  108 93 3.85 2.32 18.61  1  1    4    1
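To make the rows-first, columns-second order concrete, here is a small sketch that fills in both positions of the brackets. The variable names are only for illustration, and the drop argument is a standard base R option that the examples above do not use.

```r
# Row 3, two named columns: the result is still a data frame.
two_cols <- mtcars[3, c("mpg", "cyl")]

# Row 3, one column: the result collapses to a single numeric value.
one_value <- mtcars[3, "mpg"]

# drop = FALSE keeps the data frame structure even for a single column.
kept_df <- mtcars[3, "mpg", drop = FALSE]
```

Leaving the second position empty, as in mtcars[3,], simply means "all columns".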

Here is how to filter multiple separate rows from the data frame in R.

mtcars[c(3,5),]

#                   mpg cyl disp  hp drat   wt  qsec vs am gear carb
#Datsun 710        22.8   4  108  93 3.85 2.32 18.61  1  1    4    1
#Hornet Sportabout 18.7   8  360 175 3.15 3.44 17.02  0  0    3    2
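The same indexing can be applied to the parsing warning from the beginning of the post. The sketch below is an illustration only: imported is a small made-up data frame standing in for the result of read_delim, and failed_rows holds hypothetical row numbers of the kind listed in the row column of the warning.

```r
# Made-up stand-in for imported data; NA marks values that failed to parse.
imported <- data.frame(
  id = 1:8,
  X7 = c(1.2, 3.4, NA, 5.6, 7.8, NA, 9.1, 2.3)
)

# Hypothetical row numbers, as they would be copied from the warning.
failed_rows <- c(3, 6)

# The rows named in the warning.
suspect <- imported[failed_rows, ]

# One row before and after the first failure, for context.
context <- imported[(failed_rows[1] - 1):(failed_rows[1] + 1), ]

# Everything except the failing rows (negative indices drop rows).
clean <- imported[-failed_rows, ]
```

Looking at the neighbouring rows is often useful, because a parsing problem in one line can be caused by something in the line just before it.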

 

dplyr

Here is the same task as in base R, this time with the dplyr functions filter and row_number.

library(dplyr)

mtcars %>% filter(row_number() == 3)

#            mpg cyl disp hp drat   wt  qsec vs am gear carb
#Datsun 710 22.8   4  108 93 3.85 2.32 18.61  1  1    4    1

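The base function row also produces row indices, but it returns a matrix with one column of indices for each data frame column, so row_number is the simpler choice inside filter.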

It is even easier to subset rows by index with the dplyr function slice.

mtcars %>% slice(3)

#            mpg cyl disp hp drat   wt  qsec vs am gear carb
#Datsun 710 22.8   4  108 93 3.85 2.32 18.61  1  1    4    1

If you want to select several separate rows with dplyr, you can use the %in% operator together with row_number, or do it more concisely with the slice function.

mtcars %>% filter(row_number() %in% c(3, 5))

#                   mpg cyl disp  hp drat   wt  qsec vs am gear carb
#Datsun 710        22.8   4  108  93 3.85 2.32 18.61  1  1    4    1
#Hornet Sportabout 18.7   8  360 175 3.15 3.44 17.02  0  0    3    2

mtcars %>% slice(3, 5)

#                   mpg cyl disp  hp drat   wt  qsec vs am gear carb
#Datsun 710        22.8   4  108  93 3.85 2.32 18.61  1  1    4    1
#Hornet Sportabout 18.7   8  360 175 3.15 3.44 17.02  0  0    3    2
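A few more slice variations may be handy when working from row numbers. This is a minimal sketch that assumes dplyr is installed; the variable names are made up for the example.

```r
library(dplyr)

# A consecutive range of rows.
rows_3_to_5 <- mtcars %>% slice(3:5)

# Negative indices drop rows instead of keeping them.
without_3_and_5 <- mtcars %>% slice(-c(3, 5))

# n() is the total number of rows, so this keeps the last row.
last_row <- mtcars %>% slice(n())
```

Keep in mind that on a grouped data frame both slice and row_number count rows within each group, not across the whole table.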

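To implement a “not in” condition, negate %in% with the ! operator: mtcars %>% filter(!(row_number() %in% c(3, 5))) keeps every row except the third and the fifth.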

Pull values from the row in R

If you want only the values from a data frame row as a vector, here is one way to do that. Note that as.character returns the values as text.

as.character(as.vector(mtcars[3,]))

# [1] "22.8"  "4"     "108"   "93"    "3.85"  "2.32"  "18.61" "1"     "1"     "4"     "1"
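If the values should stay numeric instead of being converted to text, one alternative is the base function unlist. This is a sketch, not a replacement for the approach above, and the variable names are only for illustration.

```r
# unlist() flattens the one-row data frame into a named numeric vector.
named_values <- unlist(mtcars[3, ])

# use.names = FALSE drops the column names and leaves only the values.
plain_values <- unlist(mtcars[3, ], use.names = FALSE)
```

This works cleanly here because every column in mtcars is numeric. If a row mixes types, for example numbers and text, unlist coerces everything to a common type.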



