How to get top or bottom values by each group in R

It is easy to return the top or bottom values within each group using the functions slice_min and slice_max from dplyr in R. If you can't find those functions, you need to update dplyr: they were introduced in version 1.0.0.

It is also worth looking at the documentation for slice, because the slice family (slice_head, slice_tail, slice_sample and others) offers many more options and parameters.

3 bottom or smallest values by category with function slice_min

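library(dplyr)

# Keep only the grouping column and the value to rank by
gender_mass <- starwars %>% select(gender, mass)

# The 3 smallest mass values within each gender
gender_mass1 <- gender_mass %>%
  group_by(gender) %>%
  slice_min(mass, n = 3) %>%
  arrange(gender)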

gender_mass1
# A tibble: 8 x 2
# Groups:   gender [3]
#   gender     mass
#   <chr>     <dbl>
# 1 feminine     45
# 2 feminine     49
# 3 feminine     50
# 4 feminine     50
# 5 masculine    15
# 6 masculine    17
# 7 masculine    20
# 8 NA           48

As you can see, more than 3 records are returned for a category when there are ties. You can set the parameter with_ties = FALSE to suppress that. Note that in this case 3 rows are returned for a group even if it has only one non-missing value; the remaining rows contain NA.

library(dplyr)

gender_mass <- starwars %>% select(gender, mass)

# Exactly 3 rows per gender: ties are broken instead of kept
gender_mass2 <- gender_mass %>%
  group_by(gender) %>%
  slice_min(mass, n = 3, with_ties = FALSE) %>%
  arrange(gender)

gender_mass2
# A tibble: 9 x 2
# Groups:   gender [3]
#   gender     mass
#   <chr>     <dbl>
# 1 feminine     45
# 2 feminine     49
# 3 feminine     50
# 4 masculine    15
# 5 masculine    17
# 6 masculine    20
# 7 NA           48
# 8 NA           NA
# 9 NA           NA
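
If the rows padded with NA are not wanted, one option is to drop missing values before slicing. The following is a minimal sketch of that idea on a small made-up data frame; the toy table, its column names and its values are assumptions for illustration and are not taken from starwars.

```r
library(dplyr)

# Made-up data for illustration: group "b" has only one non-missing value
toy <- tibble(
  grp = c("a", "a", "a", "a", "b", "b", "b"),
  val = c(5, 3, 9, 1, 48, NA, NA)
)

toy_bottom <- toy %>%
  filter(!is.na(val)) %>%                       # drop missing values first
  group_by(grp) %>%
  slice_min(val, n = 3, with_ties = FALSE) %>%  # up to 3 smallest per group
  ungroup()

toy_bottom
```

With this approach group "b" contributes a single row (48) instead of three, while group "a" still returns its three smallest values.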

Similarly, you can get the top (largest) values by category in R with the function slice_max.
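
A minimal sketch of the slice_max counterpart, assuming the same starwars columns as in the examples above. The name gender_mass_top is made up for this illustration, and the printed result is left out because it can differ between dplyr versions of the starwars data.

```r
library(dplyr)

gender_mass <- starwars %>% select(gender, mass)

# The 3 largest mass values within each gender, without keeping ties
gender_mass_top <- gender_mass %>%
  group_by(gender) %>%
  slice_max(mass, n = 3, with_ties = FALSE) %>%
  arrange(gender)

gender_mass_top
```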

Top and bottom values in R in the same data frame

If you need the top and bottom n values in the same data frame, you can combine them with a dplyr function such as bind_rows.

library(dplyr)

gender_mass <- starwars %>% select(gender, mass)

gender_mass3 <- bind_rows(
  gender_mass %>%
    group_by(gender) %>%
    slice_min(mass, n = 3, with_ties = FALSE) %>%
    mutate(portion = "bottom"),
  gender_mass %>%
    group_by(gender) %>%
    slice_max(mass, n = 3, with_ties = FALSE) %>%
    mutate(portion = "top")
)

gender_mass3

# A tibble: 18 x 3
# Groups:   gender [3]
#    gender      mass portion
#    <chr>      <dbl> <chr>
#  1 feminine    45   bottom
#  2 feminine    49   bottom
#  3 feminine    50   bottom
#  4 masculine   15   bottom
#  5 masculine   17   bottom
#  6 masculine   20   bottom
#  7 NA          48   bottom
#  8 NA          NA   bottom
#  9 NA          NA   bottom
# 10 feminine    75   top
# 11 feminine    57   top
# 12 feminine    56.2 top
# 13 masculine 1358   top
# 14 masculine  159   top
# 15 masculine  140   top
# 16 NA          48   top
# 17 NA          NA   top
# 18 NA          NA   top
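
If this pattern is needed often, it could be wrapped in a small helper. The sketch below is one possible way to do that and not a dplyr function: the name top_bottom_n, its arguments and the scores data are all hypothetical. It also assumes that missing values should be dropped first, which differs from the code above.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): returns the n smallest and the
# n largest values of `value` within each level of `group`, labelled in a
# `portion` column.
top_bottom_n <- function(data, group, value, n = 3) {
  grouped <- data %>%
    filter(!is.na({{ value }})) %>%  # assumption: missing values are dropped
    group_by({{ group }})

  bind_rows(
    grouped %>%
      slice_min({{ value }}, n = n, with_ties = FALSE) %>%
      mutate(portion = "bottom"),
    grouped %>%
      slice_max({{ value }}, n = n, with_ties = FALSE) %>%
      mutate(portion = "top")
  ) %>%
    ungroup()
}

# Made-up data for illustration
scores <- tibble(
  team  = c("a", "a", "a", "a", "b", "b"),
  score = c(1, 2, 3, 4, 10, 20)
)

result <- top_bottom_n(scores, team, score, n = 1)
result
```

For the made-up scores data, the result has four rows: the smallest score per team (1 and 10) labelled "bottom", followed by the largest per team (4 and 20) labelled "top".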

 

Here is a similar post on how to get minimum or maximum value by each group in R.

Code with a lot of piping can be hard to read. Take a look at my favorite RStudio tips and tricks, where you can find out how to easily reflow your code.
