It is easy to return the top or bottom values by group with the functions slice_min and slice_max from dplyr in R. If you can't find those functions, you need to update dplyr: they were introduced in version 1.0.0.
Take a look at the documentation for slice, because there is a lot you can do with these and the other slice functions and their parameters.
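As a rough illustration of two of those other options, here is a minimal sketch using slice_head and the prop argument of slice_min. The scores tibble and its columns are hypothetical and exist only for this example.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
scores <- tibble(
  team  = c("a", "a", "a", "a", "b", "b"),
  score = c(10, 20, 30, 5, 7, 3)
)

# First row of each group, in the original row order
first_rows <- scores %>%
  group_by(team) %>%
  slice_head(n = 1) %>%
  ungroup()

# Lowest half of each group: prop is a fraction of the group size
lowest_half <- scores %>%
  group_by(team) %>%
  slice_min(score, prop = 0.5) %>%
  ungroup()
```

Team a has four rows, so prop = 0.5 keeps its two smallest scores (5 and 10), while team b has two rows and keeps only one (3).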
Bottom 3 (smallest) values by category with the function slice_min
library(dplyr)

gender_mass <- starwars %>%
  select(gender, mass)

# Three smallest masses per gender (ties are kept by default)
gender_mass1 <- gender_mass %>%
  group_by(gender) %>%
  slice_min(mass, n = 3) %>%
  arrange(gender)

gender_mass1
# A tibble: 8 x 2
# Groups:   gender [3]
#   gender     mass
# 1 feminine     45
# 2 feminine     49
# 3 feminine     50
# 4 feminine     50
# 5 masculine    15
# 6 masculine    17
# 7 masculine    20
# 8 NA           48
As you can see, the feminine group returns 4 records instead of 3 because of a tie at 50. You can use the parameter with_ties = FALSE to suppress that. At the same time, 3 rows will be returned for every group, even when only one of them has a non-missing value, as in the NA group in the output that follows.
library(dplyr)

gender_mass <- starwars %>%
  select(gender, mass)

# Exactly three rows per gender: ties are broken instead of kept
gender_mass2 <- gender_mass %>%
  group_by(gender) %>%
  slice_min(mass, n = 3, with_ties = FALSE) %>%
  arrange(gender)

gender_mass2
# A tibble: 9 x 2
# Groups:   gender [3]
#   gender     mass
# 1 feminine     45
# 2 feminine     49
# 3 feminine     50
# 4 masculine    15
# 5 masculine    17
# 6 masculine    20
# 7 NA           48
# 8 NA           NA
# 9 NA           NA
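If the NA masses in that output are not wanted, one possible approach (a sketch, not the only option) is to drop rows with a missing mass before slicing. The name gender_mass_known is introduced here only for illustration.

```r
library(dplyr)

gender_mass_known <- starwars %>%
  select(gender, mass) %>%
  filter(!is.na(mass)) %>%   # drop rows with unknown mass first
  group_by(gender) %>%
  slice_min(mass, n = 3, with_ties = FALSE) %>%
  ungroup()

gender_mass_known
```

With this filter a group can return fewer than 3 rows when it has fewer than 3 known masses.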
Similarly, you can get the top or largest values by category in R with the function slice_max.
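A minimal sketch of that, mirroring the slice_min example with the same starwars columns, could look like this. The name gender_mass_top is an assumption made for this example.

```r
library(dplyr)

# Three largest masses per gender, without keeping ties
gender_mass_top <- starwars %>%
  select(gender, mass) %>%
  group_by(gender) %>%
  slice_max(mass, n = 3, with_ties = FALSE) %>%
  arrange(gender)

gender_mass_top
```

Within each group the rows come back ordered from the largest mass downward.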
Top and bottom values in R in the same data frame
If you need the top and bottom n values in the same data frame, you can combine them with a dplyr function like bind_rows.
library(dplyr)

gender_mass <- starwars %>%
  select(gender, mass)

# Stack the bottom 3 and top 3 per gender, labelling each part
gender_mass3 <- bind_rows(
  gender_mass %>%
    group_by(gender) %>%
    slice_min(mass, n = 3, with_ties = FALSE) %>%
    mutate(portion = "bottom"),
  gender_mass %>%
    group_by(gender) %>%
    slice_max(mass, n = 3, with_ties = FALSE) %>%
    mutate(portion = "top")
)

gender_mass3
# A tibble: 18 x 3
# Groups:   gender [3]
#    gender      mass portion
#  1 feminine    45   bottom
#  2 feminine    49   bottom
#  3 feminine    50   bottom
#  4 masculine   15   bottom
#  5 masculine   17   bottom
#  6 masculine   20   bottom
#  7 NA          48   bottom
#  8 NA          NA   bottom
#  9 NA          NA   bottom
# 10 feminine    75   top
# 11 feminine    57   top
# 12 feminine    56.2 top
# 13 masculine 1358   top
# 14 masculine  159   top
# 15 masculine  140   top
# 16 NA          48   top
# 17 NA          NA   top
# 18 NA          NA   top
Here is a similar post on how to get the minimum or maximum value for each group in R.
Code with a lot of piping can be hard to read. Take a look at my favorite RStudio tips and tricks, where you can find out how to easily reflow your code.