Remove duplicates and keep last in R

Remove duplicates and keep last in R

Here is how to remove duplicates but keep the last row in the R data frame.

Here is my data frame that contains agents and managers. Let’s say the last one is that I would like to get.

df <- data.frame(
agent = as.character(c("David", "David", "David", "Kate", "Kate", "Alma", "Alma")),
manager = as.character(c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"))
)

It is easy to keep the last unique record by using dplyr. Distinct functions return the first record and that is the reason you should use a little workaround.

Group by at least one category that you are interested in. Function row_count will get the count of rows by each group. Filter by last row number, which equals the result of function n. N that gives the current group size.

df <- df %>%
group_by(agent) %>%
filter(row_number() == n())

How to get top or bottom values by each group in R




4 comments

  1. thank you very much, i was struggling with this problem and didn’t know about row_number() function. Greetings from Perú

    1. Is it possible to do the same but with the content of the column and not by the position? (intead of last row number, “Karl”). Thanks

      1. Maybe. If there is some principle that can define sequence and works in all situations like the alphabet or group index. In that case, you can use arrange() and distinct() from dplyr.

Leave a comment

Your email address will not be published. Required fields are marked *

Exit mobile version