Remove duplicates and keep last in R

Here is how to remove duplicates but keep the last row in the R data frame.

Here is my data frame that contains agents and managers. Let’s say the last one is that I would like to get.

keep last row in R

df <- data.frame(
agent = as.character(c("David", "David", "David", "Kate", "Kate", "Alma", "Alma")),
manager = as.character(c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"))
)

It is easy to keep the last unique record by using dplyr. Distinct functions return the first record and that is the reason you should use a little workaround.

Group by at least one category that you are interested in. Function row_count will get the count of rows by each group. Filter by last row number, which equals the result of function n. N that gives the current group size.

df <- df %>%
group_by(agent) %>%
filter(row_number() == n())

How to get top or bottom values by each group in R

Posted

December 3, 2020

Comments

4 responses to “Remove duplicates and keep last in R”

Santiago
October 25, 2021
thank you very much, i was struggling with this problem and didn’t know about row_number() function. Greetings from Perú
Reply
1. Janis Sturis
  October 25, 2021
  Thank you for your feedback!
  Looks like world is united by R.
  Reply
2. Santiago
  October 25, 2021
  Is it possible to do the same but with the content of the column and not by the position? (intead of last row number, “Karl”). Thanks
  Reply
  1. Janis Sturis
    October 25, 2021
    Maybe. If there is some principle that can define sequence and works in all situations like the alphabet or group index. In that case, you can use arrange() and distinct() from dplyr.
    Reply

Remove duplicates and keep last in R

Comments

4 responses to “Remove duplicates and keep last in R”

Leave a Reply Cancel reply