Count unique values in a column using R

Here is an easy way to count the unique values in one or more columns using R.

Here is a data frame.

df <- data.frame(
  agent = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

df

# agent manager team
# David Lisa Sales
# David Monica Sales
# David Karl Sales
# Kate Kianna Billing
# Kate Luna Sales
# Alma Kaylin Sales
# Alma Georgia Sales

To count the unique values in each column, I will use n_distinct from dplyr.

Unique values in one column.

library(dplyr)

n_distinct(df$manager)

#[1] 7
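For comparison, the same count can be obtained without dplyr. The sketch below assumes the example data frame from above and uses only base R; the variable name manager_count is introduced here purely for illustration.

```r
# Rebuild the example data frame so the snippet runs on its own
df <- data.frame(
  agent = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# unique() drops repeated values, length() counts what is left
manager_count <- length(unique(df$manager))
manager_count

#[1] 7
```

Note that unique() treats NA as a value of its own, so a column with missing values counts NA once. n_distinct() behaves the same way unless you set na.rm = TRUE.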

To do the same for every column of the data frame, pass n_distinct to the base R functions sapply or lapply. They differ in output format: sapply returns a named vector, while lapply returns a list.

sapply(df, n_distinct)

#  agent manager    team
#      3       7       2
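To make the difference between the two output formats concrete, here is a small sketch that runs both functions on the example data. It assumes the data frame from above and swaps n_distinct for a base R helper, count_unique, which is a hypothetical name defined only for this example.

```r
# Rebuild the example data frame so the snippet runs on its own
df <- data.frame(
  agent = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Hypothetical helper: base R stand-in for dplyr::n_distinct
count_unique <- function(x) length(unique(x))

# sapply simplifies the result to a named integer vector
counts_vector <- sapply(df, count_unique)

# lapply always returns a list with one element per column
counts_list <- lapply(df, count_unique)

is.vector(counts_vector, mode = "integer")  # TRUE
is.list(counts_list)                        # TRUE

# Individual counts are reached by name in either format
counts_vector[["team"]]
counts_list$manager
```

The vector form is handy for printing or plotting, while the list form is easier to extend when each column might return something more complex than a single number.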

To get a distinct count for some but not all of the columns, subset the data frame to those columns first.

sapply(df[2:3], n_distinct)
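Selecting columns by position breaks if the column order changes, so selecting by name may be safer. The following sketch assumes the example data frame from above and uses base R only; counts_selected is an illustrative name.

```r
# Rebuild the example data frame so the snippet runs on its own
df <- data.frame(
  agent = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Subset by column name, then count unique values in each selected column
counts_selected <- sapply(
  df[c("manager", "team")],
  function(x) length(unique(x))
)
counts_selected
```

The result is a named vector with one entry per selected column: 7 for manager and 2 for team.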

Count unique values in R by group

Counting by group takes one extra step. First remove duplicate rows based on two columns: the grouping column and the column whose values you want to count. Then count the remaining rows in each group.

df %>%
  distinct(agent, team) %>%
  group_by(agent) %>%
  summarize("agent in teams" = n())

#agent `agent in teams`
#<fct>         <int>
#Alma            1
#David           1
#Kate            2
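The grouped count can also be cross-checked without dplyr. This is a minimal base R sketch, assuming the example data frame from above; it counts the unique teams within each agent directly instead of removing duplicates first, and teams_per_agent is an illustrative name.

```r
# Rebuild the example data frame so the snippet runs on its own
df <- data.frame(
  agent = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Split team by agent and count the unique teams in each group
teams_per_agent <- tapply(
  df$team,
  df$agent,
  function(x) length(unique(x))
)
teams_per_agent
```

The result is named by agent: 1 for Alma, 1 for David, and 2 for Kate, matching the dplyr output. Within dplyr itself, grouping by agent and summarizing with n_distinct(team) should give the same counts in a single step.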
