Here is an easy way to count the unique values in one or more columns of a data frame in R.
Here is a data frame.
df <- data.frame(
  agent   = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team    = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)
df
# agent  manager  team
# David  Lisa     Sales
# David  Monica   Sales
# David  Karl     Sales
# Kate   Kianna   Billing
# Kate   Luna     Sales
# Alma   Kaylin   Sales
# Alma   Georgia  Sales
To count the unique values in each column, I will use n_distinct from dplyr.
Unique values in one column.
library(dplyr)
n_distinct(df$manager)
# [1] 7
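One detail worth knowing is how missing values are counted. The sketch below uses a made-up vector x (it is not part of the example data) and assumes dplyr is installed. By default n_distinct treats NA as a value of its own, which matches the base R idiom length(unique(x)); passing na.rm = TRUE leaves it out.

```r
library(dplyr)

# Hypothetical vector with a repeated value and one missing value
x <- c("a", "b", NA, "a")

with_na    <- n_distinct(x)                # NA counted as its own value
without_na <- n_distinct(x, na.rm = TRUE)  # NA ignored
base_count <- length(unique(x))            # base R equivalent of the default

with_na     # 3
without_na  # 2
base_count  # 3
```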
To do this for every column of the data frame, you can use the base R functions sapply or lapply. They differ only in the output format: sapply returns a named vector, while lapply returns a list.
sapply(df, n_distinct)
#   agent manager    team
#       3       7       2
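To make the difference between the two output formats concrete, here is a minimal sketch that rebuilds the example data frame and runs both functions. The names counts_vec and counts_list are only illustrative.

```r
library(dplyr)

df <- data.frame(
  agent   = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team    = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# sapply simplifies the result to a named integer vector
counts_vec <- sapply(df, n_distinct)

# lapply always returns a list with one element per column
counts_list <- lapply(df, n_distinct)

counts_vec[["manager"]]  # 7
counts_list$team         # 2
```

The vector is convenient for printing or plotting, while the list form is easier to pass on to other list-based functions.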
For a distinct count in several, but not all, of the columns, subset the data frame to those columns first.
sapply(df[2:3], n_distinct)
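Selecting columns by position breaks if the column order changes, so selecting by name may be safer. The sketch below shows two possible ways to do that: base R subsetting with a character vector, and dplyr's across(), which assumes dplyr 1.0.0 or later. The names by_name and by_across are only illustrative.

```r
library(dplyr)

df <- data.frame(
  agent   = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team    = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Base R: pick the columns by name, then count distinct values in each
by_name <- sapply(df[c("manager", "team")], n_distinct)

# dplyr: apply n_distinct across the chosen columns; returns a one-row data frame
by_across <- df %>%
  summarise(across(c(manager, team), n_distinct))

by_name
by_across
```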
Count unique values in R by group
This case is a little different. First, remove duplicate rows based on two columns: the grouping column and the column whose values you want to count. Then count the remaining rows in each group.
df %>%
  distinct(agent, team) %>%
  group_by(agent) %>%
  summarize("agent in teams" = n())
# agent `agent in teams`
# <fct>            <int>
# Alma                 1
# David                1
# Kate                 2
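The same per-group count can also be written without the separate distinct() step by calling n_distinct inside summarise. This is an alternative sketch, not the approach used above; the column name teams and the object teams_per_agent are illustrative choices.

```r
library(dplyr)

df <- data.frame(
  agent   = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team    = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Count the distinct teams within each agent group directly
teams_per_agent <- df %>%
  group_by(agent) %>%
  summarise(teams = n_distinct(team))

teams_per_agent
```

Both versions give one row per agent. The n_distinct form is shorter, while the distinct() form makes the deduplication step explicit.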
