Here is an easy way to count the unique values in one or more columns in R.
Here is an example data frame.
    df <- data.frame(
      agent   = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
      manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
      team    = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
    )
    df
    # agent manager    team
    # David    Lisa   Sales
    # David  Monica   Sales
    # David    Karl   Sales
    #  Kate  Kianna Billing
    #  Kate    Luna   Sales
    #  Alma  Kaylin   Sales
    #  Alma Georgia   Sales
To count the unique values in each column, I will use n_distinct from dplyr.
Unique values in one column.
    library(dplyr)
    n_distinct(df$manager)
    # [1] 7
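For comparison, the same count can be obtained in base R with length(unique(x)). The sketch below also shows how missing values are handled. The managers vector is made up for this illustration and is not part of the example data frame.

```r
library(dplyr)

# Hypothetical vector with one repeated value and one missing value
managers <- c("Lisa", "Monica", "Lisa", NA)

# Base R equivalent of n_distinct(): count the unique values (NA counts as one)
base_count <- length(unique(managers))

# n_distinct() also counts NA as a value by default
with_na <- n_distinct(managers)

# na.rm = TRUE drops missing values before counting
without_na <- n_distinct(managers, na.rm = TRUE)

c(base = base_count, with_na = with_na, without_na = without_na)
```

If a column may contain missing values, decide up front whether NA should count as its own category.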
To do this for every column of the data frame, use the base R functions sapply or lapply. They return the same counts in different formats: sapply gives a named vector, while lapply gives a list.
    sapply(df, n_distinct)
    #   agent manager    team
    #       3       7       2
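To make the difference in output format concrete, here is a minimal sketch that runs both functions on a small data frame. The shortened data frame small_df is an assumption made to keep the example self-contained.

```r
library(dplyr)

# Small illustrative data frame (a shortened stand-in for the one above)
small_df <- data.frame(
  agent = c("David", "David", "Kate"),
  team  = c("Sales", "Sales", "Billing")
)

counts_vec  <- sapply(small_df, n_distinct)  # named vector, one element per column
counts_list <- lapply(small_df, n_distinct)  # named list, one element per column

counts_vec[["agent"]]  # extract a single count from the vector
counts_list$team       # extract a single count from the list
```

The vector form is convenient for printing or plotting, while the list form is easier to pass on to other list-based functions.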
To count distinct values in some but not all of the columns, subset the data frame to those columns first, here by position.
    sapply(df[2:3], n_distinct)
    # manager    team
    #       7       2
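Column positions can shift when a data frame changes, so selecting by name may be safer. The sketch below shows two possible variants: subsetting by name in base R, and summarise with across, which assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- data.frame(
  agent   = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  manager = c("Lisa", "Monica", "Karl", "Kianna", "Luna", "Kaylin", "Georgia"),
  team    = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Variant 1: subset by column name, then apply n_distinct to each column
by_name <- sapply(df[c("manager", "team")], n_distinct)

# Variant 2: dplyr only (assumes dplyr >= 1.0.0 for across())
by_across <- df %>%
  summarise(across(c(manager, team), n_distinct))

by_name
by_across
```

The first variant returns a named vector and the second a one-row data frame, so pick whichever fits the next step of your analysis.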
Count unique values in R by group
Counting by group works a little differently. First remove duplicate rows based on two columns, the grouping column and the column whose values you want to count, and then count the remaining rows in each group.
    df %>%
      distinct(agent, team) %>%
      group_by(agent) %>%
      summarize("agent in teams" = n())
    # agent `agent in teams`
    # <chr>            <int>
    # Alma                 1
    # David                1
    # Kate                 2
    # (the agent column shows <fct> in R versions before 4.0)
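As an alternative sketch, the deduplication step can be skipped by calling n_distinct inside summarise, which counts the unique teams within each group directly. The column name teams is chosen here for illustration only.

```r
library(dplyr)

df <- data.frame(
  agent = c("David", "David", "David", "Kate", "Kate", "Alma", "Alma"),
  team  = c("Sales", "Sales", "Sales", "Billing", "Sales", "Sales", "Sales")
)

# Count the distinct teams per agent without calling distinct() first
result <- df %>%
  group_by(agent) %>%
  summarise(teams = n_distinct(team))

result
```

Both approaches should give the same counts. This one is shorter, while the distinct() version makes the intermediate deduplicated table available for inspection.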