Unpivot data in R, pivot longer in R, pivot_longer tidyr, from wide to long in R

Unpivot data in R with the pivot_longer from tidyr

Here are multiple examples with the pivot_longer from tidyr, which is an excellent choice if you want to unpivot data in R and transform the data frame from wide to long.

 

My data frame contains randomly generated values. Here is more about that.

mydates <-
  data.frame("date" = seq.Date(
    from = as.Date("2023-03-01"),
    to = as.Date("2023-03-05"),
    by = 'days'))

set.seed(1234)

d <- sample(seq(as.Date("2023-03-01"), as.Date("2023-03-05"), by = "day"), 
            5, replace = TRUE)
x <- sample(50:100, 5, replace = TRUE)
y <- sample(50:100, 5, replace = TRUE)

df <- data.frame("date" = d, "incoming" = x, "outgoing" = y)


require(dplyr)

df <- left_join(mydates, df, multiple = "all") %>% 
  arrange(date)

df

#        date incoming outgoing
# 1 2023-03-01       83       64
# 2 2023-03-02       87       71
# 3 2023-03-03       NA       NA
# 4 2023-03-04       54       88
# 5 2023-03-04       53       55
# 6 2023-03-05       65       75

 

 

Unpivot data in R

Here is a simple example of how to unpivot in R and put the variable into rows. You can specify columns that you want to unpivot like this or use a range of columns or tidyr functions like the starts_with.

require(tidyr)

df %>% pivot_longer(
  cols = c(incoming, outgoing),
  names_to = "category",
  values_to = "value")

#         date category value
# 1  2023-03-01 incoming    83
# 2  2023-03-01 outgoing    64
# 3  2023-03-02 incoming    87
# 4  2023-03-02 outgoing    71
# 5  2023-03-03 incoming    NA
# 6  2023-03-03 outgoing    NA
# 7  2023-03-04 incoming    54
# 8  2023-03-04 outgoing    88
# 9  2023-03-04 incoming    53
# 10 2023-03-04 outgoing    55
# 11 2023-03-05 incoming    65
# 12 2023-03-05 outgoing    75

 

Alternatively, you can specify a column that you don’t want to unpivot like this. The result is the same.

df %>% pivot_longer(
  cols = !date,
  names_to = "category",
  values_to = "value")

#         date category value
# 1  2023-03-01 incoming    83
# 2  2023-03-01 outgoing    64
# 3  2023-03-02 incoming    87
# 4  2023-03-02 outgoing    71
# 5  2023-03-03 incoming    NA
# 6  2023-03-03 outgoing    NA
# 7  2023-03-04 incoming    54
# 8  2023-03-04 outgoing    88
# 9  2023-03-04 incoming    53
# 10 2023-03-04 outgoing    55
# 11 2023-03-05 incoming    65
# 12 2023-03-05 outgoing    75

 

pivot_longer in R and missing values

There are at least two ways to deal with missing or, in other words, NA values using the pivot_longer. Here is the first one. With the argument values_drop_na, you can remove rows that contain missing values in the result.

df %>% pivot_longer(
  cols = c(incoming, outgoing),
  names_to = "category",
  values_to = "value",
  values_drop_na = TRUE)

#         date category value
# 1  2023-03-01 incoming    83
# 2  2023-03-01 outgoing    64
# 3  2023-03-02 incoming    87
# 4  2023-03-02 outgoing    71
# 5  2023-03-04 incoming    54
# 6  2023-03-04 outgoing    88
# 7  2023-03-04 incoming    53
# 8  2023-03-04 outgoing    55
# 9  2023-03-05 incoming    65
# 10 2023-03-05 outgoing    75

 

If you want to keep rows with missing values and replace them, you can use the argument values_transform. The tidyr contains the function replace_na, and here is how to use that with the pivot_longer.

df %>% pivot_longer(
  cols = c(incoming, outgoing),
  names_to = "category",
  values_to = "value",
  values_transform = list(value = function(x) replace_na(x, 0)))

#         date category value
# 1  2023-03-01 incoming    83
# 2  2023-03-01 outgoing    64
# 3  2023-03-02 incoming    87
# 4  2023-03-02 outgoing    71
# 5  2023-03-03 incoming     0
# 6  2023-03-03 outgoing     0
# 7  2023-03-04 incoming    54
# 8  2023-03-04 outgoing    88
# 9  2023-03-04 incoming    53
# 10 2023-03-04 outgoing    55
# 11 2023-03-05 incoming    65
# 12 2023-03-05 outgoing    75

 

Unpivot data in R and summarize the results

If the data after unpivoting contains values that are necessary to summarise, here is how to do that. As you can see, there are multiple values for the fourth of March for the same categories.

df %>% pivot_longer(
  cols = c(incoming, outgoing),
  names_to = "category",
  values_to = "value")

#         date category value
# 1  2023-03-01 incoming    83
# 2  2023-03-01 outgoing    64
# 3  2023-03-02 incoming    87
# 4  2023-03-02 outgoing    71
# 5  2023-03-03 incoming    NA
# 6  2023-03-03 outgoing    NA
# 7  2023-03-04 incoming    54
# 8  2023-03-04 outgoing    88
# 9  2023-03-04 incoming    53
# 10 2023-03-04 outgoing    55
# 11 2023-03-05 incoming    65
# 12 2023-03-05 outgoing    75

 

You can use dplyr functions like the group_by and summarize and sum what is necessary.

df %>% pivot_longer(
  cols = c(incoming, outgoing),
  names_to = "category",
  values_to = "value") %>% 
  group_by(date, category) %>% 
  summarise("value" = sum(value))

#         date category value
# 1  2023-03-01 incoming    83
# 2  2023-03-01 outgoing    64
# 3  2023-03-02 incoming    87
# 4  2023-03-02 outgoing    71
# 5  2023-03-03 incoming    NA
# 6  2023-03-03 outgoing    NA
# 7  2023-03-04 incoming   107
# 8  2023-03-04 outgoing   143
# 9  2023-03-05 incoming    65
# 10 2023-03-05 outgoing    75

 

Here is another post if you want to do the opposite and pivot data in R from long to wide with the pivot_wider from tidyr.


Posted

in

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *