previous row in R data.table, next row in R data.table, next record by group in data.table, previous record by group data.table

Value from next or previous row in R data frame using data.table

If you are working with large datasets and want to get value from the next or previous row in R, try to use the data.table. It is possible to do the same with base R or dplyr, but it might be too slow. Basically, this approach with the data.table to get value from the next or previous row in R is useful to do that fast.

 

Here is my data set. The code for this dataset might look a little clunky, but if you have a table in Excel, it is easy to copy that into R and transform the data frame into reproducible code.

df <- structure(list(year = c(2021L, 2022L, 2021L, 2022L, 2021L, 2022L, 2021L, 2022L), 
                     salesperson = c("Jack", "Jack", "Jack", "Jack", 
                                    "Emma", "Emma", "Emma", "Emma"), 
                     category = c("A", "A", "B", "B", "A", "A", "B", "B"), 
                     value = c(21L, 100L, 46L, 33L, 16L, 80L, 52L, 23L)), 
                     class = "data.frame", row.names = c(NA, -8L))

df

#  year salesperson category value
# 1 2021        Jack        A    21
# 2 2022        Jack        A   100
# 3 2021        Jack        B    46
# 4 2022        Jack        B    33
# 5 2021        Emma        A    16
# 6 2022        Emma        A    80
# 7 2021        Emma        B    52
# 8 2022        Emma        B    23

 

To get the next or previous value in R, you can use the shift function with vectors, lists, data frames, or data tables.

Notice that this is a data frame object.

class(df)

#[1] "data.frame"

If you want to use it as data.table, do this transformation first.

require(data.table)

df <- as.data.table(df)

 

Value from the next or previous row in the R data frame using data.table

Here is how to get value from the previous row using the shift function with the data frame object.

df$prev_value <- shift(df$value, 1, type = "lag")

df
#  year salesperson category value prev_value
# 1 2021        Jack        A    21         NA
# 2 2022        Jack        A   100         21
# 3 2021        Jack        B    46        100
# 4 2022        Jack        B    33         46
# 5 2021        Emma        A    16         33
# 6 2022        Emma        A    80         16
# 7 2021        Emma        B    52         80
# 8 2022        Emma        B    23         52

Here is how it looks with the data table object. The last two square brackets are necessary if you want to create a new column and look at the result.

df[, prev_value := shift(value, 1, type = "lag")][]

#   year salesperson category value prev_value
# 1: 2021        Jack        A    21         NA
# 2: 2022        Jack        A   100         21
# 3: 2021        Jack        B    46        100
# 4: 2022        Jack        B    33         46
# 5: 2021        Emma        A    16         33
# 6: 2022        Emma        A    80         16
# 7: 2021        Emma        B    52         80
# 8: 2022        Emma        B    23         52

Here is how to get the next record.

#in data.frame
df$next_value <- shift(df$value, 1, type = "lead")

df
#   year salesperson category value prev_value next_value
# 1: 2021        Jack        A    21         NA        100
# 2: 2022        Jack        A   100         21         46
# 3: 2021        Jack        B    46        100         33
# 4: 2022        Jack        B    33         46         16
# 5: 2021        Emma        A    16         33         80
# 6: 2022        Emma        A    80         16         52
# 7: 2021        Emma        B    52         80         23
# 8: 2022        Emma        B    23         52         NA

#in data.table
df[, next_value := shift(value, 1, type = "lead")]

 

R data.table previous or next record by the group

Here is how to use one grouping variable to get previous or next values.

Firstly, I will quickly drop unnecessary columns.

df$prev_value <- df$next_value <- NULL

df[, prev_value := shift(value, 1, type = "lag"), by = salesperson][]

#   year salesperson category value prev_value
# 1: 2021        Jack        A    21         NA
# 2: 2022        Jack        A   100         21
# 3: 2021        Jack        B    46        100
# 4: 2022        Jack        B    33         46
# 5: 2021        Emma        A    16         NA
# 6: 2022        Emma        A    80         16
# 7: 2021        Emma        B    52         80
# 8: 2022        Emma        B    23         52


df[, next_value := shift(value, 1, type = "lead"), by = salesperson][]

#   year salesperson category value prev_value next_value
# 1: 2021        Jack        A    21         NA        100
# 2: 2022        Jack        A   100         21         46
# 3: 2021        Jack        B    46        100         33
# 4: 2022        Jack        B    33         46         NA
# 5: 2021        Emma        A    16         NA         80
# 6: 2022        Emma        A    80         16         52
# 7: 2021        Emma        B    52         80         23
# 8: 2022        Emma        B    23         52         NA

If you have multiple grouping variables, here is how to implement that.

df$prev_value <- df$next_value <- NULL

df[, prev_value := shift(value, 1, type = "lag"), by = list(salesperson, category)][]

#   year salesperson category value prev_value
# 1: 2021        Jack        A    21         NA
# 2: 2022        Jack        A   100         21
# 3: 2021        Jack        B    46         NA
# 4: 2022        Jack        B    33         46
# 5: 2021        Emma        A    16         NA
# 6: 2022        Emma        A    80         16
# 7: 2021        Emma        B    52         NA
# 8: 2022        Emma        B    23         52

df[, next_value := shift(value, 1, type = "lead"), by = list(salesperson, category)][]

#   year salesperson category value prev_value next_value
# 1: 2021        Jack        A    21         NA        100
# 2: 2022        Jack        A   100         21         NA
# 3: 2021        Jack        B    46         NA         33
# 4: 2022        Jack        B    33         46         NA
# 5: 2021        Emma        A    16         NA         80
# 6: 2022        Emma        A    80         16         NA
# 7: 2021        Emma        B    52         NA         23
# 8: 2022        Emma        B    23         52         NA

Posted

in

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *