You can calculate the moving average (also called a running or rolling average) in different ways by using R packages.
Running average with dplyr
Here is one scenario that can be handled with dplyr. I will use the built-in R dataset airquality.
head(airquality)
#   Ozone Solar.R Wind Temp Month Day
# 1    41     190  7.4   67     5   1
# 2    36     118  8.0   72     5   2
# 3    12     149 12.6   74     5   3
# 4    18     313 11.5   62     5   4
# 5    NA      NA 14.3   56     5   5
# 6    28      NA 14.9   66     5   6
It contains one wind measurement for every day of each month. Let’s say I want to calculate the running average within each month.
With dplyr, this can be done with plain arithmetic. I will create a temporary column rec filled with 1s, so that cumsum(rec) counts the rows seen so far in the month. Dividing cumsum(Wind) by cumsum(rec) then gives the average wind speed up to and including each day. Note that this is a cumulative average: the window grows from the first day of the month instead of keeping a fixed width.
library(dplyr)

airquality <- airquality %>%
  group_by(Month) %>%
  mutate(rec = 1) %>%                               # helper column of 1s
  mutate(rollavg = cumsum(Wind) / cumsum(rec)) %>%  # running sum / running count
  select(-rec) %>%
  ungroup()                                         # drop the grouping again

head(as.data.frame(airquality))
#   Ozone Solar.R Wind Temp Month Day   rollavg
# 1    41     190  7.4   67     5   1  7.400000
# 2    36     118  8.0   72     5   2  7.700000
# 3    12     149 12.6   74     5   3  9.333333
# 4    18     313 11.5   62     5   4  9.875000
# 5    NA      NA 14.3   56     5   5 10.760000
# 6    28      NA 14.9   66     5   6 11.450000
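As a cross-check that needs no helper column and no extra packages, the same per-month cumulative average can be sketched in base R with ave(). This is an alternative formulation, not the dplyr pipeline itself; the names aq and cum_avg are only illustrative.

```r
# Work on a fresh copy of the built-in dataset
aq <- datasets::airquality

# Within each Month: running sum of Wind divided by the running row count
aq$cum_avg <- ave(aq$Wind, aq$Month,
                  FUN = function(x) cumsum(x) / seq_along(x))

# Rows 30 to 33 span the May/June boundary, where the average restarts
aq[30:33, c("Month", "Day", "Wind", "cum_avg")]
```

On the first day of each month the cumulative average equals that day's wind speed, which is a quick way to confirm that the grouping works.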
If you don’t like a lot of decimal places, you can use rounding or formatting. Keep in mind that format returns character strings, so use round instead if you still need to calculate with the column.
airquality$rollavg <- format(airquality$rollavg, digits = 3)

head(as.data.frame(airquality))
#   Ozone Solar.R Wind Temp Month Day rollavg
# 1    41     190  7.4   67     5   1    7.40
# 2    36     118  8.0   72     5   2    7.70
# 3    12     149 12.6   74     5   3    9.33
# 4    18     313 11.5   62     5   4    9.88
# 5    NA      NA 14.3   56     5   5   10.76
# 6    28      NA 14.9   66     5   6   11.45
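The difference between the two options is easy to miss, so here is a small sketch on a made-up vector x (not taken from the dataset): round() keeps the values numeric, while format() turns them into text meant for display.

```r
x <- c(7.4, 7.7, 28 / 3, 9.875)

rounded   <- round(x, 2)            # still numeric, can be averaged or plotted
formatted <- format(x, digits = 3)  # character strings, for display only

is.numeric(rounded)
is.character(formatted)

# Arithmetic works on the rounded values; on the formatted strings,
# mean() gives NA with a warning
mean(rounded)
```

If the column feeds into later calculations, rounding at the very end (or only when printing) avoids surprises.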
Moving, rolling average in R
One of the best ways to calculate a rolling average, or any other rolling statistic, in R is the RcppRoll package. It provides a family of functions whose names start with “roll_”, such as roll_mean, roll_min and roll_max, for the rolling average, minimum, maximum and so on. The window size is a parameter, so you can calculate a 7-day rolling average, a 14-day rolling average, etc. You can also use these functions inside dplyr mutate, like cumsum in the previous example.
A 7-day moving average in R looks like this. The first six rows are NA because a full 7-day window is not yet available.
library(RcppRoll)

# Trailing 7-day window; positions without a full window are filled with NA
airquality$d7_rollavg <- roll_mean(airquality$Wind, n = 7, align = "right", fill = NA)
airquality$d7_rollavg <- format(airquality$d7_rollavg, digits = 3)

head(as.data.frame(airquality), n = 10)
#    Ozone Solar.R Wind Temp Month Day rollavg d7_rollavg
# 1     41     190  7.4   67     5   1    7.40         NA
# 2     36     118  8.0   72     5   2    7.70         NA
# 3     12     149 12.6   74     5   3    9.33         NA
# 4     18     313 11.5   62     5   4    9.88         NA
# 5     NA      NA 14.3   56     5   5   10.76         NA
# 6     28      NA 14.9   66     5   6   11.45         NA
# 7     23     299  8.6   65     5   7   11.04      11.04
# 8     19      99 13.8   59     5   8   11.39      11.96
# 9      8      19 20.1   61     5   9   12.36      13.69
# 10    NA     194  8.6   69     5  10   11.98      13.11
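Note that the call above slides the window over the whole Wind column, so the first days of June and later months borrow values from the end of the previous month. If that is not what you want, the window can be kept inside each month by calling roll_mean in a grouped mutate. This is a minimal sketch, assuming dplyr and RcppRoll are installed; the names aq_roll and d7_by_month are made up for illustration.

```r
library(dplyr)
library(RcppRoll)

# Start from a fresh copy of the built-in dataset
aq_roll <- datasets::airquality %>%
  group_by(Month) %>%
  # Trailing 7-day mean, restarted at the beginning of every month
  mutate(d7_by_month = roll_mean(Wind, n = 7, align = "right", fill = NA)) %>%
  ungroup()

# Rows 29 to 34 span the May/June boundary
as.data.frame(aq_roll)[29:34, c("Month", "Day", "Wind", "d7_by_month")]
```

With this version the first six days of every month are NA, not only the first six rows of the dataset.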