Plot moving average in R using ggplot2

Here is how to plot the moving average (also called the rolling average or running average) in R using ggplot2 and show the actual data alongside it in different ways. Seeing the data around the moving average makes the trend easier to follow and can help you spot anomalies in a time series.

 

Here is data from the AirPassengers dataset.

df <- data.frame(Year = as.numeric(trunc(time(AirPassengers))), 
                 Month = month.abb[cycle(AirPassengers)], 
                 AirPassengers = as.numeric(AirPassengers))

df$Month <- factor(df$Month, levels = unique(df$Month))

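library(lubridate)

# make_date() expects numeric months, so pass the factor's integer codes
# (the levels are in calendar order: Jan = 1, ..., Dec = 12)
df$date <- make_date(df$Year, as.integer(df$Month), 1)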

head(df)

#  Year Month AirPassengers       date
# 1 1949   Jan           112 1949-01-01
# 2 1949   Feb           118 1949-02-01
# 3 1949   Mar           132 1949-03-01
# 4 1949   Apr           129 1949-04-01
# 5 1949   May           121 1949-05-01
# 6 1949   Jun           135 1949-06-01
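If you would rather not depend on lubridate for the date column, the same monthly dates can be built in base R. This is only a sketch: the name dates_base is made up for illustration, and it assumes the series is monthly, so that start() returns a year and a month.

```r
# Base-R sketch: first-of-month dates for the AirPassengers series.
# dates_base is a hypothetical name, not part of the original post.
st <- start(AirPassengers)  # start year and start month of the series

dates_base <- seq(
  as.Date(sprintf("%d-%02d-01", st[1], st[2])),  # first day of the first month
  by = "month",
  length.out = length(AirPassengers)
)

head(dates_base)
```

The result can be assigned to df$date in place of the make_date call.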

This data shows monthly totals of international airline passengers. The series is seasonal, so I will calculate the moving average with a 12-month window, which means each average covers one full yearly cycle.

I will calculate simple moving averages with the rollmean function from the zoo package. It is also a good idea to try the tidyquant package, which has a geom for moving averages (geom_ma) and supports several types of moving average.
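To make clear what a trailing 12-month simple moving average is, here is a small base R sketch that does not need zoo. It uses stats::filter with sides = 1, and the names x, ma_base and ma_manual are chosen only for this illustration. The values should match what rollmean returns with a window of 12 and align = "right".

```r
# Base-R sketch of a trailing 12-month simple moving average.
x <- as.numeric(AirPassengers)

# sides = 1 uses only the current value and the 11 values before it,
# so the first 11 results are NA
ma_base <- as.numeric(stats::filter(x, rep(1 / 12, 12), sides = 1))

# Manual check for the 12th month: the mean of the first 12 observations
ma_manual <- mean(x[1:12])

head(ma_base, 13)
all.equal(ma_base[12], ma_manual)
```

A trailing window is used here because it only looks at past values. A centered window (sides = 2) would follow the data more closely, but it would leave missing values at both ends of the series.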

 

Plot moving average in R using ggplot2

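Here is a simple way to plot the moving average using ggplot2 and the rollmean function. With align = "right", each point is the mean of the current month and the 11 months before it, so the first 11 values are NA.

library(ggplot2)
library(zoo)

ggplot(df, aes(x = date,
               y = AirPassengers)) +
  geom_line(color = "cadetblue", linewidth = 1) +
  # fill = NA pads the start of the series (na.pad is deprecated in zoo)
  geom_line(aes(y = rollmean(
    AirPassengers, 12, fill = NA, align = "right")), linewidth = 1) +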
  theme_minimal() +
  theme(axis.title = element_blank())

If you want to show the actual data above and below the moving average in different colors, you can use geom_ribbon.
First, add the required columns to the data frame. There will be two ribbons: one for the values above the rolling average and one for the values below. Each ribbon boundary follows the actual data on its own side of the moving average and falls back to the moving average everywhere else.

# fill = NA pads the first 11 months (na.pad is deprecated in zoo)
df$MA <- rollmean(df$AirPassengers, 12, fill = NA, align = "right")

df$above_MA <- ifelse(df$AirPassengers >= df$MA, df$AirPassengers, df$MA)
df$below_MA <- ifelse(df$AirPassengers < df$MA, df$AirPassengers, df$MA)

df[10:15, ]

#   Year Month AirPassengers       date       MA above_MA below_MA
# 10 1949   Oct           119 1949-10-01       NA       NA       NA
# 11 1949   Nov           104 1949-11-01       NA       NA       NA
# 12 1949   Dec           118 1949-12-01 126.6667 126.6667 118.0000
# 13 1950   Jan           115 1950-01-01 126.9167 126.9167 115.0000
# 14 1950   Feb           126 1950-02-01 127.5833 127.5833 126.0000
# 15 1950   Mar           141 1950-03-01 128.3333 141.0000 128.3333
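The two ifelse calls amount to taking the element-wise maximum and minimum of the actual data and the moving average. The sketch below checks this with pmax and pmin. It is self-contained, so it recomputes the moving average in base R with stats::filter rather than zoo, and the variable names are made up for the example.

```r
# Sketch: ribbon boundaries via pmax()/pmin() instead of ifelse().
x  <- as.numeric(AirPassengers)
ma <- as.numeric(stats::filter(x, rep(1 / 12, 12), sides = 1))  # trailing 12-month mean

# Boundaries as computed with ifelse()
above_ifelse <- ifelse(x >= ma, x, ma)
below_ifelse <- ifelse(x <  ma, x, ma)

# The same boundaries as element-wise maximum and minimum;
# NA in the moving average stays NA in both versions
above_pmax <- pmax(x, ma)
below_pmin <- pmin(x, ma)

all.equal(above_ifelse, above_pmax)
all.equal(below_ifelse, below_pmin)
```

In the data frame this would read df$above_MA <- pmax(df$AirPassengers, df$MA) and df$below_MA <- pmin(df$AirPassengers, df$MA).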

The resulting plot looks like this.

ggplot(df, aes(x = date,
               y = MA)) +
  geom_line(color = "cadetblue",
            linewidth = 1,
            linetype = 2) +
  geom_ribbon(aes(ymin = below_MA, ymax = MA),
              alpha = 0.2,
              fill = "tomato") +
  geom_ribbon(aes(ymin = MA, ymax = above_MA),
              alpha = 0.2,
              fill = "steelblue") +
  theme_minimal() +
  theme(axis.title = element_blank())

Plot data points around the rolling average in R

You can also plot the actual data as points behind the moving average line. A color gradient then shows how far each value lies below or above the moving average. To do that, calculate the difference between the actual data and the moving average.

df$diff <- df$AirPassengers - df$MA
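Since the difference is what drives the color gradient, it can help to look at it directly before plotting, for example to see which months lie furthest from the moving average. The following is a self-contained sketch in base R. The data frame name out and the way the moving average is recomputed with stats::filter are assumptions made for this illustration.

```r
# Sketch: rank months by their distance from the trailing 12-month average.
x  <- as.numeric(AirPassengers)
ma <- as.numeric(stats::filter(x, rep(1 / 12, 12), sides = 1))

out <- data.frame(
  date = seq(as.Date("1949-01-01"), by = "month", length.out = length(x)),
  AirPassengers = x,
  MA = ma,
  diff = x - ma
)

# Drop the first 11 months, where the moving average is not defined
out <- out[!is.na(out$diff), ]

# Five months with the largest absolute gap to the moving average
head(out[order(-abs(out$diff)), ], 5)

# Range of the differences, useful when judging the color scale
range(out$diff)
```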

After that, map the difference to the point color in geom_point.

ggplot(df, aes(x = date, y = MA)) +
  geom_point(aes(x = date, y = AirPassengers, color = diff),
             size = 2,
             shape = 16) +
  scale_colour_gradient2(
    midpoint = 0,
    low = "tomato",
    high = "steelblue",
    na.value = "transparent",
    guide = "none") +
  geom_line(color = "cadetblue",
            linewidth = 1,
            linetype = 2) +
  theme_minimal() +
  theme(axis.title = element_blank())

If you want to see more examples of how to implement color gradients in ggplot2, look at this post about color gradients in the jitter plot or gradient line chart.
