Here is a simple way to change the timezone of a datetime column in an R data frame. If the timezone is missing or differs between columns, your calculations can go wrong, most often as incorrect time differences.
This often happens when you combine multiple data frames. Here is one example: R time-shift problem after merging data.
What is the problem?
Here is my data frame.
ndf <- structure(
  list(
    dt1 = structure(
      c(1601894367.723, 1601894396.223, 1601894367.723, 1601894396.223),
      class = c("POSIXct", "POSIXt"),
      tzone = "UTC"
    ),
    dt2 = structure(
      c(1601894451.737, 1601894451.737, 1601895005.243, 1601895005.243),
      class = c("POSIXct", "POSIXt"),
      tzone = ""
    )
  ),
  row.names = c(NA, -4L),
  class = "data.frame"
)

ndf
#                  dt1                 dt2
#1 2020-10-05 10:39:27 2020-10-05 13:40:51
#2 2020-10-05 10:39:56 2020-10-05 13:40:51
#3 2020-10-05 10:39:27 2020-10-05 13:50:05
#4 2020-10-05 10:39:56 2020-10-05 13:50:05
When I calculate time differences, the results look incorrect. Judging by the displayed times, the difference should be about three hours, not a few minutes.
ndf$diff <- difftime(ndf$dt1, ndf$dt2, units = 'hours')

ndf
#                  dt1                 dt2              diff
#1 2020-10-05 10:39:27 2020-10-05 13:40:51 -0.02333722 hours
#2 2020-10-05 10:39:56 2020-10-05 13:40:51 -0.01542056 hours
#3 2020-10-05 10:39:27 2020-10-05 13:50:05 -0.17708889 hours
#4 2020-10-05 10:39:56 2020-10-05 13:50:05 -0.16917222 hours
I can see that the datetime columns have different timezones, which causes the problem. One has a UTC timezone; the other has none, so it is displayed in the local timezone.
attr(ndf$dt1, "tzone")
#[1] "UTC"
attr(ndf$dt2, "tzone")
#[1] ""
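To see what the `tzone` attribute actually does, here is a minimal sketch. The variables `x_utc` and `x_other` and the zone `Europe/Helsinki` are made up for illustration and are not part of the data above. The attribute only controls how an instant is printed, while `difftime` works on the underlying seconds since the epoch.

```r
# One instant, stored as seconds since 1970-01-01 UTC
# (the whole-second part of the first dt1 value).
x_utc <- as.POSIXct(1601894367, origin = "1970-01-01", tz = "UTC")

# Copy it and change only the display timezone.
# "Europe/Helsinki" is an assumed zone, picked because it was UTC+3 on that date.
x_other <- x_utc
attr(x_other, "tzone") <- "Europe/Helsinki"

format(x_utc)    # "2020-10-05 10:39:27"
format(x_other)  # "2020-10-05 13:39:27"

# Different clock times on screen, but the same instant underneath.
difftime(x_utc, x_other, units = "hours")
# Time difference of 0 hours
```

So two columns can look three hours apart while `difftime` reports only the gap between the stored instants.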
Change datetime timezone in R
You can change the timezone like this. The column's displayed clock time is kept and reinterpreted as UTC.
ndf$dt2 <- as.POSIXct(format(ndf$dt2), tz = "UTC")
Once I change the timezone of that column, the calculation works fine.
ndf$diff <- difftime(ndf$dt1, ndf$dt2, units = 'hours')

ndf
# With the UTC+3 local timezone seen in the first printout, the diff column
# now shows roughly -3.02, -3.02, -3.18 and -3.17 hours, which matches the
# displayed clock times.
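There are two different things you might mean by "changing the timezone", and the sketch below contrasts them. The variable names and the source zone `Europe/Helsinki` are assumptions chosen for illustration. Relabelling keeps the instant and changes how it is printed; reinterpreting, which is what the `format()` approach does, keeps the printed clock time and moves the instant.

```r
# A clock time recorded in an assumed local zone (UTC+3 on this date).
x <- as.POSIXct("2020-10-05 13:40:51", tz = "Europe/Helsinki")

# (a) Relabel: same instant, now displayed in UTC.
relabelled <- x
attr(relabelled, "tzone") <- "UTC"
format(relabelled)     # "2020-10-05 10:40:51"

# (b) Reinterpret: same clock time, now read as UTC, so the instant moves.
reinterpreted <- as.POSIXct(format(x), tz = "UTC")
format(reinterpreted)  # "2020-10-05 13:40:51"

difftime(reinterpreted, x, units = "hours")
# Time difference of 3 hours
```

Use (b) only when the stored clock times were really recorded in the target timezone and merely carry the wrong label. Also note that `format()` drops fractional seconds unless the `digits.secs` option is set, so a column converted this way may lose its milliseconds.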
If you want to remove the timezone from a POSIXct object, leave the tz parameter empty.
ndf$dt2 <- as.POSIXct(format(ndf$dt2), tz = "")
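After a conversion, or right after merging data frames, it can be worth checking that every datetime column carries the timezone you expect. The helper below is a hypothetical sketch (`tz_of_columns` is not a base R function), and the small data frame `df` is invented for the example.

```r
# Return the tzone attribute of every POSIXct column in a data frame.
# An empty string means "no timezone set", i.e. displayed in local time.
tz_of_columns <- function(df) {
  is_dt <- vapply(df, inherits, logical(1), what = "POSIXct")
  vapply(df[is_dt], function(col) {
    tz <- attr(col, "tzone")
    if (is.null(tz)) "" else tz[1]
  }, character(1))
}

# Example data frame, made up for illustration.
df <- data.frame(
  a = as.POSIXct("2020-10-05 10:39:27", tz = "UTC"),
  b = as.POSIXct("2020-10-05 13:40:51"),  # no tz given, so tzone is ""
  n = 1
)

tz_of_columns(df)
#     a     b
# "UTC"    ""
```

If the returned values are not all the same, time differences between those columns deserve a second look.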