I use rolling and running functions almost daily with financial time series. In my post A Whole New World with Chains and Pipes, I made this statement:
I have noticed that rolling analysis with xts can sometimes be slow. as.matrix is my favorite way to speed things up, since I usually do not need xts powerful indexing and subsetting features.
I felt like I should be a little more thorough, so I put together a couple of speed tests for running and rolling functions. Please let me know if there are ways to make these even speedier.
One method I had not seen was provided in the comments, so I promoted it to the body of this post.
Hi, there is also a nice benchmark (for variable window width) provided here: http://stackoverflow.com/quest...
require(microbenchmark)
require(ggplot2)
require(xts)
require(quantmod)
require(PerformanceAnalytics)
require(dplyr);require(magrittr)
require(Gmisc) #for pretty html tables
Random Data | matrix, data.frame, and xts with dplyr
For our tests, let's make some random data with 10 columns of 100,000 rows. If this is daily financial data, 100,000 takes us back to about 1740. I wish I had good data back to then. We'll create a matrix, data.frame, and xts object.
playData.matrix <- matrix(runif(1000000,-.03,.03), ncol=10)
playData.df <-
playData.matrix %>%
as.data.frame %>%
tbl_df
#make some dates by adding days to the first day of 1740
#in a playful magrittr way (please don't do this at home)
playDates <- 1:(playData.matrix %>% nrow) %>% as.Date(origin="1740-01-01")
#make an xts dataset with our playData with index playDates
playData.xts <-
playData.matrix %>%
as.xts(order.by=playDates)
Running | Beauty of Fortran
The TTR authors have made the running functions really speedy with Fortran and C.
mb_runMean <- microbenchmark(
runMean_matrix = runMean(playData.matrix, n = 50 )
,runMean_df = runMean( playData.df, n = 50 )
,runMean_xts = runMean( playData.xts, n = 50 )
#show the beauty of Fortran in the above runMean calcs
,runMean_rollapply = rollapply(playData.matrix, width = 50, by = 1, FUN="mean")
,times=10L)
runMean microbenchmark results
summary | expr | min | lq | median | uq | max | neval |
---|---|---|---|---|---|---|---|
1 | runMean_matrix | 7.83479 | 8.328931 | 10.187089 | 15.631568 | 20.428914 | 10 |
2 | runMean_df | 415.200526 | 436.36685 | 463.7532645 | 504.178149 | 565.733628 | 10 |
3 | runMean_xts | 19.379056 | 22.689037 | 27.419294 | 31.679356 | 34.873025 | 10 |
4 | runMean_rollapply | 40530.101527 | 41679.322495 | 42675.2747235 | 43917.396031 | 47993.732396 | 10 |
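To give a feel for why a dedicated running function can beat a generic rolling apply by such a wide margin, here is a minimal base R sketch. It assumes a cumulative-sum approach, which is one common way to get each window's mean in constant time; it is not TTR's actual compiled code, and `run_mean_cumsum` is a hypothetical helper name made up for this illustration.

```r
# Hypothetical helper (not part of TTR): running mean via cumulative sums.
# Each window sum is the difference of two cumulative sums, so the whole
# series costs one pass instead of one mean() call per window.
run_mean_cumsum <- function(x, n) {
  stopifnot(is.numeric(x), n >= 1, n <= length(x))
  cs <- c(0, cumsum(x))               # cs[k + 1] holds sum(x[1:k])
  out <- rep(NA_real_, length(x))     # first n - 1 positions have no full window
  idx <- n:length(x)                  # positions where a full window ends
  out[idx] <- (cs[idx + 1] - cs[idx - n + 1]) / n
  out
}

set.seed(42)
x <- runif(1000, -0.03, 0.03)

fast <- run_mean_cumsum(x, n = 50)

# Naive reference: recompute the mean of every window from scratch
slow <- c(rep(NA_real_, 49),
          sapply(50:1000, function(i) mean(x[(i - 49):i])))

# The two versions should agree up to floating point error
print(all.equal(fast, slow))
```

The naive version touches every element of every window, while the cumulative-sum version touches each element once, which is the kind of saving a specialized running function can exploit.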
Rolling | I Miss Fortran
Rolling functions get a lot slower without Fortran and C. However, we can convert with as.matrix or as.numeric to speed things up a little bit. Once we're done calculating we will need to convert back to xts with a little date logic.
mb_rollapply <- microbenchmark(
  ra_matrix = rollapply( playData.matrix[,1:2], width = 100 , by = 1, FUN="Omega" )
  ,ra_matrix2 = apply(
    playData.matrix[,1:2],
    MARGIN=2,
    FUN=function(col){ return(rollapply(col,width = 100 , by = 1, FUN="Omega" )) }
  )
  ,ra_xts = rollapply( playData.xts[,1:2], width = 100 , by = 1, FUN="Omega" )
  ,ra_xts2 = apply(
    playData.xts[,1:2],
    MARGIN=2,
    FUN=function(col){
      return(
        rollapply(
          as.numeric(col)
          ,width=100
          ,by=1
          ,FUN="Omega"
        )
      )
    }
  )
,times=2L)
rollapply microbenchmark results
summary | expr | min | lq | median | uq | max | neval |
---|---|---|---|---|---|---|---|
1 | ra_matrix | 26.001762182 | 26.001762182 | 26.8966084855 | 27.791454789 | 27.791454789 | 2 |
2 | ra_matrix2 | 22.331454712 | 22.331454712 | 24.8568809085 | 27.382307105 | 27.382307105 | 2 |
3 | ra_matrix3 | 25.255073874 | 25.255073874 | 26.3310039035 | 27.406933933 | 27.406933933 | 2 |
4 | ra_xts | 308.954120943 | 308.954120943 | 309.3609896875 | 309.767858432 | 309.767858432 | 2 |
5 | ra_xts2 | 27.154306816 | 27.154306816 | 28.6303421355 | 30.106377455 | 30.106377455 | 2 |
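The "little date logic" for getting back to xts after rolling on a plain matrix might look something like the sketch below. It is only an illustration under a few assumptions: it uses `mean` instead of `Omega` so that it does not need PerformanceAnalytics, it uses small made-up data rather than `playData`, and it stamps each window with its last date (right alignment), which may or may not be the alignment you want. The names `m`, `dates`, `rolled`, and `rolled_xts` are made up for the example.

```r
library(xts)  # also attaches zoo, which provides rollapply

set.seed(1)
m     <- matrix(runif(500, -0.03, 0.03), ncol = 2)   # 250 rows, 2 columns
dates <- as.Date(1:nrow(m), origin = "1740-01-01")
width <- 100

# Roll on the plain matrix; without fill, rollapply returns only the
# complete windows, so the result has nrow(m) - width + 1 rows
rolled <- rollapply(m, width = width, FUN = mean, by.column = TRUE)

# Date logic: row k of `rolled` covers input rows k..(k + width - 1),
# so stamping it with the window's last date means taking the tail of dates
rolled_xts <- xts(rolled, order.by = tail(dates, nrow(rolled)))

print(dim(rolled_xts))
print(head(index(rolled_xts), 3))
```

If you prefer the result to line up with the full original index, another option is to merge `rolled_xts` back onto an xts object built from all of `dates`, which leaves NA in the first `width - 1` rows.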
Hi, thanks for the great post. Perhaps you could comment on this? http://stackoverflow.com/questions/24647075/computing-a-weighted-rolling-average-r?noredirect=1#comment38206679_24647075
Very interesting; thanks, I might just try it out. For now, I will definitely add it to the body of the post, so folks are aware.