Tuesday, July 8, 2014

Speed Tests for Rolling/Running Functions

I use rolling and running functions almost daily with financial time series. In my post A Whole New World with Chains and Pipes, I made this statement:

I have noticed that rolling analysis with xts can sometimes be slow. as.matrix is my favorite way to speed things up, since I usually do not need xts powerful indexing and subsetting features.

I felt like I should be a little more thorough, so I put together a couple of speed tests for running and rolling functions. Please let me know if there are ways to make these even speedier.

One method I had not seen was provided in the comments, so I promoted it to the body of this post.

Hi, there is also a nice benchmark (for variable window width) provided here: http://stackoverflow.com/quest...

require(microbenchmark)
require(ggplot2)

require(xts)
require(quantmod)
require(PerformanceAnalytics)

require(dplyr);require(magrittr)

require(Gmisc) #for pretty html tables


Random Data | matrix, data.frame, and xts with dplyr



For our tests, let's make some random data with 10 columns of 100,000 rows. If this is daily financial data, 100,000 takes us back to about 1740. I wish I had good data back to then. We'll create a matrix, data.frame, and xts object.

playData.matrix <- matrix(runif(1000000,-.03,.03), ncol=10)

playData.df <-
playData.matrix %>%
as.data.frame %>%
tbl_df

#make some dates by adding days to the first day of 1740
#in a playful magrittr way (please don't do this at home)
playDates <- 1:(playData.matrix %>% nrow) %>% as.Date(origin="1740-01-01")


#make an xts dataset with our playData with index playDates
playData.xts <-
playData.matrix %>%
as.xts(order.by=playDates)
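As a quick check on the date arithmetic, here is a minimal sketch using only base R. The publication date of this post is used as the reference day, and the names today, startIfBackward, and endIfForward are just for illustration. It shows where 100,000 days lands counting backward from the post date, and where the series above ends when counting forward from 1740-01-01.

```r
# reference day: the date of this post
today <- as.Date("2014-07-08")

# counting 100,000 days backward lands in 1740
startIfBackward <- today - 100000

# the series above counts forward from 1740-01-01 instead,
# so its last date falls a little before the post date
endIfForward <- as.Date(100000, origin = "1740-01-01")

format(startIfBackward, "%Y")
format(endIfForward, "%Y")
```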


Running | Beauty of Fortran



The TTR authors have made the running functions really speedy with Fortran and C.

mb_runMean <- microbenchmark(
runMean_matrix = runMean(playData.matrix, n = 50 )
,runMean_df = runMean( playData.df, n = 50 )
,runMean_xts = runMean( playData.xts, n = 50 )
#show the beauty of Fortran in the above runMean calcs
,runMean_rollapply = rollapply(playData.matrix, width = 50, by = 1, FUN="mean")
,times=10L)
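Before comparing timings, it helps to confirm that the two approaches compute the same thing. The sketch below is an illustration on a small hypothetical vector (x, n, fast, and slow are names I made up for it, not part of the benchmark). It assumes a right-aligned window for rollapply, since runMean reports each mean at the end of its window. I use a single series here because, as far as I know, newer TTR releases only accept univariate input.

```r
library(TTR)
library(zoo)

set.seed(1)
x <- runif(200, -0.03, 0.03)  # hypothetical single series
n <- 50

# compiled running mean from TTR; the first n - 1 values are NA
fast <- runMean(x, n = n)

# generic rolling mean from zoo, right-aligned and padded with NA
# so the output lines up with runMean
slow <- rollapply(x, width = n, FUN = mean, align = "right", fill = NA)

all.equal(as.numeric(fast), as.numeric(slow))
```

If the two agree, the benchmark is a fair comparison of speed rather than of different calculations.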


runMean microbenchmark results


summary  expr               min           lq            median         uq            max           neval
1        runMean_matrix     7.83479       8.328931      10.187089      15.631568     20.428914     10
2        runMean_df         415.200526    436.36685     463.7532645    504.178149    565.733628    10
3        runMean_xts        19.379056     22.689037     27.419294      31.679356     34.873025     10
4        runMean_rollapply  40530.101527  41679.322495  42675.2747235  43917.396031  47993.732396  10


[Plot: runMean microbenchmark timings]



Rolling | I Miss Fortran



Rolling functions get a lot slower without Fortran and C. However, we can convert with as.matrix or as.numeric to speed things up a little bit. Once we're done calculating, we will need to convert back to xts with a little date logic, since the plain matrix or vector no longer carries the dates.

mb_rollapply <- microbenchmark(
ra_matrix = rollapply( playData.matrix[,1:2], width = 100 , by = 1, FUN="Omega" )

,ra_matrix2 = apply(
playData.matrix[,1:2],
MARGIN=2,
FUN=function(col){ return(rollapply(col,width = 100 , by = 1, FUN="Omega" )) }
)

,ra_xts = rollapply( playData.xts[,1:2], width = 100 , by = 1, FUN="Omega" )

,ra_xts2 = apply(
playData.xts[,1:2],
MARGIN=2,
FUN=function(col){
return(
rollapply(
as.numeric(col)
,width=100
,by=1
,FUN="Omega"
)
)
}
)
,times=2L)
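The "little date logic" for getting back to xts could look something like the sketch below. It is a minimal illustration on a small hypothetical data set (dates, x, width, rolled, and rolled.xts are made-up names), and it uses mean instead of Omega to keep the example light. It also assumes a right-aligned window, so each result belongs to the last date in its window.

```r
library(xts)
library(zoo)

set.seed(42)
dates <- as.Date("2014-01-01") + 0:19  # hypothetical dates
x <- xts(matrix(runif(40, -0.03, 0.03), ncol = 2), order.by = dates)
width <- 5

# roll on the plain matrix to skip the xts overhead;
# without fill, the result has nrow(x) - width + 1 rows
rolled <- rollapply(coredata(x), width = width, FUN = mean, align = "right")

# with right alignment, row i belongs to date index(x)[i + width - 1],
# so drop the first width - 1 dates when rebuilding the xts object
rolled.xts <- xts(rolled, order.by = index(x)[width:nrow(x)])

head(rolled.xts)
```

With a centered or left-aligned window, the slice of index(x) would need to shift accordingly.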


rollapply microbenchmark results

summary  expr        min            lq             median          uq             max            neval
1        ra_matrix   26.001762182   26.001762182   26.8966084855   27.791454789   27.791454789   2
2        ra_matrix2  22.331454712   22.331454712   24.8568809085   27.382307105   27.382307105   2
3        ra_matrix3  25.255073874   25.255073874   26.3310039035   27.406933933   27.406933933   2
4        ra_xts      308.954120943  308.954120943  309.3609896875  309.767858432  309.767858432  2
5        ra_xts2     27.154306816   27.154306816   28.6303421355   30.106377455   30.106377455   2


[Plot: rollapply microbenchmark timings]

2 comments:

  1. Hi, thanks for the great post. Perhaps you could comment on this? http://stackoverflow.com/questions/24647075/computing-a-weighted-rolling-average-r?noredirect=1#comment38206679_24647075

  2. very interesting; thanks, I might just try it out. For now, I will definitely add in the body of the post, so folks are aware.
