Monday, August 1, 2011

Dividend Quartiles with Kenneth French Data

Based on my perception of the last 3 years, I would have expected high dividend stocks to have substantially underperformed low and zero dividend stocks. Fortunately, just as with size and momentum in Beating Kenneth French Small – High, we can explore the dividend quartile datasets from 1927 to see how things look over the long term. Thanks to Kenneth French and R, I can now disprove my perception. Strangely, the biggest discrepancy from the basic linear model is in the high dividend stocks relative to the mid dividend stocks.

For a lot of additional information on high yield stocks, see the CSFB Global 2011 Yearbook written by Elroy Dimson, Paul Marsh, and Mike Staunton.

From TimelyPortfolio

R code:

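#explore dividend data from the
#very helpful Ken French
#for this project we will look at Dividend Portfolios

require(PerformanceAnalytics)
require(quantmod)   #loads xts and TTR (runMax, runMean, RSI)
require(ggplot2)

#set my.url to the dividend portfolio zip file in the Ken French data library
my.url=""
my.tempfile <- tempfile(fileext=".zip")
download.file(my.url, my.tempfile, method="auto",
  quiet = FALSE, mode = "wb",cacheOK = TRUE)
#unzip() returns the path(s) of the extracted file(s)
my.usefile <- unzip(my.tempfile, exdir=tempdir())[1]
#read space delimited text file extracted from zip
french_dividend <- read.table(file=my.usefile,
  header = FALSE, sep = "", as.is = TRUE,
  skip = 20, nrows=1002)[,1:5]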
colnames(french_dividend) <- c("date","Zero","Low","Mid","High")

#get dates ready for xts index
datestoformat <- french_dividend[,1]
datestoformat <- paste(substr(datestoformat,1,4),
  substr(datestoformat,5,7),"01",sep="-")

#get xts for analysis
french_dividend_xts <- as.xts(french_dividend[,2:5],
  order.by=as.Date(datestoformat))

#convert percentage returns to decimals
french_dividend_xts <- french_dividend_xts/100

#jpeg(filename="performance summary of dividend portfolios.jpg",
# quality=100,width=6.25, height = 8, units="in",res=96)
charts.PerformanceSummary(french_dividend_xts,
  main="Performance by Kenneth French Dividend
Monthly Since 1927")
mtext("Source: Kenneth French data library",
  side=1,adj=0,cex=0.75)

#get price series of the dividend portfolios for system building
#by applying the cumprod function on each column
french_dividend_price <- as.xts(apply(french_dividend_xts[,1:4]+1,FUN="cumprod",MARGIN=2))
#if we wanted to compare to the average of all we could use this
#french_all_price <- cumprod(1+apply(coredata(french_dividend_xts[,c(1:4)]),MARGIN=1,FUN=mean))

#jpeg(filename="linear models of relative strength dividend.jpg",
# quality=100,width=6.25, height = 8, units="in",res=96)
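par(mfrow=c(3,1))
for (i in 1:3) {
  #log relative strength of High dividend versus portfolio i
  french_rs <- log(french_dividend_price[,4]/french_dividend_price[,i])
  model <- lm(french_rs[,1]~index(french_rs))
  #relative strength with its fitted linear trend
  #(plot.zoo is an assumed stand-in for the chart call)
  plot.zoo(merge(french_rs,
      xts(as.numeric(fitted(model)),order.by=index(french_rs))),
    plot.type="single",col=c("black","red"),
    main=paste("French High Dividend to ",colnames(french_dividend_xts)[i]," Dividend",sep=""))
}
mtext("Source: Kenneth French data library",
  side=1,adj=0,cex=0.75)

#jpeg(filename="linear residuals of relative strength dividend.jpg",
# quality=100,width=6.25, height = 8, units="in",res=96)
for (i in 1:3) {
  french_rs <- log(french_dividend_price[,4]/french_dividend_price[,i])
  model <- lm(french_rs[,1]~index(french_rs))
  #residuals from the linear trend (plot.zoo again assumed)
  plot.zoo(xts(as.numeric(residuals(model)),order.by=index(french_rs)),
    main=paste("Residuals French High Dividend to ",colnames(french_dividend_xts)[i]," Dividend",sep=""))
}
#another way to chart
# chartSeries(french_rs,theme="white",
# TA="addTA(as.xts(model$fitted.values[,1]),on=1);addTA(runSum(as.xts(model$residuals[,1]),n=24))")

#######################################################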
#for later possibly
#apply earlier charts and systems from momentum to the dividend data

#use the High dividend price series for the systems below
french_high_price <- french_dividend_price[,4]

#speedy solution for ranking from Charles Berry
#rolling rank of price over last nper months (nper means max price for last nper)
#pad first nper-1 with NA
nper <- 10
x.rank <- c(rep(NA,nper-1),rowSums(coredata(french_high_price)[ -(1:(nper-1)) ] >= embed(as.numeric(french_high_price),nper)))
x.rank <- as.xts(x.rank,
  order.by=index(french_high_price))
signalRank <- ifelse(x.rank[,1] > 3 |
  french_high_price >= runMax(french_high_price,2),1,0)
retRank <- lag(signalRank,k=1)*french_dividend_xts[,4]

#try rolling 10 month moving average popularized by Mebane Faber
signalAvg <- ifelse(french_high_price > runMean(french_high_price,n=10),1,0)
retAvg <- lag(signalAvg,k=1)*french_dividend_xts[,4]

#try RSI
signalRSI <- ifelse(RSI(french_high_price,n=4) > 45,1,0)
retRSI <- lag(signalRSI,k=1)*french_dividend_xts[,4]

retCompare <- merge(retRank,retAvg,retRSI,french_dividend_xts[,4])
#names after the first are assumed; only "Small.High.Rank" survives from the original
colnames(retCompare) <- c("Small.High.Rank",
  "Small.High.Avg","Small.High.RSI","High")
#jpeg(filename="performance of small-high with systems.jpg",quality=100,width=6.25, height = 8, units="in",res=96)
charts.PerformanceSummary(retCompare,
  colorset = c("cadetblue","darkolivegreen3","purple","gray70"),
  main="Kenneth French Small Size and High Momentum Stocks
Compared To Various Price Systems"
)

#jpeg(filename="capture of small-high with systems.jpg",quality=100,width=6.25, height = 8, units="in",res=96)
#(chart.CaptureRatios is an assumed reconstruction of the capture chart)
chart.CaptureRatios(retCompare[,1:3],retCompare[,4],
  main="Kenneth French Small Size and High Momentum Stocks
Compared To Various Price Systems"
)

#get average of the dividend portfolios for additional system testing
french_dividend_avg <- cumprod(1+apply(coredata(french_dividend_xts[,c(1:4)]),MARGIN=1,FUN=mean))
french_dividend_avg <- as.xts(french_dividend_avg,
  order.by=index(french_dividend_xts))
nper <- 10
x.rank <- c(rep(NA,nper-1),rowSums(coredata(french_dividend_avg)[ -(1:(nper-1)) ] >= embed(as.numeric(french_dividend_avg),nper)))
x.rank <- as.xts(x.rank,
  order.by=index(french_dividend_avg))
signalRankAvg <- ifelse(x.rank[,1] > 3 |
  french_dividend_avg >= runMax(french_dividend_avg,2),1,0)
retRankAvg <- lag(signalRankAvg,k=1)*french_dividend_xts[,4]

retCompare <- merge(retRank,retRankAvg,french_dividend_xts[,3])
#names after the first are assumed; only "Small.High.Rank" survives from the original
colnames(retCompare) <- c("Small.High.Rank",
  "Small.High.RankAvg","Mid")
#jpeg(filename="performance of small-high with 2 rank systems.jpg",quality=100,width=6.25, height = 8, units="in",res=96)
charts.PerformanceSummary(retCompare,
  colorset = c("cadetblue","darkolivegreen3","gray70"),
  main="Kenneth French Small Size and High Momentum Stocks
Compared To Rank-based Price Systems"
)

chart.QQPlot(french_dividend_xts[,4],
  main = "Normal Distribution", distribution = 'norm', envelope=0.95)
require(MASS)   #fitdistr()
require(sn)     #st.mle()
fit = fitdistr(1+french_dividend_xts[,3], 'lognormal')
chart.QQPlot(1+french_dividend_xts[,3], main = "Log-Normal Distribution", envelope=0.95,
  distribution='lnorm', meanlog = fit$estimate[[1]], sdlog = fit$estimate[[2]])
fit = st.mle(y=french_dividend_xts[,3])
chart.QQPlot(french_dividend_xts[,3], main = "Skew T Distribution", envelope=0.95,
  distribution = 'st', location = fit$dp[[1]], scale = fit$dp[[2]], shape = fit$dp[[3]], df=fit$dp[[4]])

chart.Histogram(french_dividend_xts[,4], methods = c( "add.density", "add.normal") )
chart.Histogram(french_dividend_xts[,1], methods = c( "add.centered", "add.density", "add.rug") )
chart.Histogram(french_dividend_xts[,3], methods = c( "add.centered", "add.density", "add.rug", "add.qqplot") )
chart.Histogram(french_dividend_xts[,3], methods = c("add.density", "add.centered", "add.rug", "add.risk") )

#interesting linearity between momentum
chart.Regression(french_dividend_xts[,4], apply(coredata(french_dividend_xts[,c(1:4)]),MARGIN=1,FUN=mean),
  fit="conditional", main="Conditional Beta")
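
For anyone without the data file handy, the relative strength regression used in the code above can be tried on simulated monthly returns. This is only a rough sketch in base R: the two return series are random draws, not Ken French data, and the names high_ret, mid_ret, and rs are hypothetical.

```r
# simulated monthly returns standing in for two dividend portfolios
# (assumption: random normal draws, NOT the Ken French data)
set.seed(42)
n <- 1002
high_ret <- rnorm(n, mean = 0.010, sd = 0.06)
mid_ret  <- rnorm(n, mean = 0.009, sd = 0.05)

# growth of $1, the same cumprod(1 + returns) idea as the price series
high_price <- cumprod(1 + high_ret)
mid_price  <- cumprod(1 + mid_ret)

# log relative strength and a linear trend through it
rs    <- log(high_price / mid_price)
month <- seq_len(n)
model <- lm(rs ~ month)

# residuals show how far relative strength sits from its long-term trend
resid_rs <- residuals(model)
summary(resid_rs)

plot(month, rs, type = "l", main = "Simulated log relative strength")
abline(model, col = "red")
```

Large, persistent residuals are the kind of "discrepancy from the basic linear model" mentioned at the top of the post.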



  1. I have mentioned before that I am a great fan. I wasn't sure exactly how to email you this request, but I was hoping you could consider the following exercise when you are thinking of examples. I am a slow-learning but passionate R coder:

    Allow me to set the scene. A "fancy" quant outfit presented a black box trading system to me and my business partner. It essentially forecasts the next day's behavior of the SPY using data mining techniques. It goes into the market 5 min before the close and executes the following day's order. Simple!!

    The backtested results are just way too good to be true, and I have told them so. My partner, on the other hand, thinks they are the bomb. They are a group of PhD mathematicians with military intelligence backgrounds, and they demonstrated all these fancy charts to him. My partner is a lawyer and not skilled in quantitative finance.

    Furthermore, as I don't have access to their code, I got them to open a paper trading account with IB so that I can monitor their performance. I have almost 3 months of "live" performance, which I have analyzed, and it doesn't match the look and feel of their backtest. E.g., there is already a larger max drawdown in 3 months of sim trading than over 5 years of backtesting across the rockiest terrain the markets have probably ever seen. So this is another smelly alarm signal.

    What I would love to do in R is simply create a random sequence of daily long / short positions in the market and apply it to the SPY. I guess the thing to do is then aggregate the results of a bunch of random time series to see the average cumulative return, as the PerformanceAnalytics package presents it. My reasoning is that their system is probably no better than a random signal generator, despite their so-called 90 different patterns of analysis.

    Bottom line is I am no expert, but I am trying to show them and my business partner that their returns don't add up. A further, more complicated study would be to take the daily time series of their 5yr track record and try to analyze whether it has any chance of being real. I am happy to share the series.

    Regards and keep up the excellent posting.
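
The random long / short benchmark described in this comment could be sketched roughly as follows in base R. It is a sketch under assumptions: the daily returns are simulated stand-ins for SPY (swap in real SPY daily returns), and n_days, n_sims, the 50/50 coin flip, and backtest_cum_ret are all hypothetical choices.

```r
set.seed(123)
n_days <- 252 * 5      # roughly 5 years of trading days
n_sims <- 1000         # number of random strategies

# stand-in for SPY daily returns (assumption: simulated, replace with real data)
spy_ret <- rnorm(n_days, mean = 0.0003, sd = 0.012)

# each column is one random strategy: +1 = long, -1 = short, by coin flip
signals <- matrix(sample(c(-1, 1), n_days * n_sims, replace = TRUE),
                  nrow = n_days, ncol = n_sims)

# daily strategy returns; spy_ret recycles down each column
strat_ret <- signals * spy_ret

# cumulative return of each random strategy
cum_ret <- apply(1 + strat_ret, 2, prod) - 1

# distribution of outcomes produced by luck alone
mean(cum_ret)
quantile(cum_ret, c(0.05, 0.50, 0.95))

# share of random strategies that match or beat a backtested cumulative return
backtest_cum_ret <- 1.50   # hypothetical: +150% over the period
mean(cum_ret >= backtest_cum_ret)
```

If the backtested figure sits far out in the tail of cum_ret, luck alone is an unlikely explanation, though that by itself says nothing about overfitting.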

  2. Fantastic real-life case study. Thanks for contributing. Pat at PortfolioProbe seems to be one of the experts on random portfolio generation. I have not played with it in a while, but I'll have a go at it. Unfortunately, my first try will probably be in Excel and then I'll get it into R. Is it short and long or long only?

    Congratulations on prudently evaluating and setting up the paper trading account. Your partner probably owes you big.

    5 years is nothing for me, but I know I am old-fashioned.

  3. It is long and short. How do I send you the file of the backtested performance? I can send it in Excel as well as an R file.

    Looking forward to your next post.

    Here is one method that has been suggested to me for comparing the backtested results against a paper trading account to check for similarity.

    ks.test {stats} R Documentation
    Kolmogorov-Smirnov Tests

    Performs one or two sample Kolmogorov-Smirnov tests.

    ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"),
    exact = NULL)

    x: a numeric vector of data values.
    y: either a numeric vector of data values, or a character string naming a distribution function.
    ...: parameters of the distribution specified (as a character string) by y.
    alternative: indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater". You can specify just the initial letter of the value, but the argument name must be given in full. See Details for the meanings of the possible values.
    exact: NULL or a logical indicating whether an exact p-value should be computed. See Details for the meaning of NULL. Not used for the one-sided two-sample case.
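
As a rough illustration of the two-sample form of the test quoted above, the backtested and paper-traded daily returns could be compared like this. The two series here are simulated stand-ins (an assumption), and backtest_ret and paper_ret are hypothetical names.

```r
set.seed(1)
# assumption: simulated stand-ins for the two daily return series
backtest_ret <- rnorm(252 * 5, mean = 0.0010, sd = 0.008)  # ~5 years of backtest
paper_ret    <- rnorm(63,      mean = 0.0000, sd = 0.015)  # ~3 months of paper trading

# two-sample Kolmogorov-Smirnov test:
# could both samples come from the same distribution?
ks <- ks.test(paper_ret, backtest_ret)
ks$statistic   # largest gap between the two empirical CDFs
ks$p.value     # a small p-value is evidence that the distributions differ
```

With only about three months of paper trading the test has limited power, so a large p-value would not prove that the two track records match.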

  4. Sorry I've had a crazy week even outside the market. Send to me at kent.russell [[at]) and I'll see how it looks. Thanks again so much for reading and commenting. I look forward to following you on Twitter.

  5. Using this on a Linux box, you have to change 2 lines: