Wednesday, July 25, 2012

Inspirational Stack Overflow Dendrogram Applied to Currencies

When I saw the answer to this Stack Overflow question, I immediately remembered my old post Clustering with Currencies and Fidelity Funds and just had to try applying the technique.  As I should have guessed, it worked with only minimal changes.  Hoping to improve on it incrementally, I added a couple of slight modifications.

From TimelyPortfolio

R code from GIST (select raw to copy/paste):

require(quantmod)
require(fAssets)
#get currency data (mostly Asian, plus Brazil, Mexico, and two US dollar indexes) from the Federal Reserve FRED data series
getSymbols("DEXKOUS",src="FRED") #load Korea
getSymbols("DEXMAUS",src="FRED") #load Malaysia
getSymbols("DEXSIUS",src="FRED") #load Singapore
getSymbols("DEXTAUS",src="FRED") #load Taiwan
getSymbols("DEXCHUS",src="FRED") #load China
getSymbols("DEXJPUS",src="FRED") #load Japan
getSymbols("DEXTHUS",src="FRED") #load Thailand
getSymbols("DEXBZUS",src="FRED") #load Brazil
getSymbols("DEXMXUS",src="FRED") #load Mexico
getSymbols("DEXINUS",src="FRED") #load India
getSymbols("DTWEXO",src="FRED") #load US Dollar Other Trading Partners
getSymbols("DTWEXB",src="FRED") #load US Dollar Broad
currencies<-merge(DEXKOUS,DEXMAUS,DEXSIUS,DEXTAUS,DEXCHUS,DEXJPUS,DEXTHUS,DEXBZUS,DEXMXUS,DEXINUS,DTWEXO,DTWEXB)
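#keep only the dates on which every series has a value
currencies<-na.omit(currencies)
#convert levels to simple daily returns and drop the first row, which has no prior value
currencies<-na.omit(currencies/lag(currencies)-1)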
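# adapted from the answer at http://stackoverflow.com/questions/9747426/how-can-i-produce-plots-like-this
# Data
k <- NCOL(currencies) # number of series
d <- as.matrix(na.omit(currencies)) # returns as a plain matrix
x <- apply(d+1,2,cumprod) # cumulative growth of one unit for each series
# cluster the return series with fAssets and keep the hclust result
dend <- assetsDendrogramPlot(as.timeSeries(currencies))
r <- dend$hclust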
# Plot
op <- par(mar=c(0,0,0,0),oma=c(0,2,0,0))
# set up plot area for the dendrogram
plot(NA,ylim=c(.5,k+.5), xlim=c(0,4),axes=FALSE)
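# Dendrogram. See ?hclust for details.
xc <- yc <- rep(NA,k) # position of each merged cluster, filled in by the loop below
o <- 1:k
o[r$order] <- 1:k # o[j] is the row in which series j is drawn
#separate into 4 groups for color classification
groups <- cutree(r, k=4)[r$order]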
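# loop through each merge to generate the dendrogram
# go from innermost to outermost
cg <- rep(NA,k-1) # group of the cluster formed at each merge (NA if it spans groups)
for(i in 1:(k-1)) {
  # first branch: a single series (negative index) or an earlier cluster (positive index)
  a <- r$merge[i,1]
  x1 <- if( a<0 ) o[-a] else xc[a]
  y1 <- if( a<0 ) 0 else yc[a]
  ga <- if( a<0 ) groups[colnames(d)[-a]] else cg[a]
  # second branch
  b <- r$merge[i,2]
  x2 <- if( b<0 ) o[-b] else xc[b]
  y2 <- if( b<0 ) 0 else yc[b]
  gb <- if( b<0 ) groups[colnames(d)[-b]] else cg[b]
  # the merged cluster keeps a group color only if both branches share it
  cg[i] <- if( !is.na(ga) && !is.na(gb) && ga==gb ) ga else NA
  #do the lines for the dendrogram; merges that span groups are drawn in grey
  lines(
    3+c(y1,i,i,y2)/k,
    c(x1,x1,x2,x2),
    lwd=k-i,
    col=if( is.na(cg[i]) ) "grey50" else cg[i]
  )
  xc[i] <- (x1+x2)/2
  yc[i] <- i
}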
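# Time series
axis(2,1:k,colnames(d)[r$order],las=0, cex.axis=0.6, line=-1, lwd=0, lwd.ticks=1)
u <- par("usr") # user-coordinate limits of the main plot: x1, x2, y1, y2
for(i in 1:k) {
  # rectangle for row i in user coordinates: x from 0 to 3, y from i-.5 to i+.5
  f <- c(0,3,i-.5,i+.5)
  # rescale the rectangle to 0-1 fractions for par(fig=)
  f <- c(
    (f[1]-u[1])/(u[2]-u[1]),
    (f[2]-u[1])/(u[2]-u[1]),
    (f[3]-u[3])/(u[4]-u[3]),
    (f[4]-u[3])/(u[4]-u[3])
  )
  # draw the cumulative growth of series i in its own small panel
  par(new=TRUE,fig=f)
  plot(x[,r$order[i]],axes=FALSE,xlab="",ylab="",main="",type="l",col=groups[i],lwd=2)
  box()
}
par(op)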
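
As a side note on the data preparation, the script turns each series into simple returns and then rebuilds a growth path with cumprod for the small panels. A minimal sketch with made-up numbers (the prices vector is hypothetical, not FRED data) suggests that the rebuilt path is just the original series rebased to its first value:

```r
# Hypothetical price levels, for illustration only (not FRED data)
prices <- c(100, 102, 99.96, 104.958)

# Simple returns, the plain-vector analogue of currencies/lag(currencies)-1
rets <- prices[-1] / prices[-length(prices)] - 1

# Cumulative growth of one unit, as in apply(d+1,2,cumprod)
growth <- cumprod(1 + rets)

# The same path obtained by rebasing the levels to the first observation
rebased <- prices[-1] / prices[1]

# Compare the two paths up to floating-point tolerance
print(all.equal(growth, rebased))
```

So the panels show rebased levels, while the dendrogram is built from the return series passed to assetsDendrogramPlot. Two series can look alike in levels and still end up in different clusters.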
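
The dendrogram loop in the script leans on how hclust encodes its result, so a small toy example may help when reading it. This is only a sketch on four made-up points (pts and hc are names chosen for illustration), using base R's dist, hclust, and cutree:

```r
# Toy data: four points on a line that form two obvious pairs
pts <- matrix(c(0, 1, 10, 12), ncol = 1,
              dimnames = list(c("a", "b", "c", "d"), NULL))
hc <- hclust(dist(pts), method = "complete")

# Each row of hc$merge is one merge step:
#   a negative entry is an original observation (-1 is row 1, "a"),
#   a positive entry refers to the cluster formed at that earlier step.
print(hc$merge)

# hc$order is the leaf order used when the tree is drawn
print(rownames(pts)[hc$order])

# Cutting the tree into two groups recovers the two pairs
grp <- cutree(hc, k = 2)
print(grp)
```

In the script, the tests a<0 and b<0 distinguish these two cases: a negative index is looked up through the leaf positions in o, while a positive index reuses the coordinates stored in xc and yc at that earlier step.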

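The small panels are placed by rescaling a rectangle given in the main plot's user coordinates to 0-1 fractions before handing it to par(fig=). A rough sketch of that arithmetic, with a hypothetical helper usr_to_frac and assumed limits in place of the values the script reads from par()$usr:

```r
# Hypothetical helper: map a rectangle c(x1, x2, y1, y2) in user coordinates
# to fractions of the plotting range described by usr = c(x1, x2, y1, y2)
usr_to_frac <- function(rect, usr) {
  c((rect[1:2] - usr[1]) / (usr[2] - usr[1]),
    (rect[3:4] - usr[3]) / (usr[4] - usr[3]))
}

# Assumed limits for illustration; in the script they come from par()$usr
usr <- c(0, 4, 0.5, 12.5)

# Bottom row (i = 1) of a 12-row layout: x from 0 to 3, y from 0.5 to 1.5
print(usr_to_frac(c(0, 3, 0.5, 1.5), usr))
```

With R's default axis padding the real limits are slightly wider than the xlim and ylim passed to plot, which is why the script reads them back from par()$usr instead of reusing those values.
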
1 comment:

  1. Interesting post. Thank you. I played a bit with your code, but one thing puzzles me: this method aims at clustering similar series, yet looking at the results I couldn't find any similarities. For example, why is the China series similar to the US one when India looks much closer visually? Or why are Singapore and Taiwan not in one group?

    It seems to me that this is a drawback of the method itself, not visualization.
