R - cluster analysis on binary weblog data
I have web data that looks similar to the sample below. It has a user and a binary value for whether the user clicked on a particular link within the website. I wanted to do some clustering of this data. My main goal is to find similar users based on their online behaviour. Which clustering algorithm would suit this? I have tried k-means, but it does not work well with binary data. I have also tried spherical k-means with skmeans(). I wanted a sum of squared errors (SSE) scree plot, but I could not figure out how to get the SSE from skmeans.
user  link1  link2  link3  link4
abc1  0      1      1      1
abc2  1      0      1      0
abc3  0      1      1      1
abc4  1      0      1      0
You could try hierarchical clustering using a binary distance measure such as Jaccard, if "clicked a link" is asymmetric (that is, a shared click says more about similarity than a shared non-click):
dat <- read.table(header = TRUE, row.names = 1, text = "
user link1 link2 link3 link4
abc1 0 1 1 1
abc2 1 0 1 0
abc3 0 1 1 1
abc4 1 0 1 0")
d <- dist(dat, method = "binary")  # Jaccard-type distance on 0/1 rows
hc <- hclust(d)                    # complete linkage by default
plot(hc)                           # dendrogram
(clusters <- cutree(hc, k = 2))    # cut the tree into 2 clusters
# abc1 abc2 abc3 abc4
#    1    2    1    2
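To make the asymmetry concrete, here is a small sketch that computes the Jaccard distance between two users by hand and compares it with what dist(method = "binary") returns for the same pair. The helper jaccard_dist is a hypothetical name introduced only for this illustration. Returning 0 for two users with no clicks at all is a choice made for this sketch, not something taken from the answer.

```r
dat <- read.table(header = TRUE, row.names = 1, text = "
user link1 link2 link3 link4
abc1 0 1 1 1
abc2 1 0 1 0
abc3 0 1 1 1
abc4 1 0 1 0")

# Hypothetical helper: Jaccard distance between two 0/1 vectors.
# Links that neither user clicked are ignored entirely.
jaccard_dist <- function(x, y) {
  either <- sum(x == 1 | y == 1)  # links clicked by at least one user
  both   <- sum(x == 1 & y == 1)  # links clicked by both users
  if (either == 0) return(0)      # assumption: two click-free users count as identical
  1 - both / either
}

x <- unlist(dat["abc1", ])
y <- unlist(dat["abc2", ])

by_hand   <- jaccard_dist(x, y)
from_dist <- as.matrix(dist(dat, method = "binary"))["abc1", "abc2"]

print(by_hand)
print(from_dist)
```

For abc1 and abc2, four links were clicked by at least one of them and only link3 by both, so both values should come out as 1 - 1/4 = 0.75. A symmetric measure such as simple matching would also count links that neither user clicked as agreement, which is usually not what you want for sparse click data.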