VLFeat includes a basic implementation of k-means clustering and hierarchical k-means clustering. They are designed to be lightweight in order to work on large datasets. In particular, they assume that the data are vectors of unsigned chars (one byte per component). While this is limiting for some applications, it works well for clustering image descriptors, where very high precision is usually unnecessary.
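Floating-point data therefore have to be mapped to the range [0,255] before clustering. The following is a minimal sketch of one way to do this, not a VLFeat routine: the matrix X is hypothetical (one vector per column), and rescaling by the global minimum and maximum is an assumption made for illustration. It also assumes the data are not constant, so that the range is nonzero.

```matlab
% Hypothetical floating-point data, one vector per column.
X = [0.0 0.5 1.0 ; -2.0 0.0 2.0] ;

% Map the global range [lo, hi] linearly onto [0, 255].
lo = min(X(:)) ;
hi = max(X(:)) ;

% uint8() rounds to the nearest integer and saturates at 0 and 255.
data = uint8(255 * (X - lo) / (hi - lo)) ;
```

A per-dimension rescaling may be preferable when the components have very different ranges.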
Integer k-means (IKM) is run by the command ikmeans. To demonstrate the usage of this command, we sample 10000 random points in the $[0,255]^2$ integer square and run ikmeans to obtain K=3 clusters:
K = 3 ;
data = uint8(rand(2,10000) * 255) ;
[C,A] = ikmeans(data,K) ;
The program returns both the cluster centers C and the data-to-cluster assignments A. Using the cluster centers C, we can project more data onto the same clusters:
datat = uint8(rand(2,100000) * 255) ;
AT = ikmeanspush(datat,C) ;
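For intuition about what clustering and projection compute, here is a minimal sketch of the standard Lloyd k-means iteration followed by a nearest-center assignment of new points, written in plain MATLAB. It is not VLFeat's implementation and does not call ikmeans or ikmeanspush. The toy data, the initialization from two data points, the fixed iteration count, the use of double-precision arithmetic, and the helper name sqdist are all assumptions made for illustration.

```matlab
% Squared Euclidean distances between centers (columns of C) and
% points (columns of X), returned as a K x N matrix.
sqdist = @(X, C) bsxfun(@plus, sum(C.^2, 1)', sum(X.^2, 1)) - 2 * C' * X ;

% Hypothetical toy data, converted to double for the arithmetic.
X = double(uint8([0 1 2 250 251 252 ; 0 1 2 250 251 252])) ;
K = 2 ;
C = X(:, [1 4]) ;          % initialize the centers from two data points

for iter = 1:10
  % Assignment step: index of the nearest center for each point.
  [dmin, A] = min(sqdist(X, C), [], 1) ;
  % Update step: move each center to the mean of its assigned points.
  for k = 1:K
    if any(A == k)
      C(:, k) = mean(X(:, A == k), 2) ;
    end
  end
end

% Projection: assign new points to the fixed centers, which are not updated.
Xt = double(uint8([3 240 100 ; 5 255 90])) ;
[dmin, AT] = min(sqdist(Xt, C), [], 1) ;
```

In this sketch, projection is just the assignment step applied to new data with the centers held fixed, which is why it needs only C and not the original training points.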
To visualize the results, we assign a color to each cluster and plot the points:
cl = get(gca,'ColorOrder') ;
ncl = size(cl,1) ;
hold on ; % keep earlier plots when drawing the next cluster
for k=1:K
  sel  = find(A  == k) ;
  selt = find(AT == k) ;
  plot(data(1,sel),  data(2,sel),  '.',...
       'Color',cl(mod(k,ncl)+1,:)) ;
  plot(datat(1,selt),datat(2,selt),'+',...
       'Color',cl(mod(k,ncl)+1,:)) ;
end