hclust
Agglomerative Hierarchical Clustering. This method builds a hierarchy of clusters bottom-up: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. The results of hierarchical clustering are usually presented in a dendrogram.
In general, the merges are determined in a greedy manner. To decide which clusters should be combined, a measure of dissimilarity between sets of observations is required. In most methods of hierarchical clustering, this is achieved with an appropriate metric and a linkage criterion, which specifies the dissimilarity of sets as a function of the pairwise distances of observations in the sets.
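The greedy merge loop described above can be sketched in a few lines of Python. This is a naive illustration, not the implementation behind `hclust`: the function names `linkage_distance` and `agglomerate` and the linkage label "average" are assumptions made for this example, and the quadratic search per merge is far slower than the dynamic closest-pair approach in the cited reference.

```python
from itertools import combinations


def linkage_distance(a, b, dist, linkage="single"):
    """Dissimilarity between clusters a and b (lists of observation indices),
    computed only from the pairwise distances in the matrix `dist`."""
    pairwise = [dist[i][j] for i in a for j in b]
    if linkage == "single":      # closest pair of members
        return min(pairwise)
    if linkage == "complete":    # farthest pair of members
        return max(pairwise)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(dist, linkage="single"):
    """Greedy bottom-up clustering; returns the merges as (id_a, id_b, height).

    Observations get ids 0..n-1; each merged cluster gets the next free id.
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    next_id = len(dist)
    while len(clusters) > 1:
        # Greedy step: pick the pair of clusters with the smallest dissimilarity.
        (a, b), height = min(
            (((a, b), linkage_distance(clusters[a], clusters[b], dist, linkage))
             for a, b in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        merges.append((a, b, height))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges


# Four points on a line; the distance is the absolute difference.
points = [0, 1, 5, 6]
dist = [[abs(p - q) for q in points] for p in points]
for merge in agglomerate(dist, "complete"):
    print(merge)
```

The list of merges with their heights is exactly the information a dendrogram draws: each tuple is one join, and the height is where the two branches meet.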
Hierarchical clustering has the distinct advantage that any valid measure of distance can be used. In fact, the observations themselves are not required: all that is used is a matrix of distances.
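To illustrate that only a matrix of distances is needed, the following sketch clusters strings, which have no coordinates at all, using edit distance. It is a minimal example under stated assumptions: `levenshtein`, `distance_matrix`, and `single_linkage_merges` are hypothetical helpers written for this illustration, and the clustering routine never looks at the strings themselves.

```python
def levenshtein(s, t):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


def distance_matrix(items, distance):
    """Full pairwise distance matrix for arbitrary objects."""
    return [[distance(a, b) for b in items] for a in items]


def single_linkage_merges(matrix):
    """Single-linkage merges computed from the distance matrix alone."""
    clusters = [[i] for i in range(len(matrix))]
    merges = []
    while len(clusters) > 1:
        # Smallest cross-cluster distance over all pairs of current clusters.
        height, a, b = min(
            (min(matrix[i][j] for i in clusters[a] for j in clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merges.append((sorted(clusters[a]), sorted(clusters[b]), height))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


words = ["kitten", "sitten", "sitting", "apple"]
matrix = distance_matrix(words, levenshtein)
for left, right, height in single_linkage_merges(matrix):
    print([words[i] for i in left], [words[i] for i in right], height)
```

Swapping `levenshtein` for any other valid distance changes the matrix but not the clustering code.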
References
David Eppstein. Fast hierarchical clustering and other applications of dynamic closest pairs. SODA 1998.
Parameters
The data set.
The agglomeration method used to merge clusters. This should be one of "single", "complete", "upgma", "upgmc", "wpgma", "wpgmc", or "ward".
Parameters
The data set.
The distance/dissimilarity measure.
The agglomeration method used to merge clusters. This should be one of "single", "complete", "upgma", "upgmc", "wpgma", "wpgmc", or "ward".
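For orientation, the seven method names correspond to well-known linkage criteria that can all be written as a Lance-Williams update: after clusters i and j merge, the dissimilarity to any other cluster k is a weighted combination of the three old dissimilarities. The sketch below uses the textbook coefficients; it is an assumption that this matches what `hclust` does internally, and the function name `lance_williams` is invented for this example. The centroid ("upgmc"), median ("wpgmc"), and Ward forms are conventionally applied to squared Euclidean distances.

```python
def lance_williams(method, d_ik, d_jk, d_ij, n_i=1, n_j=1, n_k=1):
    """Dissimilarity between the merged cluster (i u j) and cluster k.

    d_ik, d_jk, d_ij are the dissimilarities before the merge;
    n_i, n_j, n_k are the cluster sizes.
    """
    alpha_i = alpha_j = 0.5
    beta = gamma = 0.0
    if method == "single":        # nearest neighbour
        gamma = -0.5
    elif method == "complete":    # farthest neighbour
        gamma = 0.5
    elif method == "upgma":       # unweighted average (size-weighted mean)
        alpha_i = n_i / (n_i + n_j)
        alpha_j = n_j / (n_i + n_j)
    elif method == "wpgma":       # weighted average (plain mean of the two)
        pass
    elif method == "upgmc":       # centroid, on squared distances
        alpha_i = n_i / (n_i + n_j)
        alpha_j = n_j / (n_i + n_j)
        beta = -n_i * n_j / (n_i + n_j) ** 2
    elif method == "wpgmc":       # median, on squared distances
        beta = -0.25
    elif method == "ward":        # minimum variance, on squared distances
        total = n_i + n_j + n_k
        alpha_i = (n_i + n_k) / total
        alpha_j = (n_j + n_k) / total
        beta = -n_k / total
    else:
        raise ValueError(f"unknown method: {method}")
    return alpha_i * d_ik + alpha_j * d_jk + beta * d_ij + gamma * abs(d_ik - d_jk)


# Points at 0 (i), 2 (j) and 5 (k) on a line, as squared distances.
for name in ("single", "complete", "upgma", "wpgma", "upgmc", "wpgmc", "ward"):
    print(name, lance_williams(name, d_ik=25, d_jk=9, d_ij=4))
```

With singleton clusters at 0 and 2, the centroid of the merged cluster is 1, so the "upgmc" result equals the squared distance from 1 to 5.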