Group Clustering Using Inter-Group Dissimilarities
Debessay Fesehaye Kassa, Lenin Singaravelu, Chien-Chia Chen, Xiaobo Huang, Amitabha Banerjee, Ruijin Zhou and Rajesh Somasundaran
VMware, Google, VMware, VMware, VMware, VMware, VMware

Various systems have natural groupings. For instance, a large-scale distributed system can contain groups of virtual and/or physical devices. A system can also have groups of time series datasets collected at different time intervals. Such groups are usually characterized by a set of multidimensional metrics (features). Clustering such groups using their multidimensional datasets has various applications, such as identifying different performance levels for anomaly detection and load balancing. Traditional algorithms focus on clustering a single time series dataset, not on groups with multidimensional metrics datasets. In this paper we present the design, implementation and analysis of two sets of group clustering algorithms. The first set is called one-to-many, as it generates clusters of groups by comparing each group against all other groups. The second set is called pairwise, as it generates the clusters of groups using a pairwise group dissimilarity matrix. Both sets of algorithms first generate group dissimilarity weights using metric ranking algorithms. We implemented the group clustering algorithms by extending a well-known machine learning package and using a front-end visualizer. We validated the clustering algorithms using real-world datasets on the VMware vSAN product. Experimental results show that 7 of the 8 proposed algorithms generate the expected clusters in at least 4 of the 6 detailed experiments. Of these, 3 algorithms generate the expected clusters in 5 of the 6 experiments, and one of the pairwise algorithms generates the expected clusters in all 6 experiments.
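To make the pairwise idea concrete, the following is a minimal sketch and not the algorithms evaluated in this paper. It assumes each group is summarized by the mean of each metric, that dissimilarity is a weighted mean absolute difference between those summaries, and that groups are merged by a simple single-linkage threshold. The group names, the two metrics, the equal weights, and the threshold are all hypothetical; in the paper the weights come from metric ranking algorithms, which are not reproduced here.

```python
from statistics import mean


def group_profile(rows):
    """Summarize a group by the mean of each metric (column). Assumed summary."""
    return [mean(col) for col in zip(*rows)]


def dissimilarity(a, b, weights):
    """Weighted mean absolute difference between two group profiles (assumed measure)."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b)) / sum(weights)


def pairwise_matrix(groups, weights):
    """Build the symmetric group-by-group dissimilarity matrix."""
    names = sorted(groups)
    profiles = {n: group_profile(groups[n]) for n in names}
    matrix = [
        [dissimilarity(profiles[p], profiles[q], weights) for q in names]
        for p in names
    ]
    return names, matrix


def cluster(names, matrix, threshold):
    """Single-linkage clustering: merge groups whose dissimilarity is below threshold."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] < threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return sorted(clusters.values())


# Hypothetical groups: each row is one sample of two metrics.
groups = {
    "host-a": [[10.0, 0.5], [12.0, 0.7]],
    "host-b": [[11.0, 0.6], [11.0, 0.6]],
    "host-c": [[40.0, 3.0], [42.0, 3.2]],
}
weights = [1.0, 1.0]  # placeholder for weights produced by a metric ranking step

names, matrix = pairwise_matrix(groups, weights)
result = cluster(names, matrix, threshold=5.0)
print(result)
```

With these made-up values, host-a and host-b have nearly identical profiles and fall into one cluster, while host-c is far from both and forms its own. A one-to-many variant would instead compare each group against a summary of all other groups rather than filling the full matrix.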