The volume of existing graph-structured
data requires improved parallel tools
and algorithms. Finding communities, smaller
subgraphs more densely connected within the subgraph
than to the rest of the graph, plays a role both
in developing new parallel algorithms and in
opening smaller portions of the data to current
analysis tools. We improve the performance of our
parallel community detection algorithm by 20% on
the massively multithreaded Cray XMT, evaluate
its performance on the next-generation Cray XMT2,
and extend its reach to Intel-based platforms with
OpenMP. To our knowledge, this is not only the first
massively parallel community detection algorithm
but also the only such algorithm that achieves
excellent performance and good parallel scalability
across all these platforms. Our implementation
analyzes a moderate-sized graph with 105 million
vertices and 3.3 billion edges in around 500 seconds
on a four-processor, 80-logical-core Intel-based
system and 1100 seconds on a 64-processor Cray
XMT2.