The Wharton School, Department of Statistics
University of Pennsylvania
Optimal Rates for Community Estimation in the Weighted Stochastic Block Model
Community identification in a network is an important problem in fields such as social science, neuroscience, and genetics. Over the past decade, stochastic block models (SBMs) have emerged as a popular statistical framework for this problem. However, SBMs have an important limitation in that they are suited only for networks with unweighted edges; disregarding the edge weights may result in a loss of valuable information in various scientific applications. We propose a weighted generalization of the SBM where observations comprise of a weighted adjacency matrix where the edges are generated from a SBM and where the weight of each edge is generated independently from one of two unknown probability densities depending on whether the edge is within-community or between-community. We characterize the optimal rate of mis-clustering error of the weighted SBM in terms of the Renyi divergence order 1/2 between the edge weight distributions of within-community and between-community edges, substantially generalizing existing results for unweighted SBMs. Furthermore, we present a computationally tractable algorithm that is adaptive to the unknown edge weight probability distributions in the sense that it achieves the same optimal error rate as if it had perfect knowledge of the probability distributions of the edge weights.
Friday, 1/26/2018, BLOC 113, 11:30 AM