Optimizing Source Selection in Social Sensing in the Presence of Influence Graphs
Huajie Shao, Shiguang Wang, Shen Li, Shuochao Yao, Yiran Zhao, Md Tanvir Al Amin, Tarek Abdelzaher and Lance Kaplan

This paper addresses the problem of choosing the right sources to solicit data from in sensing applications involving broadcast channels, such as those crowdsensing applications where sources share their observations on social media. The goal is to select sources such that expected fusion error is minimized. We assume that soliciting data from a source incurs a cost and that the cost budget is limited. Contrary to other formulations of this problem, we focus on the case where some sources influence others. Hence, asking a source to make a claim affects the behavior of other sources as well, according to an influence model. The paper makes two contributions. First, we develop an analytic model for estimating expected fusion error, given a particular influence graph and solution to the source selection problem. Second, we use that model to search for a solution that minimizes expected fusion error, formulating it as a zero-one integer nonlinear programming (INLP) problem. To scale the approach, the paper further proposes a novel reliability-based pruning heuristic (RPH) and a similarity-based lossy estimation (SLE) algorithm that significantly reduce the complexity of the INLP algorithm at the cost of a modest approximation. The analytically computed expected fusion error is validated using both simulations and realworld data from Twitter, demonstrating a good match between analytic predictions and empirical measurements. It is also shown that our method outperforms baselines in terms of resulting fusion error.