We introduce a weighted and unconstrained variant of the well-known minimum k union problem: Given a bipartite graph (U,V, E) with weights for all nodes in V, find a set S ⊆ V such that the ratio between the total weight of the nodes in S and the number of their distinct incident nodes in U is maximized. Our problem, which we term Heavy Nodes in a Small Neighborhood (HNSN), finds applications in marketing, team formation, and money laundering detection. For example, in the latter application, S represents bank account holders who obtain illicit money from some peers of a criminal and route it through their accounts to a target account belonging to the criminal. We prove that HNSN can be solved exactly in polynomial time via linear programming. As the size of can be very large in practice, we also develop a near linear-time greedy heuristic. In addition, we formalize a money laundering scenario involving multiple target accounts and show how our algorithms can be extended to deal with it. Our experiments on real and synthetic datasets show that our algorithms find optimal or near-optimal solutions, outperforming a natural baseline, and that they can detect money laundering much more effectively and efficiently than a state-of-the-art method.

, , ,
SIAM International Conference on Data Mining, SDM 2023
Networks and Optimization

Chen, H., Loukides, G., Gwadera, R., & Pissis, S. (2023). Heavy Nodes in a Small Neighborhood: Algorithms and applications. In Proceedings of the 2023 SIAM International Conference on Data Mining (SDM) (pp. 307–315). doi:10.1137/1.9781611977653.ch35