2024-03-05

# Ego-Network segmentation via (Weighted) Jaccard Median

## Publication

### Publication

An ego-network is a graph representing the interactions of a node (<italic>ego</italic>) with its neighbors and the interactions among those neighbors. A sequence of ego-networks having the same ego can thus model the evolution of these interactions over time. We introduce the problem of segmenting a sequence of ego-networks into <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> segments, for any given integer <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula>. Each segment is represented by a summary network, and the goal is to minimize the total loss of representing <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> segments by <inline-formula><tex-math notation="LaTeX">$k$</tex-math></inline-formula> summaries. The problem allows partitioning the sequence into homogeneous segments with respect to the activities or properties of the ego (e.g., to identify time periods when a user acquired different circles of friends in a social network) and to compactly represent each segment with a summary. The main challenge is to construct a summary that represents a collection of ego-networks with minimum loss. To address this challenge, we employ Jaccard Median (JM), a well-known NP-hard problem for summarizing sets, for which, however, no effective and efficient algorithms are known. We develop a series of algorithms for JM offering different effectiveness/efficiency trade-offs: (I) an exact exponential-time algorithm, based on Mixed Integer Linear Programming; (II) exact and approximation polynomial-time algorithms for minimizing an upper bound of the objective function of JM; and (III) efficient heuristics for JM, which are based on an effective scoring scheme and one of them also on sketching. We also study a generalization of the segmentation problem, in which there may be multiple edges between a pair of nodes in an ego-network. To tackle this problem, we develop a series of algorithms, based on a more general problem than JM, called Weighted Jaccard Median WJM: (I) an exact exponential-time algorithm, based on Mixed Integer Linear Programming; (II) exact algorithms for minimizing an upper bound of the objective function of WJM; and (III) efficient heuristics, based on the percentiles of edge multiplicities and one of them also on divide-and-conquer. By building upon the above results, we design algorithms for segmenting a sequence of ego-networks. Experiments with 10 real datasets and with synthetic datasets show that our algorithms produce optimal or near-optimal solutions to JM or to WJM, and that they substantially outperform state-of-the-art methods which can be employed for ego-network segmentation.

Additional Metadata | |
---|---|

, , , , , , , , , | |

doi.org/10.1109/TKDE.2024.3373712 | |

IEEE Transactions on Knowledge and Data Engineering | |

Organisation | Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands |

Zhong, H., Loukides, G., Conte, A., & Pissis, S. (2024). Ego-Network segmentation via (Weighted) Jaccard Median. IEEE Transactions on Knowledge and Data Engineering. doi:10.1109/TKDE.2024.3373712 |