Conversations whose topics are only locally contextual often produce incoherent results with standard topic modeling methods. Splitting a conversation into its individual utterances avoids this problem; however, the increased data sparsity requires different methods. Baseline bag-of-words topic modeling methods for regular and short text, as well as topic modeling methods using transformer-based sentence embeddings, were implemented. These models were evaluated on topic coherence and word embedding similarity. Each method was trained on single utterances, on segments of the conversation, and on the full conversation. The results showed that utterance-level and segment-level data combined with sentence embedding methods perform better than non-sentence-embedding methods or conversation-level data. Among the sentence embedding methods, clustering using HDBSCAN showed the best performance. We suspect that its ability to ignore noisy utterances explains the better topic coherence and the relatively large improvement in topic word similarity.

Telematics and Informatics Reports
Cooperation agreement Stichting 113 Zelfmoordpreventie / CWI
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Salmi, S., van der Mei, R., Mérelle, S., & Bhulai, S. (2024). Topic modeling for conversations for mental health helplines with utterance embedding. Telematics and Informatics Reports, 13, 100126:1–100126:7. doi:10.1016/j.teler.2024.100126