Information extraction from free text for aiding transdiagnostic psychiatry: constructing NLP pipelines tailored to clinicians’ needs

Turner, Rosanne; Coenen, Femke; Roelofs, Femke; Hagoort, Karin; Härmä, Aki; Grünwald, Peter; Velders, Fleur; Scheepers, Floortje

doi:10.1186/s12888-022-04058-z

R.J. Turner (Rosanne), F. Coenen (Femke), F. Roelofs (Femke), K. Hagoort (Karin), A. Härmä (Aki), P.D. Grünwald (Peter), F.P. Velders (Fleur) and F.E. Scheepers (Floortje)

2022-06-17

BMC Psychiatry, Volume 22, p. 407:1-407:11

Background: Developing predictive models for precision psychiatry is challenging because the necessary data are unavailable: extracting useful information from existing electronic health record (EHR) data is not straightforward, and available clinical trial datasets are often not representative of heterogeneous patient groups. The aim of this study was to construct a natural language processing (NLP) pipeline that extracts variables for building predictive models from EHRs. We specifically tailor the pipeline for extracting information on outcomes of psychiatry treatment trajectories, applicable throughout the entire spectrum of mental health disorders (“transdiagnostic”).

Methods: A qualitative study into the beliefs of clinical staff on measuring treatment outcomes was conducted to construct a candidate list of variables to extract from the EHR. To investigate whether the proposed variables are suitable for measuring treatment effects, the resulting themes were compared to transdiagnostic outcome measures currently used in psychiatry research, and these measures were compared to the Hamilton Depression Rating Scale (HDRS, as a gold standard) through systematic review, resulting in an ideal set of variables. To extract these from EHR data, a semi-rule-based NLP pipeline was constructed and tailored to the candidate variables using Prodigy. Classification accuracy and F1-scores were calculated, and pipeline output was compared to HDRS scores using clinical notes from patients admitted in 2019 and 2020.

Results: Analysis of 34 questionnaires answered by clinical staff resulted in four themes defining treatment outcomes: symptom reduction, general well-being, social functioning and personalization. Systematic review revealed 242 different transdiagnostic outcome measures, with the 36-item Short-Form Survey for quality of life (SF36) being used most consistently, showing substantial overlap with the themes from the qualitative study. Comparing SF36 to HDRS scores in 26 studies revealed moderate to good correlations (0.62–0.79) and good positive predictive values (0.75–0.88). The NLP pipeline developed with notes from 22,170 patients reached an accuracy of 95 to 99 percent (F1-scores: 0.38–0.86) on detecting these themes, evaluated on data from 361 patients.

Conclusions: The NLP pipeline developed in this study extracts outcome measures from the EHR that cater specifically to the needs of clinical staff and align with outcome measures used to detect treatment effects in clinical trials.
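
The abstract reports accuracy and F1-scores side by side, and the two can diverge sharply when a theme is rarely mentioned in the notes. The sketch below is illustrative only: the labels are invented toy data, not results from the study, and the helper names (confusion_counts, accuracy, f1_score) are hypothetical and not taken from the psynlp package. It shows how a detector can be right on almost every sentence yet still score a low F1 on a rare theme.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = theme present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all sentences that were labelled correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; ignores true negatives."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Invented toy data: 100 sentences, of which only 4 mention the theme.
y_true = [1] * 4 + [0] * 96
# Hypothetical detector: finds 1 of the 4 mentions and raises 1 false alarm.
y_pred = [1, 0, 0, 0] + [1] + [0] * 95

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"F1       = {f1_score(y_true, y_pred):.2f}")
```

Because true negatives dominate the toy data, accuracy stays high while F1, which only looks at the positive class, is low. This is one plausible reading of why a pipeline can combine 95 to 99 percent accuracy with F1-scores as low as 0.38; the paper itself should be consulted for the actual class distributions.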

Additional Metadata
Keywords	Transdiagnostic psychiatry, Natural language processing, Machine learning, Depression, Hamilton, SF-36
Stakeholder	Philips Research, Eindhoven, Netherlands
Persistent URL	doi.org/10.1186/s12888-022-04058-z
Journal	BMC Psychiatry
Project	Enabling Personalized Interventions
Grant	This work was funded by The Netherlands Organisation for Scientific Research (NWO); grant id nwo/628.011.028 - Enabling Personalized Interventions
Organisation	Machine Learning
Citation	Turner, R., Coenen, F., Roelofs, F., Hagoort, K., Härmä, A., Grünwald, P., … Scheepers, F. (2022). Information extraction from free text for aiding transdiagnostic psychiatry: constructing NLP pipelines tailored to clinicians’ needs. BMC Psychiatry, 22, 407:1–407:11. doi:10.1186/s12888-022-04058-z

See Also
software|data	psynlp --- fork of original package adapted for detecting transdiagnostic outcome measures in clinical psychiatry notes, R.J. Turner (Rosanne)
