Identifying Reddit Users at a High Risk of Suicide and Their Linguistic Features During the COVID-19 Pandemic: Growth-Based Trajectory Model

Yifei Yan, Jun Li, Xingyun Liu, Qing Li, Nancy Xiaonan Yu*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

Abstract

Background: Suicide has emerged as a critical public health concern during the COVID-19 pandemic. With social distancing measures in place, social media has become a significant platform for individuals expressing suicidal thoughts and behaviors. However, existing studies on suicide using social media data often overlook the diversity among users and the temporal dynamics of suicide risk.

Objective: By examining the variations in post volume trajectories among users on the r/SuicideWatch subreddit during the COVID-19 pandemic, this study aims to investigate the heterogeneous patterns of change in suicide risk to help identify social media users at high risk of suicide. We also characterized their linguistic features before and during the pandemic.

Methods: We collected and analyzed post data every 6 months from March 2019 to August 2022 for users on the r/SuicideWatch subreddit (N=6163). A growth-based trajectory model was then used to investigate the trajectories of post volume to identify patterns of change in suicide risk during the pandemic. Trends in linguistic features within posts were also charted and compared, and linguistic markers were identified across the trajectory groups using regression analysis.
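The grouping step described in the Methods can be illustrated with a simplified stand-in. The sketch below is not the authors' growth-based trajectory model. It fits a quadratic growth curve to each user's post counts and then splits the fitted curves into 2 groups with plain k-means. The counts, the 7-window layout, the quadratic degree, and the function names `fitted_trajectories` and `two_means` are all assumptions made for illustration.

```python
import numpy as np


def fitted_trajectories(counts: np.ndarray, degree: int = 2) -> np.ndarray:
    """Smooth each user's post counts with a polynomial growth curve.

    counts: (n_users, n_periods) array of posts per 6-month window.
    Returns the fitted curve for each user, same shape as counts.
    """
    t = np.arange(counts.shape[1])
    # np.polyfit accepts a 2-D y and fits one curve per column.
    coeffs = np.polyfit(t, counts.T, degree)   # (degree + 1, n_users)
    design = np.vander(t, degree + 1)          # (n_periods, degree + 1)
    return (design @ coeffs).T


def two_means(features: np.ndarray, n_iter: int = 50) -> np.ndarray:
    """Plain 2-means clustering; returns a 0/1 label per row."""
    # Deterministic start: the rows with the smallest and largest totals.
    totals = features.sum(axis=1)
    centers = features[[totals.argmin(), totals.argmax()]].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            features[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Hypothetical counts for 7 six-month windows (not the study's data).
low = np.array([1, 1, 1, 2, 2, 1, 1]) + (np.arange(90) % 2)[:, None]
high = np.array([1, 1, 3, 6, 8, 8, 7]) + (np.arange(10) % 3)[:, None]
counts = np.vstack([low, high])

fitted = fitted_trajectories(counts)
labels = two_means(fitted)

# Call the cluster whose average fitted curve ends higher the "high" group.
end_means = [fitted[labels == k][:, -1].mean() for k in range(2)]
high_risk = labels == int(np.argmax(end_means))
print(f"{high_risk.sum()} of {len(counts)} users in the rising-volume group")
```

A model-based approach estimates group membership and trajectory shape jointly and yields membership probabilities, whereas this two-stage sketch assigns each user to exactly one group.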

Results: We identified 2 distinct trajectories of post volume among r/SuicideWatch subreddit users. A small proportion of users (744/6163, 12.07%) was labeled as having a high risk of suicide, showing a sharp and lasting increase in post volume during the pandemic. By contrast, most users (5419/6163, 87.93%) were categorized as being at low risk of suicide, with consistently low post volume that increased only mildly during the pandemic. In terms of the frequency of most linguistic features, both groups showed increases at the initial stage of the pandemic. Subsequently, the rising trend continued in the high-risk group before declining, while the low-risk group showed an immediate decrease. One year after the pandemic outbreak, the 2 groups exhibited differences in their use of words related to the categories of personal pronouns; affective, social, cognitive, and biological processes; drives; relativity; time orientations; and personal concerns. In particular, the high-risk group was distinguished by its use of words related to anger (odds ratio [OR] 3.23, P<.001), sadness (OR 3.23, P<.001), health (OR 2.56, P=.005), achievement (OR 1.67, P=.049), motion (OR 4.17, P<.001), future focus (OR 2.86, P<.001), and death (OR 4.35, P<.001) during this stage.
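As a reading aid for the odds ratios above, the sketch below converts each reported OR to the log-odds coefficient it would correspond to, assuming the ORs come from a logistic regression of group membership on each word category (the abstract says only "regression analysis"). Under that assumption, OR = exp(coefficient). The dictionary name `reported_or` is hypothetical, and the unit of each predictor is not stated in the abstract.

```python
import math

# Odds ratios as reported in the Results (high-risk vs low-risk group).
reported_or = {
    "anger": 3.23,
    "sadness": 3.23,
    "health": 2.56,
    "achievement": 1.67,
    "motion": 4.17,
    "future focus": 2.86,
    "death": 4.35,
}

# Assumption: logistic regression, so coefficient = ln(OR).
log_coef = {name: math.log(value) for name, value in reported_or.items()}

for name, beta in log_coef.items():
    print(f"{name:>12}: OR {reported_or[name]:.2f} -> coefficient {beta:.2f}")
```

An OR above 1 means that higher use of the word category goes with higher odds of belonging to the high-risk group.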

Conclusions: Based on the 2 identified trajectories of post volume during the pandemic, this study divided users on the r/SuicideWatch subreddit into suicide high- and low-risk groups. Our findings indicated heterogeneous patterns of change in suicide risk in response to the pandemic. The high-risk group also demonstrated distinct linguistic features. We recommend conducting real-time surveillance of suicide risk using social media data during future public health crises to provide timely support to individuals at potentially high risk of suicide.

©Yifei Yan, Jun Li, Xingyun Liu, Qing Li, Nancy Xiaonan Yu.
Original language: English
Article number: e48907
Journal: Journal of Medical Internet Research
Volume: 26
Online published: 8 Aug 2024
Publication status: Published - 2024

Funding

The study was sponsored by the Research Grants Council of the Hong Kong Special Administrative Region, China (Collaborative Research Fund, Project No. C1031-18G). The sponsors had no further role in study design, in the collection, analysis, and interpretation of data, in the writing of the report, and in the decision to submit the article for publication.

Research Keywords

  • COVID-19 pandemic
  • Reddit
  • suicide risk
  • trajectory

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/

RGC Funding Information

  • RGC-funded
