Understanding Cohesion in Writings and Speech of Schizophrenia Patients



Schizophrenia is a mental disorder that affects a person’s thinking, speech, and actions. It can reduce a person’s ability to process auditory information and make decisions. Analyzing this disorder correctly is important because it may point to ways of reducing its negative effects on patients. Linguists and psychiatrists have been investigating language impairments and speech disorders in people with schizophrenia, which can be challenging. In this study, we attempt to address this issue by analyzing one linguistic feature, cohesion, in the writings and speech transcripts of schizophrenia patients. Our results show that combining referential cohesion with text easability or situation model features gives the best performance for the speech dataset, whereas for the writing dataset, readability alone or a combination of situation model and readability features yields the best performance.

  • Author Keywords

    • Schizophrenia
    • Machine-Learning-Algorithms
    • Binary-Classification
    • Coherence
    • Cohesion
    • Coh-Metrix
  • IEEE Keywords

    • Linguistics
    • Writing
    • Coherence
    • Tools
    • Mental disorders
    • Syntactics
    • Standards


Schizophrenia is a psychotic disorder whose main symptom is an impaired perception of reality [1]. It impairs the normal functioning of the brain so that the way an individual thinks, expresses himself or herself, or relates to others becomes distorted [2]. Furthermore, it can significantly impair functional abilities such as learning and social interaction [3].

Currently, more than 21 million people worldwide suffer from schizophrenia [4], and there is a need for a deeper understanding of the condition. Such understanding could be critical not only for assessing patients but also for identifying them, so that they can receive appropriate medical care in a timely manner.

Language can play a crucial role in identifying someone’s mental illness [5]. Previous studies have shown how language can help in diagnosing and predicting mental illness, e.g., identifying people who suffer from depression and anxiety [6]–[10], Alzheimer’s [11]–[13], post-traumatic stress disorder (PTSD) [14], or schizophrenia [15]–[17]. In schizophrenia specifically, there can be impaired coherence and an overall lack of contextual structure [18]. Hence, in this work, we investigate linguistic features related to cohesion in two datasets: (1) recorded and transcribed speech, and (2) written essays, with the end goal of identifying and classifying patients with schizophrenia. For this purpose, we trained two machine learning models, namely Support Vector Machine (SVM) and Random Forests (RF), to classify patients and controls. Our results show that, among all cohesion features, situation model and readability features performed best for the writing dataset, and a combination of referential cohesion, text easability, and situation model features performed best for the speech dataset.
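The classification setup described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn and NumPy are available, it uses randomly generated numbers in place of real Coh-Metrix scores, and the group names, column counts, model settings, and the helpers `make_synthetic_data` and `score_group_combinations` are all hypothetical.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical feature groups: names follow the text, column counts are made up.
FEATURE_GROUPS = {
    "referential_cohesion": 3,
    "text_easability": 4,
    "situation_model": 3,
    "readability": 2,
}


def make_synthetic_data(n_samples=60, seed=0):
    """Random stand-in for per-document cohesion scores; NOT real patient data."""
    rng = np.random.default_rng(seed)
    y = np.array([0, 1] * (n_samples // 2))  # 0 = control, 1 = patient
    columns, start = {}, 0
    for name, width in FEATURE_GROUPS.items():
        columns[name] = list(range(start, start + width))
        start += width
    X = rng.normal(size=(n_samples, start))
    return X, y, columns


def score_group_combinations(X, y, columns, cv=5):
    """Mean cross-validated accuracy for every non-empty combination of groups."""
    models = {
        "SVM": lambda: make_pipeline(StandardScaler(), SVC(kernel="linear")),
        "RF": lambda: RandomForestClassifier(n_estimators=25, random_state=0),
    }
    results = {}
    names = list(columns)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            # Select only the columns belonging to the groups in this combination.
            idx = [i for name in combo for i in columns[name]]
            for model_name, build in models.items():
                scores = cross_val_score(build(), X[:, idx], y, cv=cv)
                results[(model_name, combo)] = float(scores.mean())
    return results


X, y, columns = make_synthetic_data()
results = score_group_combinations(X, y, columns)
best = max(results, key=results.get)
print(best, round(results[best], 3))
```

Because the features here are random noise, the accuracies should hover around chance. With real cohesion scores in `X`, the same loop would show which feature-group combination works best for each classifier.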


Patients with schizophrenia have different cognitive symptoms, some of which involve problems with concentration and memory, which in turn may lead to disorganized speech or behavior. Diagnosing this disorder early and correctly is extremely important, as it may help alleviate the negative effects on patients. Previous work has investigated language impairments and speech disorders in people with schizophrenia; the availability of recordings of spoken language, as well as writings, now provides an opportunity to analyze the language used by patients systematically. Among the linguistic features of cohesion investigated in this study, we found that a combination of referential cohesion, text easability, and situation model features provides the biggest boost in classification performance for the LabSpeech dataset. For the LabWriting dataset, readability and situation model features perform best with SVM, and a combination of features such as RC and Read* performs best with RF. In the future, we will explore other features of cohesion such as connectives, which create cohesive connections between ideas and clauses and show how the text is organized. We also plan to collect more data from social media such as Reddit for a similar analysis. Finally, we plan to expand our analysis to other related mental health disorders.
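As a rough illustration of the connective features mentioned as future work, the sketch below counts connectives in a text and reports them as an incidence per 1,000 words, a common normalization for such counts. The connective list is a deliberately tiny, hypothetical sample, and the function `connective_incidence` is an assumption for illustration; it is not how Coh-Metrix computes its indices.

```python
import re

# Hypothetical, deliberately small connective list for illustration only.
CONNECTIVES = {"because", "therefore", "however", "although", "and", "but", "so", "then"}


def connective_incidence(text):
    """Return single-word connectives per 1,000 words (0.0 for empty text)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in CONNECTIVES)
    return 1000.0 * hits / len(words)


sample = "I stayed home because it rained. However, I went out later."
print(connective_incidence(sample))
```

A fuller version would need to handle multi-word connectives such as "as a result" and to separate connective types (causal, temporal, adversative), which this sketch does not attempt.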


We would like to thank the Coh-Metrix team for granting access to the tool and for their valuable support in using it for the analyses performed in this study. We would also like to thank Michael Compton for granting access to the writing and speech datasets.


Full paper reference:

A. AlQahtani, E. Kayi, and M. Diab, "Understanding Cohesion in Writings and Speech of Schizophrenia Patients," 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), Boca Raton, FL, USA, 2019, pp. 364-369.


Somayeh Nosrati was born in 1982 in Tehran. She holds a Master's degree in artificial intelligence from Khatam University of Tehran.


Professor Siavosh Kaviani was born in 1961 in Tehran. He had a professorship. He holds a Ph.D. in Software Engineering from the QL University of Software Development Methodology and an honorary Ph.D. from the University of Chelsea.


Nasim Gazerani was born in 1983 in Arak. She holds a Master's degree in Software Engineering from UM University of Malaysia.