A Rapid, Accurate and Machine-Agnostic Segmentation and Quantification Method for CT-Based COVID-19 Diagnosis

Abstract

COVID-19 has caused a global pandemic and become the most urgent threat to the entire world. Tremendous efforts and resources have been invested in developing diagnosis, prognosis and treatment strategies to combat the disease. Although nucleic acid detection has been mainly used as the gold standard to confirm this RNA virus-based disease, it has been shown that such a strategy has a high false negative rate, especially for patients in the early stage, and thus CT imaging has been applied as a major diagnostic modality in confirming positive COVID-19. Despite the various, urgent advances in developing artificial intelligence (AI)-based computer-aided systems for CT-based COVID-19 diagnosis, most of the existing methods can only perform classification, whereas the state-of-the-art segmentation method requires a high level of human intervention. In this paper, we propose a fully-automatic, rapid, accurate, and machine-agnostic method that can segment and quantify the infection regions on CT scans from different sources. Our method is founded upon two innovations: 1) the first CT scan simulator for COVID-19, by fitting the dynamic change of real patients’ data measured at different time points, which greatly alleviates the data scarcity issue; and 2) a novel deep learning algorithm to solve the large-scene-small-object problem, which decomposes the 3D segmentation problem into three 2D ones, and thus reduces the model complexity by an order of magnitude and, at the same time, significantly improves the segmentation accuracy. Comprehensive experimental results over multi-country, multi-hospital, and multi-machine datasets demonstrate the superior performance of our method over the existing ones and suggest its important application value in combating the disease.

  • Author Keywords
    • COVID-19
    • deep learning
    • segmentation
    • computerized tomography
  • IEEE Keywords
    • Computed tomography
    • Solid modeling
    • Lung
    • Three-dimensional displays
    • Image segmentation
    • COVID-19

Introduction

COVID-19, the infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has become a global pandemic and the most urgent threat facing our entire species. It has also posed a grand challenge to the scientific community: meeting the dire need for sensitive, accurate, rapid, affordable, and simple diagnostic technologies.

SARS-CoV-2 is an RNA virus and belongs to a broad family of viruses known as coronaviruses. It consists of a positive-sense single-stranded RNA and four main structural proteins: the spike (S) proteins, the envelope (E) proteins, the membrane (M) proteins, and the nucleocapsid (N) proteins. Accordingly, there are two ways to detect the virus from patients’ samples: through the detection of the nucleic acids of the virus’s RNA or through the detection of the antibodies produced by the patients’ immune system. Therefore, in the latest guideline of Diagnosis and Treatment of Pneumonitis Caused by COVID-19 (the seventh version) published by the Chinese government, the diagnosis of COVID-19 must be confirmed by either the reverse transcription-polymerase chain reaction (RT-PCR) or by gene sequencing.

However, due to the practical issues in sample collection and transportation, as well as the performance of the testing kits, especially at the initial presentation of the outbreak, such gold standards have been shown to have a high false-negative rate. For example, among the 1014 COVID-19 patients in Wuhan up to February 6, 2020 [1], only 59% (601 out of 1014) had positive RT-PCR results, whereas 88% (888 out of 1014) had positive chest computerized tomography (CT) scans. Among the ones (601) with positive RT-PCR, CT scan also achieved a 97% sensitivity (580 out of 601). This suggests that CT scans can not only detect most of the positive ones by RT-PCR but also detect a lot more cases (about 30% more in [1]).
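The percentages quoted above can be rechecked directly from the reported counts. The short sketch below does only that arithmetic; the variable names are illustrative, and reading "about 30% more" as the difference between the CT-positive and RT-PCR-positive shares of the cohort is an assumption, not something the source states explicitly.

```python
# Counts reported in the text for the Wuhan cohort [1].
total_patients = 1014
rt_pcr_positive = 601
ct_positive = 888
ct_positive_among_rt_pcr_positive = 580


def percent(numerator, denominator):
    """Return numerator / denominator as an unrounded percentage."""
    return 100.0 * numerator / denominator


# Share of the cohort with a positive RT-PCR result.
rt_pcr_rate = percent(rt_pcr_positive, total_patients)
# Share of the cohort with a positive chest CT scan.
ct_rate = percent(ct_positive, total_patients)
# CT sensitivity among the RT-PCR-positive patients.
ct_sensitivity = percent(ct_positive_among_rt_pcr_positive, rt_pcr_positive)
# Assumed reading of "about 30% more": extra share of the cohort flagged by CT.
extra_share = percent(ct_positive - rt_pcr_positive, total_patients)

for label, value in [
    ("RT-PCR positive", rt_pcr_rate),
    ("CT positive", ct_rate),
    ("CT sensitivity among RT-PCR positives", ct_sensitivity),
    ("Additional share detected by CT", extra_share),
]:
    print(f"{label}: {value:.1f}%")
```

Under that reading, the additional share comes out a little below 30 percentage points, which is consistent with the "about 30%" wording.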

Therefore, CT scans have been widely used in many countries and have particularly shown great success in China as one of the main diagnostic standards for COVID-19.

Conclusion

In this work, we proposed a preprocessing method to cast any lung CT scan into a machine-agnostic standard embedding space. We developed a highly accurate segmentation model on the standard embedding space. To train the model, we further designed a novel simulation model to depict the dynamic change of infection regions for COVID-19 and used this dynamic model to augment extra data, which improved the performance of our segmentation model.

The preprocessing method resolves the heterogeneity issue in the data and makes our method applicable to any dataset generated by any CT machine. The segmentation model finds a good tradeoff between the complexity of the deep learning model and the accuracy of the model. In addition, it indirectly captures and incorporates the regular morphologies of lung tissues, such as lung lobes, pulmonary arteries, veins, and capillaries. This makes our model both accurate and rapid. Interestingly, we noticed that our model can sometimes outperform human annotations when distinguishing tracheae and blood vessels. We used a similar segmentation idea for a recent project on segmenting breast tumors from DCE-MRI images. The two studies thus suggested that this idea could be a generic approach for many biomedical imaging tasks, which requires further investigation and confirmation. The simulation model resolves the commonly-seen data scarcity issue for biomedical imaging tasks, particularly for COVID-19, where high-quality, annotated data are rarely accessible or available. These three cornerstones contribute together to the success of our method.

The comprehensive experiments on multi-country, multi-hospital, and multi-machine datasets showed that our segmentation model achieves much higher Dice scores, recall, and worst-case performance, and runs much faster, than the state-of-the-art methods. Our model thus provides a fully-automatic, accurate, rapid, and machine-agnostic tool to meet the urgent clinical need to combat COVID-19.
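For readers unfamiliar with the two overlap metrics named above, the following is a minimal sketch of how Dice and recall are commonly computed for binary segmentation masks. It is not the authors' evaluation code: the function names, the flattened-list representation of a mask, and the conventions for empty masks are all assumptions made for illustration, and the toy masks are not real data.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives over voxels."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)
    return tp, fp, fn


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denominator = 2 * tp + fp + fn
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if denominator == 0 else 2 * tp / denominator


def recall(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Recall: TP / (TP + FN), the fraction of true infection voxels found."""
    tp, _, fn = confusion_counts(pred, truth)
    # Assumed convention: recall is 1.0 when there is nothing to find.
    return 1.0 if tp + fn == 0 else tp / (tp + fn)


# Toy flattened masks (1 = infection voxel); hypothetical values only.
truth_mask = [0, 1, 1, 1, 0, 0, 1, 0]
pred_mask = [0, 1, 1, 0, 0, 1, 1, 0]

print("Dice:", dice(pred_mask, truth_mask))
print("Recall:", recall(pred_mask, truth_mask))
```

A worst-case figure, in this framing, would simply be the minimum of such a score over all scans in a test set.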

There are three main directions to further improve our method. The first is to develop a federated learning platform. During our data collection process, we noticed that many hospitals have COVID-19 patients’ data but, for various reasons, are not allowed to share the data with outside researchers. Federated learning is an ideal solution to this situation: the model can be trained across different hospitals while each hospital holds its own data, and no data exchange is required. The second is to further increase the size of the dataset. Despite our efforts in collecting heterogeneous data and developing a preprocessing approach, our current dataset is still limited in size. More data will bring more information and thus lead to better models, which is our ongoing work. The third is to incorporate orthogonal sources of information into the model, such as big epidemiology data, so that ambiguous cases can be better diagnosed and the source and spread of the cases can be better traced. When the outbreak ends, such a multimodal learning platform can be used as a long-term warning system, serving as a ‘whistleblower’ for future coronaviruses yet to come.

Acknowledgment

The authors would like to thank Jiayu Zang, Weihang Song, Fengyao Zhu and Yi Zhao for their help on data preparation, annotation and transfer.

Bibliography

Author: L. Zhou et al.

Year: 2020

Title: A Rapid, Accurate and Machine-Agnostic Segmentation and Quantification Method for CT-Based COVID-19 Diagnosis

Published in: IEEE Transactions on Medical Imaging, vol. 39, no. 8, pp. 2638-2652, Aug. 2020

DOI: 10.1109/TMI.2020.3001810
