Show me your chest x-ray and I'll tell you your mortality risk

Scientists from the University of Applied Sciences Stralsund and Harvard University have jointly developed an algorithm that predicts mortality from thoracic x-ray images.

By Dr. Hubertus Glaser and Dr. Jörg Zorn

Today we focus on a question that arises where AI (artificial intelligence) and medicine overlap: What is a convolutional neural network?

Unless you deal with this topic on a daily basis, as a few of today's more avant-garde physicians do, you might need a moment. In the meantime, let's start with something a tad more conventional: the thoracic x-ray.

The conventional chest x-ray still leads the way

This diagnostic technique goes back to the German physicist (and mechanical engineer) Wilhelm Röntgen, who first observed x-rays on November 8, 1895. Even today, at the beginning of the third millennium, conventional x-ray diagnostics remain the imaging technique most frequently used to examine the human body. They are of particular importance for examining the lungs.

As a side note that underlines the importance of this technique: the German Federal Office for Radiation Protection recently published figures on x-ray use in the country. According to one of its reports, an estimated 135 million x-ray examinations were performed in Germany in 2015 alone. Of these, around 40% (the largest share) came from dentistry, followed by examinations of the skeletal system and then the thorax. The reports also indicate that the frequency of x-ray examinations in Germany remained almost constant between 2007 and 2015, at 1.7 per inhabitant per year. While the volume of conventional x-ray diagnostics decreased in the period under review, CT examinations increased by about 40% and radiation-free MRI examinations by as much as 60%.

X-rays, soon a gold mine?

We live in times of accelerated change, in which familiar, conventional practices sit side by side with ultra-new technologies. The dominant trend, in medicine as elsewhere, is digitalization and the advances that come with it, such as big data, machine learning, and AI. At least in the near future, however, these new technologies do not seem disruptive enough to make us redundant as physicians... and as humans. For several decades to come, they will probably have to coexist with technologies that, despite their age, remain essential to medical practice.

In this context, the thoracic x-ray may well have become a kind of medical gold mine at the beginning of the AI age. Its treasure is the prognostic information that lies dormant in routinely produced images and that algorithms could extract in the future at low cost.

Neural networks and deep learning: familiar catchwords... but what do they mean?

This brings us back to the convolutional neural network. We could not come up with a more concise and clear answer than the one we found on Wikipedia: "A convolutional neural network (CNN or ConvNet) is an artificial neural network. It is a concept inspired by biological processes in the field of machine learning. Convolutional neural networks are used in numerous modern artificial intelligence technologies, primarily for the machine processing of image or audio data."

Basically, the structure of a classical convolutional neural network consists of one or more convolutional layers, followed by a pooling layer. In principle, this unit can be repeated as often as desired. With sufficient repetitions, one speaks of Deep Convolutional Neural Networks, which fall into the area of Deep Learning.
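To make the "convolutional layer followed by a pooling layer" unit more tangible, here is a minimal sketch in plain Python. It is only an illustration under our own assumptions: the function names, the toy 5x5 image, and the hand-written edge filter are hypothetical, and this is not the architecture of the network discussed later in this article. In a real CNN the filter values are learned from data rather than written by hand.

```python
# One "convolution -> activation -> pooling" unit, written without libraries.
# Illustrative sketch only; all names and values here are hypothetical.

def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses, set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete windows at the border are dropped."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


def conv_block(image, kernel):
    """One repeatable unit: convolution, activation, pooling."""
    return max_pool(relu(conv2d(image, kernel)))


# Toy 5x5 grayscale "image": dark on the left, bright on the right.
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [[-1.0, 1.0],
                 [-1.0, 1.0]]

features = conv_block(image, vertical_edge)
print(features)  # [[2.0, 0.0], [2.0, 0.0]]
```

The pooled output is non-zero only where the dark-to-bright edge lies. Stacking several such units, each working on the output of the previous one, is what makes a network "deep".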

A comparison with the architecture of a multi-layer perceptron is useful here. And since the term comes up all the time, we prefer to quote a definition of deep learning directly: "Deep learning is a class of machine learning algorithms that uses multiple layers to progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify the concepts relevant to a human such as digits or letters or faces."

The news: "Determining the Probability of Dying by Artificial Intelligence"

After this crash course in a few necessary concepts, we arrive at the news in question. Researchers at the Stralsund University of Applied Sciences, in collaboration with Harvard University, have published a prognostic study [1]. In it, the scientists created an artificial neural network, more specifically a CNN, which can independently evaluate thoracic x-ray images and predict long-term mortality.

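The researchers used data from the x-ray screening arms of two large clinical trials, the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) and the National Lung Screening Trial (NLST), to develop a CNN called CXR-risk, which stratifies individuals according to their overall mortality risk.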

The AI was trained with over 85,000 images (initial and first follow-up examinations) and follow-up data from almost 42,000 PLCO participants. For internal validation (a 20% random sample comprising over 10,000 PLCO participants) and external testing (NLST), only the initial images were used to predict each participant's later course.
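The data split described above can be sketched as follows. This is a minimal illustration, assuming that the 20% validation sample is drawn at the level of participants (so that no person appears in both sets); the article does not spell out the exact procedure. The record layout, the visit labels, and the function name are hypothetical.

```python
import random


def split_participants(participant_ids, validation_fraction=0.2, seed=0):
    """Randomly assign participants (not single images) to training or validation."""
    ids = sorted(set(participant_ids))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_validation = round(len(ids) * validation_fraction)
    validation_ids = set(ids[:n_validation])
    training_ids = set(ids[n_validation:])
    return training_ids, validation_ids


# Hypothetical records: (participant_id, visit), two x-rays per participant.
images = [(pid, visit)
          for pid in range(100)
          for visit in ("initial", "follow_up")]

training_ids, validation_ids = split_participants(pid for pid, _ in images)

# Training uses both the initial and the first follow-up image.
training_images = [img for img in images if img[0] in training_ids]

# Validation uses the initial image only, mirroring the description above.
validation_images = [img for img in images
                     if img[0] in validation_ids and img[1] == "initial"]

print(len(training_images), len(validation_images))  # 160 20
```

Splitting by participant rather than by image avoids evaluating the model on people it has already seen during training.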

Image data indicate a mortality risk of up to 53% over 12 years

With the information the algorithm extracts from a single chest x-ray image alone, a risk assessment (the CXR risk score) can now be made with regard to long-term survival. The algorithm requires less than half a second for this. Existing x-ray images can thus be evaluated for the probability of death at little or no cost, according to the press release issued by the Stralsund University of Applied Sciences.

Using the CXR risk score, the scientists stratified the data into quintiles.
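How such a stratification works in principle can be shown with a short sketch. It assumes a simple rank-based split into five equally sized groups and uses made-up scores and outcomes; the cut-offs actually used in the study are not given in this article, and the function names are hypothetical.

```python
def assign_quintiles(scores):
    """Label each score 1 (lowest fifth) to 5 (highest fifth) by its rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, index in enumerate(order):
        labels[index] = rank * 5 // len(scores) + 1
    return labels


def mortality_by_group(labels, died):
    """Share of participants who died within each risk group."""
    totals, deaths = {}, {}
    for label, has_died in zip(labels, died):
        totals[label] = totals.get(label, 0) + 1
        deaths[label] = deaths.get(label, 0) + int(has_died)
    return {group: deaths[group] / totals[group] for group in sorted(totals)}


# Made-up risk scores and outcomes for ten fictitious participants.
scores = [0.1, 0.9, 0.3, 0.7, 0.5, 0.2, 0.8, 0.4, 0.6, 0.95]
died = [False, True, False, True, False, False, False, False, True, True]

labels = assign_quintiles(scores)
print(labels)                            # [1, 5, 2, 4, 3, 1, 4, 2, 3, 5]
print(mortality_by_group(labels, died))  # {1: 0.0, 2: 0.0, 3: 0.5, 4: 0.5, 5: 1.0}
```

In this toy data the share of deaths rises from the lowest to the highest group, which is the pattern a useful risk score should produce.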

In the unadjusted analysis, the risk in the highest risk class was 18 times (PLCO) or 15 times (NLST) that in the lowest (hazard ratio, HR). Even after adjustment for radiological findings and risk factors, the association proved to be robust (HR 5 for PLCO and HR 7 for NLST). In addition to overall mortality, similar associations were found for three specific causes of death.

In the PLCO data, the most frequent cause of death was cardiovascular disease, from which 4.1% of the participants died; in the NLST data set, lung cancer was the most frequent cause (2.1%).
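For readers less familiar with the hazard ratios quoted above, the following sketch shows the basic idea with a crude estimate: deaths per person-year in one group divided by deaths per person-year in another. This is a deliberate simplification that assumes a constant hazard over time. The study's own figures come from statistical models not described in this article, and the numbers below are made up purely for illustration.

```python
def crude_hazard_rate(deaths, person_years):
    """Deaths per person-year of follow-up (assumes a constant hazard)."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return deaths / person_years


def crude_hazard_ratio(deaths_high, person_years_high,
                       deaths_low, person_years_low):
    """Ratio of the crude hazard rates of a high-risk and a low-risk group."""
    return (crude_hazard_rate(deaths_high, person_years_high)
            / crude_hazard_rate(deaths_low, person_years_low))


# Made-up example: 60 deaths in 1,000 person-years (high-risk group)
# versus 10 deaths in 2,000 person-years (low-risk group).
hazard_ratio = crude_hazard_ratio(60, 1000.0, 10, 2000.0)
print(round(hazard_ratio, 1))  # 12.0
```

A hazard ratio of 12 in this toy example means that, at any given moment, members of the high-risk group die at twelve times the rate of the low-risk group. An adjusted analysis additionally accounts for other factors, which is why adjusted hazard ratios are typically smaller than unadjusted ones.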

Co-author Prof. Thomas Mayrhofer from the Stralsund University of Applied Sciences assumes that knowledge about the individualized mortality risk can be used to make informed decisions. The authors hope that the AI score could also motivate high-risk individuals to counteract premature death through preventive measures, regular screening participation, and lifestyle interventions.

Impact of these findings on the future of the medical practice

It is quite conceivable that routine diagnostic images could one day simply be uploaded to dedicated AI risk analysis websites, according to a commentary on this study that we found on medscape.com. Lead author Dr. Michael Lu from Harvard Medical School, however, is cautious: "The technology is there, but we need clinical trials that prove that this information actually helps with decision-making and that it is able to improve health."

He also notes that it is unclear how many patients would actually want to know their own 12-year mortality risk.

Two invited commentators [2] expressed some concern about the study. They do not expect AI to prevent unwanted outcomes in the near future, and they question the value of AI-derived information as long as it is not (yet) clear what can be done with it or what a worthwhile prevention strategy could be. For them, the study demonstrates, despite the undoubted potential of deep learning for clinical assessment and care, "the gap between the development of a scientifically flawless algorithm and its meaningful application in real life".

Sources:
1. Lu MT et al. Deep Learning to Assess Long-term Mortality From Chest Radiographs. JAMA Netw Open 2019;2(7):e197416. doi:10.1001/jamanetworkopen.2019.7416
2. Tsega S, Cho HJ. Prediction and Prevention Using Deep Learning. JAMA Netw Open 2019;2(7):e197447. doi:10.1001/jamanetworkopen.2019.7447