Machine learning, a branch of artificial intelligence, is emerging as a key tool in the development of medical technology. Its capacity to handle large volumes of data could enable more accurate diagnoses, drawing on the background gathered from the history of previously archived cases.
During the COVID-19 pandemic, many technological initiatives based on this approach have emerged with the aim of helping to fight the disease. However, according to a recent study, these technologies have not lived up to the clinical standards under which cases of this magnitude should be treated.
Machine learning to combat COVID-19, under the magnifying glass
Hundreds of COVID-19 machine learning models presented in scientific papers during 2020 were examined in a study led by researchers from the University of Cambridge. According to its conclusions, none of them met the requirements for reliably detecting or diagnosing cases of infection, a result driven by methodological errors, poorly prepared data samples, biases, and a lack of reproducibility.
All of these machine learning models, originally presented as solutions for diagnosing cases of infection or for producing prognoses, were subjected to a detailed review through their respective scientific manuscripts, published between January 1 and October 3, 2020. Some of the models in this sample claimed to be able to work from chest X-rays and computed tomography (CT) images. According to the study, very few of the documents had formally undergone peer review.
Specifically, only 62 studies had been validated through some form of review, but none showed potential for clinical use. This sample was obtained after applying quality filters to 415 preselected studies, out of 2,212 initially identified.
A technology with a future in medicine
The researchers concluded that machine learning has the potential to become a powerful tool in the fight against the pandemic. Progress has been made so far, but given the problems detected, they state that there is still a long way to go before a tool that is reliable by clinical standards can be obtained.
Among the main weaknesses that can be addressed today, the Cambridge team warned against the naive use of public data sets, which carries a significant risk of introducing bias into the analyses. On the same point, they also emphasize that the data sample must be diverse and sufficiently large, and must include independent external data sets, so that the model remains useful across different demographic groups, as the sketch below illustrates.
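To make the external-validation point concrete, here is a minimal sketch of the kind of check the study calls for: comparing performance on an internal held-out split against an independent external cohort. The data inputs are placeholders, and the logistic-regression model is an illustrative stand-in, not the method of any reviewed paper.

```python
# Minimal sketch: why internal test scores alone can mislead.
# X, y are features/labels from a public training set; X_ext, y_ext
# come from an independent external cohort (all placeholders here).
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def evaluate_internal_and_external(X, y, X_ext, y_ext, seed=0):
    """Compare a held-out internal split with an independent
    external cohort (e.g., data from a different hospital)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    auc_internal = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    auc_external = roc_auc_score(y_ext, model.predict_proba(X_ext)[:, 1])

    # A large gap between the two scores suggests the model learned
    # cohort-specific artifacts (scanner, site, demographics) rather
    # than genuine disease signal, the kind of bias the study warns about.
    return auc_internal, auc_external
```

A model that scores well internally but collapses on the external cohort exhibits exactly the failure mode the reviewers describe.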
The urgency inherent in this pandemic gave rise to quick solutions. However, in work as delicate as science and clinical practice, matters such as the rigor with which data is collected, and the impartiality under which initiatives meant to be universal in character must operate, cannot be left to chance.
And although quality data samples are among the pillars of a successful formula, the reports attesting to such progress must also contain enough documentation to be reproducible, together with external validation. Only then, according to the research group, can clinical trials be conducted with models that guarantee cost-effectiveness, technical feasibility and, of course, the necessary clinical validity.
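As a rough illustration of the reproducibility documentation the group asks for, a run could record its seeds, library versions and data splits in a manifest. The field names and dataset identifier below are hypothetical, not a standard schema.

```python
# Sketch of a run manifest: fixed seeds plus a record of the
# environment and data splits. Field names are illustrative only.
import json
import platform
import random

import numpy as np
import sklearn

SEED = 42
random.seed(SEED)
np.random.seed(SEED)

manifest = {
    "seed": SEED,
    "python": platform.python_version(),
    "numpy": np.__version__,
    "scikit-learn": sklearn.__version__,
    "dataset": "public-covid-cxr-v1",  # hypothetical identifier
    "splits": {"train": 0.75, "internal_test": 0.25,
               "external_test": "independent site cohort"},
}

with open("run_manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)
```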