When scientists carry out research on a given topic, they often start by reviewing previous study findings. Conducting systematic literature reviews or meta-analyses can be very challenging and time-consuming, as there are often huge amounts of research focusing on different topics, which may not always be relevant to a researcher’s work.
Researchers have recently developed a machine learning framework that could significantly speed up this process, by automatically browsing through numerous past studies and compiling high-quality literature reviews. This framework, called ASReview, could prove particularly useful for conducting research during the COVID-19 pandemic.
Researchers and experts now face a major challenge in staying up to date with the latest developments in their field. Reading all the new literature is very time-consuming, especially when it is done systematically. These systematic ways of reading the literature, called systematic reviews, often lead to impactful scientific publications because they are exhaustive summaries of current evidence.
One of the researchers who developed ASReview has carried out several literature reviews throughout his academic career, so he was well aware of how time-consuming the review process can be. In collaboration with experts in machine learning, engineering, and information management at Utrecht University, he set out to develop a tool that would significantly speed up the process of conducting systematic reviews and meta-analyses.
The machine learning framework is optimized to find a metaphorical ‘needle’, or multiple ‘needles’, in a haystack. Because scientists publish large amounts of research on many different topics, automatically identifying the studies most relevant to a given topic can be highly valuable. To do this, the researchers trained their machine learning model using an interactive approach called active learning.
In classical review processes, a researcher is presented with an article and must decide manually whether it is relevant or not, and generally continues until he or she has viewed all relevant articles. The challenge for the machine learning framework is to minimize the number of irrelevant articles shown to the researcher, which can save a lot of time in the literature review process.
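The screening loop described above can be illustrated with a short Python sketch. This is a toy example under stated assumptions, not ASReview's actual implementation: the naive-Bayes-style scorer, the function names (`score`, `screen`, `ask_reviewer`), and the sample titles are all hypothetical. The idea is that after every decision by the reviewer, the model is updated and the record it currently considers most likely to be relevant is shown next.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def score(text, labeled):
    """Log-odds that `text` is relevant, from word counts with add-one smoothing."""
    rel, irr = Counter(), Counter()
    for doc, is_relevant in labeled:
        (rel if is_relevant else irr).update(tokenize(doc))
    vocab = set(rel) | set(irr) | set(tokenize(text))
    rel_total = sum(rel.values()) + len(vocab)
    irr_total = sum(irr.values()) + len(vocab)
    return sum(
        math.log((rel[w] + 1) / rel_total) - math.log((irr[w] + 1) / irr_total)
        for w in tokenize(text)
    )

def screen(pool, seed_labels, ask_reviewer, budget):
    """Show the reviewer the top-ranked unlabeled record, learn from the answer, repeat."""
    labeled = list(seed_labels)            # (text, is_relevant) pairs known so far
    unlabeled = list(pool)
    shown = []
    for _ in range(min(budget, len(unlabeled))):
        best = max(unlabeled, key=lambda t: score(t, labeled))
        unlabeled.remove(best)
        label = ask_reviewer(best)         # the human decision
        labeled.append((best, label))      # the model sees it on the next pass
        shown.append((best, label))
    return shown

# Hypothetical toy data: three relevant titles hidden among four irrelevant ones.
RELEVANT = {
    "active learning speeds up systematic review screening",
    "machine learning ranks abstracts for systematic review",
    "screening abstracts with active learning for meta-analysis",
}
POOL = sorted(RELEVANT) + [
    "soil moisture sensors in maize irrigation",
    "bridge fatigue under cyclic traffic load",
    "irrigation scheduling for maize under drought",
    "traffic load models for bridge design",
]
# The reviewer starts with one known relevant and one known irrelevant record.
SEEDS = [
    ("text mining reduces workload in systematic review screening", True),
    ("fatigue cracks in steel bridge girders", False),
]

def ask_reviewer(text):
    # Stand-in for the human reviewer's judgment.
    return text in RELEVANT

shown = screen(POOL, SEEDS, ask_reviewer, budget=3)
for text, label in shown:
    print("relevant  " if label else "irrelevant", text)
```

On this toy pool, the first three records shown to the reviewer are the three relevant ones, so the four irrelevant titles never need to be read. Real screening tools work with thousands of records and more robust text models, but the cycle of ranking, asking, and updating is the same idea.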
Most existing machine learning systems are trained to accurately classify individual images, texts, or other data (i.e., to place data in different categories based on their features). In contrast, ASReview was trained to analyze a collection of documents and determine which ones are relevant to a given research topic and which ones are not.
The COVID-19 pandemic required medical guidelines to be developed, and new treatments to be found, in record time. Medical practitioners working non-stop in hospitals had little time left to read the literature.
The use of interactive machine learning approaches such as active learning is expected to grow rapidly in the coming years. It is crucial to ensure that these approaches are fully transparent and explainable. In the near future, the researchers aim to show that interactive machine learning can also be applied responsibly in other domains, such as the review of legal documents and court verdicts.