An algorithm that brings hope

Professors Andrzej Zieleziński and Jakub Barylski in their laboratory, working on the project

Researchers from AMU have developed Vclust, an algorithm that analyses viruses more quickly than supercomputers can and radically accelerates the classification of viruses and the analysis of genetic data. In just four hours, it completes work that would otherwise take four years. An interview with AMU Professors Andrzej Zieleziński and Jakub Barylski about the algorithmic revolution.

What was the basis for your program? What made you decide to work on Vclust? Would it be fair to say that it was “dissatisfaction” with research programs that took too long?


Prof. Andrzej Zieleziński: Yes, you could say that. The starting point was impatience: analyses that took weeks or months effectively blocked research. The second, equally important problem was that each existing tool calculated different measures of genome similarity and was recommended for different tasks. For instance, one method was recommended by the International Committee on Taxonomy of Viruses (ICTV) for classifying virus species, while another was recommended for grouping genomes from environmental studies. In practice, this meant that several programs had to be installed and run for different analyses, which was time-consuming and complicated. Vclust was developed to combine these approaches into one consistent tool that is fast, accurate, and universal.

Today, the amount of biological data, especially viral data, is growing exponentially. We discover about a million new viruses every year. Is Vclust the solution to this chaos?

Prof. Jakub Barylski: Indeed, every year, we learn about hundreds of thousands, even millions, of new viral sequences. At first glance, it might seem that each newly discovered genome represents a new virus, but often, these are variants of already-known genomes. This is where Vclust comes in handy. The program compares each new sequence with a large database of previously described virus genomes to show whether we are dealing with a new virus or a variant of one we already know. This allows us to make sense of the flood of data and quickly grasp the actual diversity of viruses.
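The idea of checking a new sequence against a database of known genomes can be illustrated with a toy Python sketch. It is not Vclust's method: it swaps in a simple Jaccard similarity over shared k-mers (short substrings), and the function names, the example sequences, and the `threshold` value are all hypothetical choices made only for illustration.

```python
def kmers(seq, k=4):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(new_seq, database, threshold=0.8, k=4):
    """Label new_seq as a variant of its closest database entry, or as new.

    threshold is a hypothetical cutoff for this toy example, not a value
    taken from Vclust or from any taxonomy guideline.
    """
    query = kmers(new_seq, k)
    best_name, best_score = None, 0.0
    for name, ref_seq in database.items():
        score = jaccard(query, kmers(ref_seq, k))
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return ("variant", best_name, best_score)
    return ("new", None, best_score)


# Made-up sequences standing in for a database of known genomes.
database = {
    "phage_A": "ATGCGTACGTTAGCCGATAGCTAGGCTA",
    "phage_B": "GGCATTCAGGCCTAAGGCATCCGATTGA",
}

print(classify("ATGCGTACGTTAGCCGATAGCTAGGCTA", database))  # identical to phage_A
print(classify("TTTTTTTTTTTT", database))  # shares no 4-mers with the database
```

A sequence identical to a database entry is reported as a variant of that entry, while a sequence that shares no k-mers with any entry is reported as new. Real genomes are far longer, and a production tool needs alignment-based similarity measures and much faster search than this pairwise loop.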

Read more in University Life