Aggregated Gut Viral Catalogue (AVrC)
dc.contributor.affiliation | University of Helsinki-Galperina, Anastasia | |
dc.contributor.author | Galperina, Anastasia | |
dc.date.accessioned | 2025-04-29T13:59:55Z | |
dc.date.issued | 2024-06-02 | |
dc.description | Despite the importance of the gut virome in human health and disease, identifying viral sequences in metagenomic datasets remains computationally challenging. Up to 99% of viral reads lack significant alignments to known viral genomes because viruses are underrepresented in reference databases. Recent machine learning tools can detect novel viral sequences from features such as k-mer composition or genomic signatures, but they are limited to classifying assembled contigs into simplistic viral/non-viral categories. Several large-scale efforts have mined human gut metagenomes to establish viral catalogues, including the Gut Virome Database (33,242 viral OTUs), the Cenote Human Virome Database (45,033 OTUs), and the Gut Phage Database (142,809 OTUs). However, these catalogues have not been consistently compared for quality, diversity, and completeness. There is an unmet need to harmonize the available gut viral sequences into a unified resource against which novel viruses can be compared. The Aggregated Gut Viral Catalogue (AVrC) addresses this gap by harmonizing and aggregating previous mining efforts into a comprehensive resource that supports exploration of human gut viral diversity and easier comparison of newly discovered viral sequences. | |
dc.identifier | https://doi.org/10.5281/zenodo.11426065 | |
dc.identifier.uri | https://datakatalogi.helsinki.fi/handle/123456789/4548 | |
dc.rights.license | cc-by-4.0 | |
dc.subject | virome | |
dc.subject | gut microbiome | |
dc.title | Aggregated Gut Viral Catalogue (AVrC) | |
dc.type | dataset |