Since the onset of the COVID-19 pandemic, there has been an unprecedented effort in genomic epidemiology to sequence the SARS-CoV-2 virus, study its transmission, and examine its molecular evolution. This effort has been facilitated by public repositories such as the Global Initiative on Sharing All Influenza Data (GISAID) [1] and NCBI’s GenBank [2], which host millions of SARS-CoV-2 sequence records. As of July 2023, GISAID contains 15.7 million sequences, while 7.7 million have been deposited in GenBank.

However, genomic epidemiology seeks to go beyond phylogenetic analysis by linking genetic information to patient demographics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include some patient-related information, such as the location of the infected host, the granularity of this data and the inclusion of demographic and clinical details are inconsistent. Additionally, the extent to which patient-related metadata is reported in published sequencing studies remains largely unexplored. Therefore, it is essential to assess the extent and quality of patient-related metadata reported in SARS-CoV-2 sequencing studies.

Such an assessment is complicated by the limited linkage between published articles and sequence repositories, which hinders the identification of relevant studies. Traditional search strategies based on keywords may also miss relevant articles. To overcome these challenges, this study proposes the use of an automated classifier to identify relevant articles.