
Institute of Diagnostic Virology (IVD)

Laboratory for Biocuration

Today the scientific community produces more data than ever. New methods and advances in computing technology allow huge amounts of data to be stored and shared. However, precise analysis of these data requires more than advanced interpretation methods; ensuring data quality is also an essential prerequisite for reliable results.

The Laboratory for Biocuration is dedicated to data curation and evaluation and uses these data for practice-oriented analyses. A central area of activity is the creation and analysis of genome information of viral pathogens, the evaluation of data quality and the enrichment of data with functional metadata. These data sets are used to characterize the causative pathogens and to classify new variants. In combination with the relevant metadata, the genome sequences are used for temporal and spatial tracking of outbreak events and for predicting relationships between individual outbreaks.


Molecular and phylogenetic analyses of influenza viruses

In collaboration with the WOAH, the FAO and the National Reference Laboratory for Avian Influenza at the FLI, methods of molecular epidemiology are applied to outbreaks of Highly Pathogenic Avian Influenza. Virus genome information can be used to track outbreaks effectively at the molecular level. Together with metadata, the genome sequence is used to track outbreak events with temporal and spatial resolution and to predict relationships between individual outbreaks. At the same time, the pathogens can be examined for mutations and possible zoonotic potential.

In this project, extensive phylogenetic analyses of various avian and porcine influenza viruses are performed. The project is of major importance because avian influenza outbreaks in particular threaten wild bird populations and cause major economic losses in the poultry industry. The aim is to determine the origin of these viruses and their possible precursors. In addition, new reassortants that emerge through the exchange of whole genome segments are identified, and genotypes are classified and named. Spatiotemporal analyses are performed to track the spread of influenza viruses and to clarify possible relationships between outbreaks. By examining the geographic and temporal patterns of avian influenza outbreaks, we gain a deeper understanding of their transmission dynamics and identify potential factors contributing to their spread. Genome sequences are screened for mutations that may indicate changes in key traits such as virulence, receptor binding, or adaptation to different host species. This allows us to assess the zoonotic potential of these influenza viruses, i.e., their ability to cross species barriers and thus pose a potential threat to human health.

The project includes the following tasks:

  • Analysis of sequences from avian and porcine influenza viruses and influenza viruses from other animal hosts.
  • Phylogenetic analysis and determination of evolutionary origin and potential ancestral viruses.
  • Determination of novel reassortants and classification and designation of genotypes.
  • Spatiotemporal analyses to track the spread of influenza viruses and identify links between outbreaks.
  • Screening for mutations indicative of changes in characteristics such as virulence, receptor binding, or adaptations to other host species.
  • Evaluation of zoonotic potential.
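
The genotype classification task above can be illustrated with a minimal sketch. It assumes that each of the eight influenza A genome segments has already been assigned a lineage label by phylogenetic analysis, and that a genotype is defined as the combination of these eight labels. The function names, the lineage labels and the table of known genotypes are hypothetical placeholders for illustration, not our actual nomenclature or pipeline.

```python
from __future__ import annotations

# The eight genome segments of influenza A viruses.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "MP", "NS")


def constellation(segment_lineages: dict[str, str]) -> tuple[str, ...]:
    """Order the per-segment lineage labels into a fixed-length tuple."""
    missing = [s for s in SEGMENTS if s not in segment_lineages]
    if missing:
        raise ValueError("no lineage assigned for: " + ", ".join(missing))
    return tuple(segment_lineages[s] for s in SEGMENTS)


def classify_genotype(
    segment_lineages: dict[str, str],
    known_genotypes: dict[tuple[str, ...], str],
) -> tuple[str | None, str | None, list[str]]:
    """Return (genotype, closest known genotype, segments that differ).

    The genotype is None when the constellation is not in the table,
    which marks the virus as a candidate novel reassortant.
    """
    key = constellation(segment_lineages)
    if key in known_genotypes:
        name = known_genotypes[key]
        return name, name, []

    # Unknown constellation: find the known genotype sharing most segments.
    closest_name, closest_diff = None, list(SEGMENTS)
    for known_key, name in known_genotypes.items():
        diff = [s for s, a, b in zip(SEGMENTS, key, known_key) if a != b]
        if closest_name is None or len(diff) < len(closest_diff):
            closest_name, closest_diff = name, diff
    return None, closest_name, closest_diff
```

A virus whose constellation differs from a known genotype in a single segment would be reported with that segment listed, which points to the segment that was probably exchanged. Whether such a candidate is a true reassortant still has to be confirmed by phylogenetic analysis of the individual segments.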

Data curation of records from influenza databases

Within this project, various public databases are comprehensively screened in order to identify influenza-specific records. This involves a meticulous examination of the publicly available sequence databases, with the aim of identifying context-relevant, non-redundant records from the summary retrievals. To ensure the quality and reliability of the data, a thorough quality check and enrichment process is performed. This involves verifying the plausibility and completeness of the records and enriching them with relevant metadata to increase the value of the data for further analysis. Once the data have undergone the necessary quality checks and enrichment, the records are preserved within our internal structures. This step is crucial for ensuring the long-term availability and accessibility of the data. Additionally, we disseminate the enriched data sets internally to facilitate collaboration and knowledge sharing within our organization.

The project includes the following tasks:

  • Screening of various public databases for influenza-specific records.
  • Identification of context-relevant non-redundant records from summary retrievals from the publicly available sequence databases.
  • Quality check and enrichment of records with relevant metadata.
  • Data preservation within the internal organizational unit and internal dissemination of the enriched data sets.
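
As a rough illustration of the quality check and the removal of redundant records, the following sketch works on records represented as plain dictionaries. The field names, the list of required fields and the definition of redundancy (same isolate, same segment, identical sequence) are assumptions made for this example and do not describe our internal data model.

```python
from __future__ import annotations

# Hypothetical record layout; these field names are illustrative assumptions.
REQUIRED_FIELDS = ("accession", "isolate", "segment", "sequence",
                   "host", "country", "collection_date")
IUPAC_NUCLEOTIDES = set("ACGTRYSWKMBDHVN")


def check_record(record: dict) -> list[str]:
    """Return a list of quality problems; an empty list means the record passed."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    sequence = (record.get("sequence") or "").upper()
    if sequence and not set(sequence) <= IUPAC_NUCLEOTIDES:
        problems.append("sequence contains non-IUPAC characters")
    return problems


def curate(records: list[dict]) -> tuple[list[dict], dict[str, list[str]], list[str]]:
    """Split records into accepted, rejected (with reasons) and redundant ones."""
    accepted: list[dict] = []
    rejected: dict[str, list[str]] = {}
    redundant: list[str] = []
    seen: set[tuple[str, str, str]] = set()

    for record in records:
        accession = record.get("accession") or "<unknown>"
        problems = check_record(record)
        if problems:
            rejected[accession] = problems
            continue
        # Assumed redundancy rule: same isolate, same segment, identical sequence.
        key = (record["isolate"], record["segment"], record["sequence"].upper())
        if key in seen:
            redundant.append(accession)
            continue
        seen.add(key)
        accepted.append(record)
    return accepted, rejected, redundant
```

Keeping the rejected and redundant accessions, rather than silently dropping them, makes it possible to trace later why a public record is absent from the curated data set.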

Data analysis of Next Generation Sequencing (NGS) data for molecular characterization of emerging influenza viruses

This project covers the analysis of next-generation sequencing data. It comprises several key components, including the analysis of raw data, the derivation of consensus sequences, and the annotation of these sequences. In addition, minor variants and structural variations in the sequences are determined. At the end of the project, sequence data are submitted to repositories and made publicly available, so that they remain accessible for future reference and analysis. Furthermore, new sequencing methods, platforms, and commercial systems for sample handling and preparation are continuously evaluated to determine their effectiveness and suitability for our tasks. This project is instrumental in detecting and evaluating emerging influenza virus strains based on their genome sequences.

The project includes the following tasks: 

  • Analysis of high-throughput sequencing data and of real-time nanopore sequencing data: raw data analysis, derivation of consensus sequences, annotation, and determination of minor variants and structural variations.
  • Submission of sequence data to repositories.
  • Evaluation of new sequencing methods, platforms, and commercial systems for sample handling and preparation.
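
The derivation of a consensus sequence and the detection of minor variants can be sketched in a strongly simplified form. The example assumes that reads are already aligned to a common coordinate system and padded with "-" where they do not cover a position; real pipelines work on alignment files and also take base qualities and strand bias into account. The function name and the default thresholds are illustrative assumptions, not validated settings.

```python
from __future__ import annotations

from collections import Counter


def call_consensus(
    aligned_reads: list[str],
    min_depth: int = 2,
    minor_threshold: float = 0.05,
) -> tuple[str, list[tuple[int, str, str, float]]]:
    """Derive a majority consensus and list minor variants per position.

    Minor variants are returned as
    (1-based position, consensus base, variant base, variant frequency).
    """
    if not aligned_reads:
        return "", []

    consensus: list[str] = []
    minor_variants: list[tuple[int, str, str, float]] = []
    length = max(len(read) for read in aligned_reads)

    for pos in range(length):
        # Count only unambiguous bases; gaps and padding do not add depth.
        counts = Counter(
            read[pos] for read in aligned_reads
            if pos < len(read) and read[pos] in "ACGT"
        )
        depth = sum(counts.values())
        if depth < min_depth:
            consensus.append("N")  # too little coverage to call a base
            continue

        ranked = counts.most_common()
        major_base = ranked[0][0]
        consensus.append(major_base)
        for base, count in ranked[1:]:
            frequency = count / depth
            if frequency >= minor_threshold:
                minor_variants.append((pos + 1, major_base, base, round(frequency, 3)))

    return "".join(consensus), minor_variants
```

Positions with insufficient coverage are masked with "N" instead of being guessed, so that gaps in the data remain visible in the submitted sequence.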

Research Projects (Externally funded)

We are participating in the following externally funded research projects: 


  • PREPMEDVET: Response to animal diseases and bioterrorism: Preparedness and Response in an Emergency context to Pathogens of MEDical and VETerinary importance
    The PREPMEDVET project is funded by the BMBF under the framework program "Civil Security - Preparedness and Rapid Response to Biological threats".
  • VEO: Versatile Emerging Infectious Disease Observatory
    The VEO project is funded by the EU under the Horizon 2020 program. 
  • KAPPA-FLU: Ecology and biology of HPAIV H5
    The KAPPA-FLU project is funded by the EU under the Horizon Europe program.
  • COMPARE: COllaborative Management Platform for detection and Analyses of (Re-)emerging and foodborne outbreaks in Europe
    The COMPARE project was funded by the EU under the Horizon 2020 program.
  • DetektiVir: Ad-hoc-de-novo-detection of viral pathogens with adaptive diagnostics for the prevention of epidemics.
    The DetektiVir project was funded by the BMBF under the framework program "Civil Security".