Database development for the analysis of diversity and biogeography of symbionts.

Postdoctoral position. Supervisors: Jörg Peplies (Ribocon) and Andrés Moya (UniValencia), with input from Michael Richter (MPIMM) and Carlos Llorens (Biotechvana). Host: Ribocon. Secondment internships: UniValencia and Biotechvana.


The investigation of symbiont biodiversity and biogeography using molecular marker genes is essential for the analysis of their structure, composition and host interactions. Classical clone-based sequence analysis and high-throughput tag sequencing provide easy access to hundreds or thousands of tags for marker genes such as the small- and large-subunit ribosomal RNA (rRNA). The bottleneck is the bioinformatic analysis of this flood of data: data processing, quality assessment and phylogenetic classification. Specialized knowledge databases for rRNAs, such as the RDP II and SILVA projects, exist to support a general (higher-level) classification of rRNA sequences. However, these general-purpose databases lack highly resolved classifications of sequences from pro- and eukaryotic symbiotic communities, which often appear as clusters in phylogenetic trees.


This research task will focus on the adaptation, optimization and extension of the SILVA rRNA databases to investigate the phylogenetic position of rRNA sequences from symbiotic communities. The project will provide a standardized, integrated and curated dataset, both for the Symbiomics network and for the research community at large, for investigating the biogeography and co-evolution of symbiotic organisms and their hosts.

Key methods

This is a bioinformatic research task that requires an experienced researcher (ER) with expertise in DNA sequence analysis and in ribosomal RNA taxonomy and phylogeny, including the corresponding computational methods. By gaining insights into Europe’s largest database for ribosomal RNA, the ER can extend their expertise in data processing, integration, standardization and exploitation, and learn new computational techniques. Furthermore, by working in an SME, the ER will gain the expertise and soft skills needed for entrepreneurship and market-focused project management. The task will deliver specialized databases, including manually curated alignments and trees, for detailed phylogenetic analyses of symbiotic communities. Besides the primary sequence information, special emphasis will be put on the availability and correctness of the contextual (meta)data provided for each sequence, both in the public domain and by the project partners. The contextual data will be evaluated with respect to the emerging ‘minimum information standards’ of the Genomic Standards Consortium.