Genome assembly and annotation

For genome analysis projects, our standard bioinformatics services include genome assembly and functional annotation. Custom analyses are also routinely carried out by our highly skilled bioinformatics team. Analyses are commonly offered in combination with our next-generation sequencing services, but we are also happy to assist with bioinformatics-only projects. Our main drive is to make sure your research questions are answered in the best possible manner. We aim to present results transparently and are always open to discussing the outcomes in depth with our clients!

Genome assembly

Our bioinformatics department has developed state-of-the-art assembly pipelines that follow either a reference-based or a de novo approach (and in some cases a combination of both). The ultimate goal is to provide our customers with a high-quality finished genome sequence. To accomplish this, we offer different sequencing technologies, including those of Illumina, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). The resulting reads are merged into finished genomes using tools that are well established in the field. Among these are highly cited packages such as SPAdes (Bankevich et al., 2012) for draft assembly of short reads and ABySS (Jackman et al., 2017) for long-read assembly. Our internally developed packages SSPACE (Boetzer et al., 2011) and GapFiller (Boetzer and Pirovano, 2012) are key tools for finishing genomes accurately. The resulting assemblies are subject to an extensive quality assessment procedure, which includes error correction and structural improvement of the contigs. Manual interpretation and genome closure are also offered. In this way we can guarantee that our customers receive exceptionally accurate and complete assemblies at a very competitive price!

Delivered output

  • Assembled contig and scaffold sequences in FastA format.
  • Accession Golden Path (AGP) file describing the linkage between contigs in the scaffolds.
  • De novo assembly report containing a summary of the assembly results and quality statistics.
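
As an illustration of the kind of quality statistic such a report typically summarizes, the following minimal Python sketch computes the number of sequences, the total length and the N50 from a FastA file. It is not BaseClear's pipeline; the function names and the toy contigs are assumptions made for this example only.

```python
from io import StringIO


def read_fasta_lengths(handle):
    """Return the length of every sequence in a FastA-formatted text handle."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


# Toy contigs, invented for illustration (lengths 10, 6 and 4).
example = StringIO(">contig_1\nACGTACGTAC\n>contig_2\nACGTAC\n>contig_3\nACGT\n")
lengths = read_fasta_lengths(example)
print(len(lengths), sum(lengths), n50(lengths))  # prints: 3 20 10
```

To run this on a delivered file, the StringIO object can be replaced with an opened FastA file.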

Genome annotation

Although an assembly can provide insight into the general architecture of a genome, gene annotations are the key to elucidating the functionalities of microbial organisms. Annotation at BaseClear is performed in two steps: structural and functional annotation. Structural annotation involves the prediction of open reading frames (ORFs) and, in the case of eukaryotes, the correct intron-exon structure. For prokaryotes we use the Prodigal software (Hyatt et al., 2010) to find bacterial and archaeal genes, whereas for eukaryotes (mainly yeast and fungi) we use the Augustus software (Stanke et al., 2003) to predict genes as well as the correct gene model. In the latter case, RNA-Seq expression data can be added to improve the detection of alternative transcripts and splicing.
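
To make the idea of an ORF concrete, here is a deliberately naive Python sketch that scans all six reading frames for stretches running from an ATG start codon to the first in-frame stop codon. It only illustrates the concept: dedicated gene finders such as Prodigal and Augustus rely on far more elaborate models, and the function names, the min_codons parameter and the toy sequence are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Yield (strand, start, end) for every ATG...stop ORF in six frames.

    Coordinates are 0-based and end-exclusive, and refer to the strand that
    was scanned (for "-" that is the reverse complement). The stop codon is
    counted in the ORF length.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the currently open start codon
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        yield strand, start, i + 3
                    start = None


# Toy sequence, invented for illustration: ATG AAA TTT TAG starts at position 2.
orfs = list(find_orfs("CCATGAAATTTTAGCC"))
print(orfs)  # prints: [('+', 2, 14)]
```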

Subsequently, we assign functional annotations to genes, tRNAs and rRNAs, and also predict features such as signal peptides. Our service includes an extensive search of the most commonly used functional databases, including SwissProt/UniProt, EC-enzyme, GeneOntology, CAZy and Pfam. KEGG identifiers are also returned for each predicted gene. The standard output includes GenBank and GFF files, which are ideal starting points for the NCBI genome submission process. Other formats can be provided upon request. The output files can be imported into any third-party genome browser, but we highly recommend that our customers apply for a trial license for the Genome Explorer (https://genome-explorer.com), our tool of choice for (advanced) genome mining and comparative genomics.

Delivered output

  • Table containing full annotation for predicted coding sequence regions.
  • GenBank and GFF annotation formats.
  • Extended annotation report containing a summary of the assembly results and quality measures.
  • Access to the BaseClear Genome Explorer, which allows interactive analysis of the results.
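
For a quick first look at a delivered GFF file, the following Python sketch counts the annotated features per type. It assumes a GFF3-style layout (nine tab-separated columns with the feature type in the third, comment lines starting with "#", and an optional "##FASTA" section at the end); the function name and the example records are invented for illustration.

```python
from collections import Counter
from io import StringIO


def count_feature_types(handle):
    """Count features per type (column 3) in a GFF3-style text handle."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence data
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and other directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # ignore malformed lines in this sketch
        counts[fields[2]] += 1
    return counts


# Hypothetical records, invented for illustration.
example = StringIO(
    "##gff-version 3\n"
    "contig_1\texample\tgene\t10\t400\t.\t+\t.\tID=gene_1\n"
    "contig_1\texample\tCDS\t10\t400\t.\t+\t0\tID=cds_1;Parent=gene_1\n"
    "contig_1\texample\tgene\t600\t900\t.\t-\t.\tID=gene_2\n"
    "##FASTA\n"
    ">contig_1\n"
    "ACGT\n"
)
counts = count_feature_types(example)
print(dict(counts))
```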

Convinced? Get in touch
