In addition to NGS services for the sequencing itself, BaseClear also offers bioinformatics services. Our bioinformatics specialists offer you exactly what you need, based on CLCbio software in combination with other state-of-the-art methods.
SSPACE is a stand-alone program for scaffolding pre-assembled contigs using paired-read data. It is unique in offering the possibility to manually control the scaffolding process. By using the distance information of paired-end and/or mate-pair data, SSPACE is able to assess the order, distance and orientation of your contigs and combine them into scaffolds. Currently we offer this as a command-line tool written in Perl. The input consists of pre-assembled contig sequences (FASTA) and NGS paired-read data (FASTA or FASTQ). The final scaffolds are provided in FASTA format. SSPACE has shown excellent performance on various datasets. The results have been published in the high-impact journal Bioinformatics (2011, vol. 27(4), pp. 578-9), and the method is also frequently cited in other papers.
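To illustrate how the distance information of a read pair can translate into a distance between two contigs, here is a minimal Python sketch. It is not the SSPACE implementation (which is written in Perl and may compute this differently). It assumes a -> <- pair whose first read maps forward on contig A and whose second read maps in reverse on contig B, that the insert size is measured from the start of the first read to the end of the second, and that coordinates are 0-based. The function name and argument layout are hypothetical.

```python
from statistics import median

def estimate_gap(insert_size, len_a, links):
    """Estimate the gap between contig A and contig B from linking pairs.

    links is a list of (start_on_a, end_on_b) tuples: the 0-based start of
    the forward read on contig A and the exclusive end of the reverse read
    on contig B. Assumed layout, not SSPACE's own data structure.
    """
    # For each pair: insert = (bases from read start to end of A) + gap + end_on_b
    gaps = [insert_size - (len_a - start_a) - end_b for start_a, end_b in links]
    # The median is less sensitive to a few mis-mapped pairs than the mean.
    return median(gaps)

example_links = [(800, 200), (850, 240), (780, 190)]
gap = estimate_gap(insert_size=500, len_a=1000, links=example_links)
```

A negative estimate would suggest that the two contigs overlap rather than being separated by a gap.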
ALGORITHM SSPACE premium
After the initial release of SSPACE basic, we have been working intensively to improve both the algorithm and its speed. The latest improvements are always released first in the SSPACE premium version, making it the most advanced scaffolder! At present the basic protocol has been upgraded to version 2.0 (which corresponds to premium version 1.0). The premium protocol has also been upgraded to version 2.0 and includes many improvements!
SSPACE Premium v1.0 was our first premium release, and included the following advantages compared to the basic protocol:
- Pre-filtering step to remove linkages with repetitive contigs. This reduces mistakes in placing repeats within scaffolds.
- Modification of the linkage-ratio to take the contig length into account. This was done to normalize for linkage over-estimation. Larger contigs tend to have more links than smaller contigs, which leads to a bias in the linkage-ratio (this improvement is especially an advantage for mate-pairs).
- The alignment with Bowtie (Langmead et al., 2009) was modified to allow multithreaded analysis.
- The user can specify to perform a Bowtie gapped-alignment.
- Inclusion of additional orientations of the paired reads, including -> <-, <- ->, <- <- and -> ->. Users do not have to convert the read direction of the input files.
- A script was included to convert .sam or .bam files to .tab files.
- A customized input option was added to accept a tab-delimited file format (containing the positions of the paired reads on the contigs).
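As an illustration of the orientation support listed above, the following Python sketch shows the conversion a user would otherwise have to do by hand: bringing a read pair from any of the four orientations into the -> <- layout by reverse-complementing one or both reads. This is only a sketch under our own assumptions, not the code used inside SSPACE. The function names and the two-letter orientation codes (FR for -> <-, RF for <- ->, FF for -> ->, RR for <- <-) are chosen here for the example.

```python
# Translation table for complementing bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

# For each orientation: (flip read 1?, flip read 2?) to reach -> <- (FR).
_FLIP = {
    "FR": (False, False),  # -> <-  already in the target layout
    "RF": (True, True),    # <- ->  flip both reads
    "FF": (False, True),   # -> ->  flip the second read
    "RR": (True, False),   # <- <-  flip the first read
}

def to_forward_reverse(read1, read2, orientation):
    """Convert a read pair with the given orientation to the -> <- layout."""
    flip1, flip2 = _FLIP[orientation]
    if flip1:
        read1 = reverse_complement(read1)
    if flip2:
        read2 = reverse_complement(read2)
    return read1, read2
```

For example, `to_forward_reverse("AACG", "TTGA", "RF")` reverse-complements both reads, while the same call with `"FR"` returns them unchanged.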
In SSPACE Premium v2.0 we have further improved the protocol, and this version includes the following improvements:
- In addition to Bowtie, the user can now also choose the alignment program BWA (Li and Durbin, 2010). We have included both the standard BWA protocol for short reads and the BWA-SW protocol, which is especially suited to scaffolding with long reads (e.g. Roche 454 or third-generation sequencers).
- Consequently, reads containing undefined nucleotides (N’s) are no longer automatically removed, since BWA and BWA-SW can handle these reads. With Bowtie, reads containing N’s are simply skipped.
- SSPACE is now able to read in gzipped files (*.gz) directly.
- The mapping of the reads against the contigs is now done in a multithreaded manner, thus increasing the overall speed of the process.
- Output scaffolds are now represented in a more elegant format (each line consists of 60 bp instead of one consecutive line per scaffold).
- Contig extension is performed using k-mer overlap: reads that can possibly be used for extension are split into smaller k-mer fragments. This gives a more reliable result than the previous version, which used full reads to extend contigs.
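Two of the points above are easy to picture with a small Python sketch: splitting a read into k-mer fragments, and writing a scaffold with 60 bp per line. This is an illustration only. It does not show how SSPACE selects reads or uses the fragments during extension, the value of k is left to the caller, and the function names are hypothetical.

```python
def kmers(read, k):
    """Split a read into all overlapping fragments of length k."""
    # A read shorter than k yields an empty list.
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def wrap_fasta(name, sequence, width=60):
    """Format one FASTA record with at most `width` bases per line."""
    lines = [f">{name}"]
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"

fragments = kmers("ACGTA", 3)
record = wrap_fasta("scaffold1", "A" * 130)
```

Here `fragments` holds three 3-mers, and `record` consists of a header line followed by sequence lines of 60, 60 and 10 bases.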
In exchange for the SSPACE premium version we ask you to donate a contribution of € 1500,-. These contributions allow us to continue working on our programs, which (hopefully) serve a larger NGS community! Part of the donation is also used to guarantee that users receive good support from our bioinformaticians.