We are proud to present a whole-genome multi-locus sequence typing (wgMLST) schema for Listeria monocytogenes. The schema is available in BioNumerics. In combination with our cloud-based Calculation Engine, it makes typing Listeria monocytogenes isolates down to strain level from whole genome sequencing data easily accessible to everyone.
The schema
In collaboration with several Listeria experts, our scientists have taken up the challenge of creating a genome-wide schema for multi-locus sequence typing (MLST) of Listeria monocytogenes isolates. Starting from more than 150 publicly available reference sequences and plasmids that capture the known diversity of Listeria monocytogenes, 4797 loci have been defined to make up the pan-genomic schema. Because the schema also captures accessory loci, its discriminatory power is increased. On average, 97.8% of the loci of the reference genomes and 95.8% of the loci of the plasmids are recovered within the schema, resulting in an average genome coverage of 88.1%.
From this set of loci, different subsets can be derived, such as the core loci or the traditional MLST loci. At the same time, the extended schema allows for the detection of subtype- or outbreak-specific markers, enabling more powerful classification and outbreak definition tools.
Starting from the annotated reference genomes, an in-house developed schema creation procedure uses a sampling-based multi-reciprocal BLAST procedure to determine the sets of alleles that make up the stable loci in the accessory genome. A per-locus allele assessment procedure then determines the central prototype allele, and thus the definition of the locus. The final schema contains both core loci and accessory loci. In addition, the classical MLST loci are added to the schema to obtain maximal consistency with classical and novel multi-locus sequence typing initiatives for Listeria monocytogenes.
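To illustrate how a subset such as the core loci might be derived from a larger set of loci, and why accessory loci can add discriminatory power, here is a minimal sketch in Python. It is not the BioNumerics implementation: the locus names, the presence threshold, the toy allele profiles, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# An allele profile maps a locus name to an allele number,
# or to None when the locus was not detected in that isolate.
Profile = Dict[str, Optional[int]]


def core_loci(profiles: List[Profile], min_presence: float = 0.95) -> List[str]:
    """Return the loci detected in at least `min_presence` of the isolates."""
    if not profiles:
        return []
    all_loci = sorted({locus for p in profiles for locus in p})
    core = []
    for locus in all_loci:
        present = sum(1 for p in profiles if p.get(locus) is not None)
        if present / len(profiles) >= min_presence:
            core.append(locus)
    return core


def allele_distance(a: Profile, b: Profile,
                    loci: Optional[List[str]] = None) -> int:
    """Count loci where both isolates have a call and the alleles differ."""
    if loci is None:
        loci = sorted(set(a) | set(b))
    diff = 0
    for locus in loci:
        x, y = a.get(locus), b.get(locus)
        if x is not None and y is not None and x != y:
            diff += 1
    return diff


# Hypothetical toy data: three isolates, three shared loci, one accessory locus.
profiles = [
    {"LOC_1": 1, "LOC_2": 4, "LOC_3": 2, "LOC_ACC_1": 7},
    {"LOC_1": 1, "LOC_2": 4, "LOC_3": 2, "LOC_ACC_1": 9},
    {"LOC_1": 1, "LOC_2": 5, "LOC_3": 2, "LOC_ACC_1": None},
]

core = core_loci(profiles, min_presence=1.0)
dist_all = allele_distance(profiles[0], profiles[1])         # all loci
dist_core = allele_distance(profiles[0], profiles[1], core)  # core loci only
```

In this toy data set the first two isolates are identical on the core loci and differ only at the accessory locus, so a comparison restricted to core loci cannot separate them while a comparison over all loci can.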
The samples
Using BioNumerics and our high-throughput calculation infrastructure, analyzing whole genome sequencing data for Listeria monocytogenes is now within everyone’s reach. The Cloud Calculation Engine, managed by us and accessible through BioNumerics, offers a high-throughput environment for all your sample processing needs. Its quality-controlled de novo assembly options allow you to assemble whole-genome sequencing data easily, without the need for local computing power. The two allele detection procedures, assembly-based and assembly-free, allow fast and reliable allele detection and overcome the issues caused by the use of draft assemblies. With turnaround times below 20 minutes per sample and the ability to process many samples simultaneously, high-performance computing is brought to your desktop with a few simple clicks.
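The text above does not say how the calls from the two allele detection procedures are reconciled. As a purely hypothetical sketch, one simple rule would be to keep a call when both procedures agree or when only one of them produced a call, and to leave the locus unresolved when they disagree. The function name, the rule itself, and the toy data are assumptions, not the Calculation Engine's behavior.

```python
from typing import Dict, Optional

# Allele calls per locus; None means no allele was called at that locus.
Calls = Dict[str, Optional[int]]


def consensus_calls(assembly_based: Calls, assembly_free: Calls) -> Calls:
    """Combine two sets of allele calls locus by locus.

    Assumed rule (not taken from the source): keep a call when both
    procedures agree or when only one produced a call; report None
    when the two procedures disagree.
    """
    merged: Calls = {}
    for locus in sorted(set(assembly_based) | set(assembly_free)):
        a = assembly_based.get(locus)
        b = assembly_free.get(locus)
        if a is None:
            merged[locus] = b
        elif b is None or a == b:
            merged[locus] = a
        else:
            merged[locus] = None  # conflicting calls: leave unresolved
    return merged


# Hypothetical calls for one isolate
example = consensus_calls(
    {"LOC_1": 3, "LOC_2": None, "LOC_3": 8},
    {"LOC_1": 3, "LOC_2": 5, "LOC_3": 9},
)
```

Here `LOC_1` is confirmed by both procedures, `LOC_2` is filled in by the assembly-free call alone, and `LOC_3` is left unresolved because the two calls conflict.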
The results
The whole-genome multi-locus sequence typing schema for Listeria monocytogenes has been evaluated and tested by our biologists and collaborators [1], and is also being used in outbreak surveillance [2]. Great care has been taken to create an analysis procedure that minimizes sample artifacts while maintaining a discriminatory power that exceeds that of the core genome schema and other current typing standards. The schema has been applied to real-life outbreak data, as illustrated in our application note on wgMLST Listeria.
Try it on your own data now!
To start using this wgMLST approach for typing of Listeria monocytogenes, simply request a Calculation Engine project. For an easy introduction, we have a Listeria monocytogenes wgMLST tutorial available online. We look forward to your discoveries!
Acknowledgements
Part of this work has been done in the framework of the Patho-NGen-Trace project. Patho-NGen-Trace is funded by the EC under the 7th Framework Programme of the European Union.
References
1: Sylvain Brisse, et al. (Publication in preparation).