Single isolate, single CGE services

In this exercise we will determine the species, Multilocus Sequence Type, antibiotic resistance profile, and plasmid content of four isolates from hospital-acquired infections.

Case description

During the same week, a number of patients, all admitted to the same ward of a Danish hospital, acquire a urinary tract infection. The infection is very difficult to treat, and the first antibiotics prescribed have no effect. Meanwhile, more and more patients get infected...

The bacteria causing the infection are isolated from three patients. Using traditional phenotypic tests, the species is determined to be Klebsiella pneumoniae. DNA is purified and sequenced on the Illumina MiSeq platform, generating paired-end reads. Each of the three sets of paired-end reads is then assembled, yielding three draft genomes.


The draft genomes needed for completing this exercise are available on the USB stick handed out to you. You will need four files:




Day_1_and_2/Klebsiella/Unknown_draft_genome.fsa (for use at the end of the exercise)

You can take a look at the content of the files by opening them in a text editor. We recommend Sublime Text 2 (http://www.sublimetext.com/2), which is a good editor for both Windows and Macs. Notice that the files contain contigs in FASTA format.
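If you prefer a quick summary over scrolling through a text editor, the FASTA layout (a header line starting with ">" followed by one or more sequence lines) can be read with a few lines of Python. The following is only a minimal sketch: the function names `read_fasta` and `summarize` are made up for this illustration, and the two toy contigs are invented rather than taken from the exercise files.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle):
    """Return the number of contigs, total length and longest contig."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths, default=0),
    }


# Toy input (invented): two short contigs, the first split over two lines.
example = StringIO(">contig_1\nACGTACGT\nACGT\n>contig_2\nGGCC\n")
print(summarize(example))
# {'contigs': 2, 'total_bp': 16, 'longest_bp': 12}
```

To try it on a real assembly, open one of the draft genome files (for example Day_1_and_2/Klebsiella/Unknown_draft_genome.fsa) with `open(...)` and pass the file handle to `summarize`.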

Note: For practical reasons, we work with assembled genomes rather than short sequence reads. All the CGE web services also accept short sequence reads as input, but uploading takes longer and the assembly step also takes some time, so we will work with pre-assembled genomes during this workshop.


Confirm that the species is K. pneumoniae using KmerFinder (https://cge.cbs.dtu.dk/services/KmerFinder/) for one or two of the assemblies. Note the links to "Instructions" and "Output", which may guide you through the submission step and help you interpret the results. You may keep the default settings: "winner takes it all" as scoring method and "bacteria organisms" as the reference database.

Note: Due to heavy use of the CGE services, your jobs may be put in a queue rather than being processed right away. Depending on the number of concurrent users, it may take anywhere from approximately 5 minutes to several hours for the jobs to finish. Since we do not have time to wait for hours, enter your email address in the box that appears once you have submitted your job. You will then receive an email with a link to the result page when the job has finished. Meanwhile, go to this page with links to all results, which have been run in advance.

Next, determine the antibiotic resistance gene profiles of the isolates using ResFinder (http://cge.cbs.dtu.dk/services/ResFinder-2.1/). Again, note the links to "Instructions" and "Output". Under "Select Antimicrobial configuration" you can choose to look only for particular types of resistance genes or for all types. By default, all types are selected. Two thresholds control which hits are reported: "Select threshold for %ID" sets how identical a sequence in the input genome must be to a gene in the ResFinder database, and "Select minimum length" sets how much of the database gene must be covered by the input sequence. For this exercise, it is recommended that you keep all default settings.
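To make the two thresholds concrete, here is a small sketch of how an identity cutoff and a length (coverage) cutoff act together on a list of hits. It is not how ResFinder is implemented; the `Hit` class, the gene labels, the hit values, and the cutoffs of 90% and 0.6 are all assumptions chosen for illustration, not the service defaults.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str            # hypothetical gene label, not a real database entry
    identity_pct: float  # % identical bases within the aligned region
    aligned_len: int     # number of reference-gene bases covered by the hit
    gene_len: int        # full length of the reference gene


def coverage(hit):
    """Fraction of the reference gene covered by the hit."""
    return hit.aligned_len / hit.gene_len


def filter_hits(hits, min_identity_pct, min_coverage):
    """Keep only hits that pass both the identity and the coverage cutoff."""
    return [
        h for h in hits
        if h.identity_pct >= min_identity_pct and coverage(h) >= min_coverage
    ]


# Invented hits: a perfect match, a partial match, and a divergent match.
hits = [
    Hit("gene_A", 100.0, 861, 861),
    Hit("gene_B", 99.2, 400, 1000),
    Hit("gene_C", 85.0, 900, 900),
]

# Illustrative cutoffs only (assumed values).
kept = filter_hits(hits, min_identity_pct=90.0, min_coverage=0.6)
print([h.gene for h in kept])
# ['gene_A']
```

Here gene_B is dropped because it covers only 40% of the reference gene, and gene_C because its identity falls below the cutoff. Lowering either threshold reports more hits, at the cost of more partial or divergent matches.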

  • Which resistance genes are reported? Are they 100% identical to the genes in the ResFinder database? Do any of the genes found in the input genomes only partially cover the corresponding gene in the ResFinder database? Do the three K. pneumoniae isolates contain the same resistance genes? You might want to fill out this table (PDF or DOCX) to get an overview.
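Once you have noted the genes per isolate, comparing the three profiles is a matter of set operations. The sketch below shows one way to do it; the function name `compare_profiles`, the isolate names, and the gene labels are placeholders, so replace them with what you actually read off the ResFinder result pages.

```python
def compare_profiles(profiles):
    """Split gene lists into genes shared by all isolates and the rest.

    profiles maps an isolate name to a list of gene names.
    Returns (shared, extras), where extras maps each isolate to the
    genes it carries that are not found in every isolate.
    """
    gene_sets = {name: set(genes) for name, genes in profiles.items()}
    shared = set.intersection(*gene_sets.values()) if gene_sets else set()
    extras = {name: genes - shared for name, genes in gene_sets.items()}
    return shared, extras


# Placeholder data (invented), not results from the exercise.
profiles = {
    "isolate_1": ["gene_A", "gene_B", "gene_C"],
    "isolate_2": ["gene_A", "gene_B"],
    "isolate_3": ["gene_A", "gene_B", "gene_D"],
}

shared, extras = compare_profiles(profiles)
print(sorted(shared))
# ['gene_A', 'gene_B']
for name in sorted(extras):
    print(name, sorted(extras[name]))
# isolate_1 ['gene_C']
# isolate_2 []
# isolate_3 ['gene_D']
```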

Examine the presence of plasmids in the three isolates using PlasmidFinder (https://cge.cbs.dtu.dk/services/PlasmidFinder/). Select the database "Enterobacteriaceae". You may keep the default settings for %ID threshold.

  • Looking at the results from PlasmidFinder, how can the differences in resistance genes among the three isolates be explained?

Tomorrow, we will use high-resolution methods to generate phylogenetic trees. Today, we will settle for examining the Multilocus Sequence Type (ST) to determine whether the different isolates are likely to be part of an outbreak.

Determine the ST of the three isolates using the MLST tool (https://cge.cbs.dtu.dk//services/MLST/). Remember to select the appropriate MLST scheme according to the species.

  • What is the ST of the three isolates? Is there anything you need to be concerned about, e.g. loci that are not 100% identical to the MLST alleles in the database, or maybe even partial loci?
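An ST is shorthand for a particular combination of allele numbers across the loci of the scheme, so two isolates share an ST only if every locus matches. The sketch below illustrates that comparison. The locus names are assumed to be the seven loci of the K. pneumoniae scheme, and the allele numbers and helper names (`same_profile`, `differing_loci`) are invented for illustration; use the alleles reported on your own MLST result pages.

```python
# Assumed locus names for the seven-gene K. pneumoniae MLST scheme.
LOCI = ("gapA", "infB", "mdh", "pgi", "phoE", "rpoB", "tonB")


def same_profile(a, b):
    """True if two allele profiles agree at every locus."""
    return all(a[locus] == b[locus] for locus in LOCI)


def differing_loci(a, b):
    """List the loci at which two allele profiles disagree."""
    return [locus for locus in LOCI if a[locus] != b[locus]]


# Invented allele numbers, not real results.
isolate_x = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 7)))
isolate_y = dict(zip(LOCI, (1, 2, 3, 4, 5, 6, 9)))

print(same_profile(isolate_x, isolate_x))
# True
print(differing_loci(isolate_x, isolate_y))
# ['tonB']
```

A single differing locus is enough to give a different ST. Note that this comparison assumes every allele is a perfect, full-length match; a locus that is not 100% identical or only partially covered should be checked by hand before drawing conclusions.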


Back at the hospital, major cleaning and disinfection efforts are initiated in an attempt to stop the outbreak. However, the following week another (newly admitted) patient displays symptoms of urinary tract infection. DNA from the bacteria isolated from this patient is also sequenced, and the reads are assembled into a draft genome (the file Unknown_draft_genome.fsa).

  • Is the new case likely to be part of the previous outbreak?