Single isolate, single CGE services

In this exercise we will determine the species, Multilocus Sequence Type, antibiotic resistance profile, and plasmid content of four isolates from hospital-acquired infections.

Case description

During the same week, a number of patients, all admitted to the same ward of a Danish hospital, acquire a urinary tract infection. The infection is very difficult to treat, and the antibiotics prescribed first have no effect. Meanwhile, more and more patients get infected...

The bacteria causing the infection are isolated from three patients. DNA is purified and sequenced on the Illumina MiSeq platform, generating paired-end reads. Each of the three sets of paired-end reads is assembled into a draft genome, and these draft genomes are the starting point for this exercise.

Note: For practical reasons, we work with assembled genomes rather than short sequence reads. The CGE web services also accept short sequence reads as input, but uploading takes longer and the assembly step adds further time, which is why we will work with pre-assembled genomes during this workshop.


The draft genomes you will need for completing this exercise can be found on the USB stick handed out to you. These are the files you will need:




Exercise1/Unknown_draft_genome.fsa (for use towards the end of the exercise)

Take a look at the content of the files by opening them in a text editor. We recommend Sublime Text, which is an excellent editor for both Windows and Mac. Notice that the files contain contigs in FASTA format.
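If you prefer a quick summary over scrolling through the file, the contigs can also be counted with a few lines of Python. This is only a minimal sketch: the function names are our own, and the example data is a made-up two-contig string rather than one of the workshop genomes. To use it on a real file, pass an opened file handle instead of the StringIO object.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle):
    """Return the number of contigs, the total length, and the longest contig."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest_bp": max(lengths, default=0),
    }


# Made-up example data, not a real draft genome.
example = StringIO(">contig_1\nACGTACGT\nACGT\n>contig_2\nGGCC\n")
print(summarize(example))
```

Each line starting with ">" begins a new contig, and all following lines up to the next ">" belong to that contig's sequence.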


Determine the species of the isolates using KmerFinder (https://cge.cbs.dtu.dk/services/KmerFinder/). Note the links to "Instructions" and "Output", which may guide you through the submission step and help you interpret the results. You may keep the default settings: "winner takes it all" as scoring method and "bacteria organisms" as the reference database.
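To get a feeling for the idea behind k-mer based species identification, the toy sketch below counts how many k-mers a query sequence shares with each reference sequence and reports the reference with the highest count. It is not KmerFinder's actual implementation: the function names, the tiny sequences, and the choice k=5 are assumptions made for illustration, and the "winner takes it all" scoring is not implemented here.

```python
def kmers(seq, k):
    """Return the set of all overlapping substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, references, k=5):
    """Score each reference by the number of k-mers it shares with the query."""
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(ref, k))
        for name, ref in references.items()
    }
    # On a tie, max() returns the first reference with the top score.
    return max(scores, key=scores.get), scores


# Hypothetical toy references, far shorter than real genomes.
references = {"ref_a": "ACGTACGTAC", "ref_b": "TTTTTGGGGG"}
winner, scores = best_match("ACGTACG", references)
print(winner, scores)
```

A real service works with a large precomputed reference database and much longer k-mers, but the principle is the same: the more k-mers the input shares with a reference, the stronger the evidence for that species.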

Note: Due to heavy use of the CGE services, your jobs may be put in a queue rather than being processed right away. Depending on the number of concurrent users, it may take anywhere from approximately 5 minutes to several hours for the jobs to finish. Since we do not have time to wait for hours, enter your email address in the box that appears once you have submitted your job. You will then receive an email with a link to the result page when the job has finished. In the meantime, go to this page with links to all the results, which have been generated in advance.

1. Which species does KmerFinder predict?

Next, determine the antibiotic resistance gene profiles of the isolates using ResFinder (http://cge.cbs.dtu.dk/services/ResFinder/). Again, note links to "Instructions" and "Output". For this exercise, we will only look for "Acquired antimicrobial resistance genes" and not "Chromosomal point mutations", as the latter is not available for this species.

Tick the box next to "Acquired antimicrobial resistance genes". Under "Select Antimicrobial configuration" you can choose to look only for particular types of acquired resistance genes or for all types. By default, all types are selected. Two thresholds control which hits are reported: how identical the sequence found in the input genome must be to a gene in the ResFinder database (Select threshold for %ID), and how much of the database gene must be covered (Select minimum length). For this exercise, we recommend that you keep the default settings.
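The effect of the two thresholds can be illustrated with a small sketch. The Hit class, the passes function, the gene names, and the numbers below are all hypothetical, and the threshold values 90 and 0.6 are placeholders, so check the ResFinder page for the defaults actually in use. Coverage is computed here as the alignment (HSP) length divided by the length of the database gene.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    gene: str
    identity: float   # percent identity of the alignment
    hsp_length: int   # length of the aligned region (HSP)
    gene_length: int  # full length of the database gene

    @property
    def coverage(self):
        """Fraction of the database gene covered by the alignment."""
        return self.hsp_length / self.gene_length


def passes(hit, min_identity=90.0, min_coverage=0.6):
    """Report a hit only if it meets both thresholds (placeholder values)."""
    return hit.identity >= min_identity and hit.coverage >= min_coverage


# Hypothetical hits, not taken from the workshop results.
hits = [
    Hit("geneA", 100.0, 861, 861),  # perfect, full-length match
    Hit("geneB", 99.8, 300, 630),   # high identity, but covers under half the gene
    Hit("geneC", 85.0, 500, 500),   # full length, but identity too low
]
for hit in hits:
    print(hit.gene, round(hit.coverage, 2), passes(hit))
```

Lowering either threshold reports more hits, at the cost of including more distant or more fragmentary matches.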

2. Which resistance genes are reported? To get an overview, you might want to fill out this table, available as PDF or DOCX (a printed version can be found after the printed instructions for Exercise 1).

3. Find examples of genes in the input sequence that are 100% identical to a gene in the ResFinder database, and of genes that are less than 100% identical.

4. Do any of the genes found in the input genomes only partially cover the corresponding gene in the ResFinder database?

Please note that there is a bug in how the alignments (seen by clicking "extended output") are displayed on the webpage. This is obvious when looking at the alignment for catB4. Although the alignment (HSP) is listed as much shorter than the length of the database gene (Length), the alignment shows them as the same length. To see the true alignment, download the file "results.txt" by clicking on the button labelled "Results as text", which can be found just above the "extended output" button.

5. Which resistance genes differ between the three K. pneumoniae isolates?
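Once you have filled out the table, comparing the profiles amounts to a set comparison: which genes are shared by all isolates, and which are found in only some of them. The sketch below shows one way to do this. The function name, the isolate labels, and the gene names are placeholders, so substitute the genes you actually observed.

```python
def compare_profiles(profiles):
    """Split gene profiles into genes shared by all isolates and the rest.

    profiles maps an isolate name to a list of reported gene names.
    Returns (shared_genes, genes_not_shared_per_isolate).
    """
    if not profiles:
        return [], {}
    gene_sets = [set(genes) for genes in profiles.values()]
    shared = set.intersection(*gene_sets)
    variable = {
        name: sorted(set(genes) - shared)
        for name, genes in profiles.items()
    }
    return sorted(shared), variable


# Placeholder profiles, not the workshop results.
profiles = {
    "isolate_1": ["geneA", "geneB"],
    "isolate_2": ["geneA", "geneC"],
    "isolate_3": ["geneA"],
}
shared, variable = compare_profiles(profiles)
print("Shared by all:", shared)
print("Not shared by all:", variable)
```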

Examine the presence of plasmids in the three isolates using PlasmidFinder (https://cge.cbs.dtu.dk/services/PlasmidFinder/). Select the database "Enterobacteriaceae". You may keep the default settings for %ID threshold and minimum length.

6. Looking at the results from PlasmidFinder, can you explain some of the differences in resistance genes among the three isolates?

Try running the three isolates through KmerResistance (https://cge.cbs.dtu.dk/services/KmerResistance/), a more sensitive method than ResFinder for identification of acquired antibiotic resistance genes. 

7. Did the results from KmerResistance give you any new information with regards to the aac(3)-IIa gene?

Please note that the optimal input for KmerResistance is raw reads, in which case the reads are mapped to the database genes. If raw reads are uploaded, KmerResistance also outputs information on depth of coverage. ResFinder is BLAST-based and works on draft genomes. Its results will thus be identical regardless of whether raw reads or an assembled draft genome is uploaded. If raw reads are uploaded, a draft genome is first assembled using Velvet, after which BLAST is performed.

8. Have a closer look at the alignment for blaLen12 in Kleb10. Does it help to explain why this isolate is the only one reported to harbour this gene?

Tomorrow, we will use high-resolution methods to generate phylogenetic trees. Today, we will settle for examining the Multilocus Sequence Type (ST) to determine whether the different isolates are likely to be part of an outbreak.

Determine the ST of the three isolates using the MLST tool (https://cge.cbs.dtu.dk//services/MLST/). Remember to select the appropriate MLST scheme according to the species.

9. What is the ST of the three isolates? Is there anything you need to be concerned about, e.g. loci that are not 100% identical to the MLST alleles in the database, or maybe even partial loci?
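Conceptually, an ST is looked up from the combination of allele numbers found at each locus of the scheme, and imperfect matches are what you should look out for. The sketch below illustrates both ideas. It assumes the seven loci usually listed for the K. pneumoniae scheme (verify them against the scheme you selected), while the allele numbers, the "toy-ST" labels, and the function name are invented for illustration and do not correspond to real database entries.

```python
# Loci assumed for the K. pneumoniae scheme; verify against the scheme you selected.
LOCI = ("gapA", "infB", "mdh", "pgi", "phoE", "rpoB", "tonB")

# Invented allele profiles and labels, for illustration only.
TOY_PROFILES = {
    (1, 2, 3, 4, 5, 6, 7): "toy-ST-1",
    (1, 2, 3, 4, 5, 6, 8): "toy-ST-2",
}


def call_st(locus_hits, profiles=TOY_PROFILES):
    """Look up an ST and list the loci that deserve a closer look.

    locus_hits maps a locus to (allele_number, identity_pct, coverage_pct).
    Returns (st_or_None, loci_with_warnings).
    """
    warnings = [
        locus for locus in LOCI
        if locus not in locus_hits
        or locus_hits[locus][1] < 100
        or locus_hits[locus][2] < 100
    ]
    if any(locus not in locus_hits for locus in LOCI):
        return None, warnings  # no call possible with a missing locus
    profile = tuple(locus_hits[locus][0] for locus in LOCI)
    # An ST returned together with warnings is based on nearest alleles only.
    return profiles.get(profile), warnings


perfect = {locus: (i, 100.0, 100.0) for i, locus in enumerate(LOCI, start=1)}
print(call_st(perfect))

imperfect = dict(perfect, tonB=(7, 99.5, 100.0))
print(call_st(imperfect))
```

An ST reported together with warnings should be treated with caution, since a locus that is not 100% identical to any known allele may represent a novel allele and therefore a different ST.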


Back at the hospital, major cleaning and disinfection efforts are initiated in an attempt to stop the outbreak. However, the following week another (newly admitted) patient displays symptoms of urinary tract infection. DNA from the bacteria isolated from this patient is also sequenced, and the reads are assembled into a draft genome (the file Unknown_draft_genome.fsa).

11. Is the new case likely to be part of the previous outbreak?

If you have time to spare before the wrap-up, you might want to start on Exercise 2. The first part is about using ResFinder-3.0 for identifying point mutations in chromosomal genes causing antibiotic resistance.