VTEC diagnostics

In this exercise we will work with an isolate from the German E. coli outbreak in 2011. We will identify its serotype and any antibiotic resistance caused by chromosomal point mutations. Further, we will look for the presence of virulence factors to determine the pathovar (pathotype) of the isolate and speculate on why this strain was particularly virulent.


We will be working with the following files, which both contain draft genomes.

Exercise2/escherichia_coli_c227-11.fsa (from the paper by Grad et al. 2012)

Exercise2/Ec4.fsa (to be used as a positive control when generating your own database)


Examine the isolate in the file escherichia_coli_c227-11.fsa for the presence of chromosomal point mutations causing antibiotic resistance using ResFinder (https://cge.cbs.dtu.dk/services/ResFinder/). Make sure to select E. coli as the species. Search for "all mutations, known and unknown". 

Again, if you don't want to wait for the analysis to finish, take a look at the page with links to all results.

1. Did you identify any known chromosomal point mutations and which type of antibiotics do they confer resistance towards?

2. For Unknown Mutations the resistances they confer are not listed. Why might that be?

3. For some of the Unknown Mutations the resulting amino acid change is not listed. Why might that be?

Predict the serotype of your isolates using SerotypeFinder. Leave the settings for %ID threshold and minimum length as they are. 

4. Do you identify the expected serotype? 

E. coli strains are categorised as different pathovars/pathotypes based on the presence of specific virulence genes/markers. A simplified categorisation could be:

Non-pathogenic: Few or no obvious virulence factors.

UPEC: Adhesion factors needed to avoid being flushed out of the urinary tract. Siderophore proteins, such as the one encoded by the iroN gene, can also be relevant, as they help acquire iron in the iron-limited environment of the bladder. Presence of Pap (P) fimbriae (papG adhesin) can be a sign of increased virulence, as these fimbriae are associated with progression of a urinary tract infection to pyelonephritis.

EHEC (VTEC, STEC): A hallmark virulence factor is the Shiga toxin (Stx). Stx consists of an A subunit (encoded by stxA) and B subunits (encoded by stxB). There are two subgroups of Stx: Stx1 and Stx2. Stx2 is more prevalent than Stx1 in haemorrhagic colitis and haemolytic uraemic syndrome (HUS).

EAEC: Contains a 100-kb pAA plasmid with many different virulence factors (especially aggregative adherence fimbriae (AAFs), mucinases such as the one encoded by the pic gene, and toxins such as those encoded by the pet and astA genes). A regulator encoded by the aggR gene, also located on the pAA plasmid, coordinates the expression of these virulence factors. Therefore, detection of aggR is usually a good marker for EAEC.

ETEC: This pathotype is known for its production of heat-stable toxin (ST) or heat-labile toxin (LT). The former can be encoded by the sta1 or stb genes and the latter by ltcA.
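
The simplified categorisation above can be sketched as a small lookup, which may help when filling in the overview table by hand. This is only a rough illustration: the function name candidate_pathotypes and the marker table are made up for this example, the gene spellings are assumptions (VirulenceFinder may report variants such as "stx2A" rather than a bare "stx2", so matching is done on prefixes), and a single marker gene is not enough on its own to classify a real isolate.

```python
# Marker genes per pathotype, taken from the simplified categorisation above.
# The spellings are assumptions; tool output may use variant names
# (e.g. "stx2A"), so genes are matched on their prefix, ignoring case.
PATHOTYPE_MARKERS = {
    "UPEC": {"papG", "iroN"},
    "EHEC": {"stx1", "stx2"},
    "EAEC": {"aggR"},
    "ETEC": {"sta1", "stb", "ltcA"},
}


def candidate_pathotypes(found_genes):
    """Return {pathotype: [matching genes]} for every pathotype with a hit."""
    hits = {}
    for pathotype, markers in PATHOTYPE_MARKERS.items():
        matched = sorted(
            gene
            for gene in found_genes
            if any(gene.lower().startswith(m.lower()) for m in markers)
        )
        if matched:
            hits[pathotype] = matched
    return hits


# Hypothetical gene list, not the result for the exercise isolate.
print(candidate_pathotypes({"ltcA", "gad"}))  # {'ETEC': ['ltcA']}
```

An isolate with no hits would correspond to the "Non-pathogenic" row, and an isolate may well match more than one pathotype.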

Identify the virulence genes of your isolates using VirulenceFinder. Select "Escherichia coli" as the species. Leave the settings for %ID threshold and minimum length as they are. 

Fill in the table to get a better overview of the relevant virulence factors: PDF or DOC (a printed version can be found after the printed instructions for Exercise 2).

5. Which pathotype(s) do the isolates belong to? Do the results offer a possible reason as to why the strain was so virulent?

You might want to make sure that the genes not found in the isolate are indeed part of the VirulenceFinder database. To determine exactly which virulence genes are part of the VirulenceFinder database (or one of the other databases used by the CGE methods), you have to be able to navigate a BitBucket repository, which is where all CGE databases and software are stored. Click HERE to learn more about BitBucket repositories. The VirulenceFinder DB is stored here: https://bitbucket.org/genomicepidemiology/virulencefinder_db/.

6. Can you identify a virulence gene that is described above as being important for the classification of pathovars, is not found in the isolate, and is in fact not in the VirulenceFinder database? Hint: Examine the file virulence_ecoli.fsa using, for instance, Cmd+F if you work on a Mac or Ctrl+F if you work on a Windows computer.
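
As an alternative to searching by hand, the check can be scripted. The sketch below assumes that each FASTA header in the database starts with the gene name followed by a colon (for example ">gene:variant:accession"); that layout is an assumption, so verify it against the actual file. The function names and the toy records are made up for illustration.

```python
def gene_names(fasta_lines):
    """Collect gene names from FASTA header lines.

    Assumes headers look like ">gene:variant:accession"; the gene name is
    the first whitespace-free token after ">", cut at the first ":".
    """
    names = set()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        header = line[1:].strip()
        if header:
            names.add(header.split(None, 1)[0].split(":", 1)[0])
    return names


def missing_genes(fasta_lines, wanted):
    """Return the genes in `wanted` that have no header (case-insensitive)."""
    present = {name.lower() for name in gene_names(fasta_lines)}
    return [gene for gene in wanted if gene.lower() not in present]


# Toy stand-in for the database; accessions and sequences are made up.
example = [">aggR:1:ACC0001\n", "ATGAAA\n", ">iroN:2:ACC0002\n", "ATGCCC\n"]
print(missing_genes(example, ["aggR", "iroN", "hypX"]))  # ['hypX']
```

With the real database, an open file handle can be passed in place of the list, for example by opening virulence_ecoli.fsa with open() and giving the handle to missing_genes together with the marker genes you want to check.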

Hopefully you just found that papG is not part of the database. Further, let's assume you have a theory that presence of the dsbD gene will make a strain more virulent. The dsbD gene is also not part of the VirulenceFinder database. As we cannot wait for the curator to update the database, you will need to construct your own database containing these genes. You can find the papG sequence here: papG.txt. And the dsbD sequence here: dsbD.txt.

Your initial task is to convert the papG and dsbD DNA sequences into a FASTA database readable by MyDBFinder. All you need to do is copy the two sequences into a text file (e.g. using Sublime Text), add a FASTA header above each sequence (e.g. ">papG" and ">dsbD"), and save the text file to a location where you can find it again.
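
If you prefer to script this step instead of using a text editor, a minimal sketch could look like the one below. It assumes the downloaded files hold raw DNA sequence only (possibly split over several lines); the function name to_fasta, the 60-character line width, and the short sequences are placeholders for illustration, not the real papG and dsbD genes.

```python
def to_fasta(records, width=60):
    """Format {name: sequence} as FASTA text, wrapping lines at `width`."""
    lines = []
    for name, seq in records.items():
        cleaned = "".join(seq.split()).upper()  # drop spaces and line breaks
        if not cleaned:
            raise ValueError(f"empty sequence for {name}")
        lines.append(f">{name}")
        lines.extend(
            cleaned[i:i + width] for i in range(0, len(cleaned), width)
        )
    return "\n".join(lines) + "\n"


# Placeholder sequences only; paste the real ones from papG.txt and dsbD.txt.
demo = {"papG": "atgaaa\naaatgg", "dsbD": "ATGGCTCAACGC"}
print(to_fasta(demo), end="")
```

The returned text can then be written to a file of your choice (the name is up to you) and uploaded as the user database.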

When you have prepared your database, run an isolate known to carry the papG and dsbD genes through MyDBFinder as a positive control to verify that the database has been constructed correctly. The isolate that you should use as positive control is Ec4. It contains two papG genes and one dsbD gene. The database with the papG and dsbD genes should be uploaded under "Upload user database (DNA sequences in FASTA format)". Leave all settings as they are.

Once you have confirmed that your database works properly (identifies two papG genes and one dsbD gene in the Ec4 isolate), run the E. coli isolate through MyDBFinder in a similar way.

7. Does your E. coli isolate contain papG or dsbD?