VTEC diagnostics

The purpose of this exercise is to identify the serotype and virulence factors of the E. coli strains that we worked with yesterday, and to determine which pathovar they belong to.


All groups will be working with the E. coli data as described in Exercise2. If you are in the MRSA1 group, work with the Ecoli1 data; if you are in the MRSA2 group, work with the Ecoli2 data.


Predict the serotype of your isolates using SerotypeFinder. Leave the settings for %ID threshold and minimum length as they are. Note the links to Instructions on how to use the service, and Output on how to interpret the results.

Again, if you don't want to wait for the results to finish, take a look at the page with links to all results.

Does the identified serotype correspond to the serotype according to Grad et al 2012? If not, what could be the explanation?

Identify the virulence genes of your isolates using VirulenceFinder. Leave the settings for %ID threshold and minimum length as they are. Again, note the links to Instructions on how to use the service, and Output on how to interpret the results.

Do the identified virulence genes correspond to the ones mentioned in Grad et al 2012? You might want to fill in a table for a better overview: PDF or XLSX

E. coli strains are categorized as different pathotypes/pathovars based on the presence of specific virulence genes/markers. A simplified categorization could be:

Non-pathogenic: Few or no obvious virulence factors.

UPEC: Adhesion factors needed to avoid being flushed away from the urinary tract. Siderophore proteins, such as the one encoded by the iroN gene for iron chelation in urine, can also be relevant because the bladder is an iron-limiting environment. Presence of Pap (P) fimbriae (papG adhesin) can be a sign of increased virulence, as these fimbriae are associated with progression of a urinary tract infection into pyelonephritis.

VTEC, STEC (EHEC): A hallmark virulence factor is the Shiga toxin, which is encoded by the stx1 and stx2 genes. The variants stx2a and stx2c in particular have been associated with serious illness.

EAEC: Many different virulence factors are believed to be responsible for the EAEC phenotype, especially aggregative adherence fimbriae (AAFs) located on a 100-kb pAA plasmid, mucinases such as the one encoded by the pic gene, and toxins such as those encoded by the pet and astA genes. However, it has recently been found that the regulator encoded by the aggR gene, which is also located on the pAA plasmid, coordinates these virulence factors. Therefore, detection of the aggR gene is a good marker for EAEC.

ETEC: This pathotype is known for its production of Heat-Stable Toxin (ST) or Heat-Labile Toxin (LT). The former can be encoded by the sta1 or stb genes and the latter by the elt or ltcA genes.
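The simplified categorization above can be sketched as a small lookup, which may help when you compare your VirulenceFinder results against the list. This is only an illustration: the function name, the prefix matching of variant names (e.g. "stx2a" counted as "stx2"), and the use of papG and iroN as stand-alone UPEC indicators are assumptions made for this sketch, not part of any CGE tool.

```python
# Marker genes per pathotype, copied from the simplified list above.
# Treating papG/iroN alone as UPEC indicators is a simplification.
MARKERS = {
    "VTEC/STEC": ("stx1", "stx2"),
    "EAEC": ("aggR",),
    "ETEC": ("sta1", "stb", "elt", "ltcA"),
    "UPEC": ("papG", "iroN"),
}


def suggest_pathotypes(detected_genes):
    """Return the pathotypes whose marker genes occur among the detected genes.

    A detected gene matches a marker if its name starts with the marker name,
    so variants such as "stx2a" count as "stx2" (an assumption of this sketch).
    """
    suggestions = []
    for pathotype, markers in MARKERS.items():
        if any(gene.startswith(marker)
               for gene in detected_genes
               for marker in markers):
            suggestions.append(pathotype)
    # No listed marker found: fall back to the first category in the list.
    return suggestions or ["Non-pathogenic"]


if __name__ == "__main__":
    # Hypothetical gene list, not results from the exercise isolates.
    print(suggest_pathotypes(["stx2a", "aggR", "iha"]))
```

A strain can match more than one category, which is why the function returns a list rather than a single label.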

To examine which virulence genes are part of the VirulenceFinder database (or one of the other databases used by the CGE methods), you have to be able to navigate a BitBucket repository, which is where all CGE databases and software are stored. Click this link to learn more about BitBucket repositories: PDF. The VirulenceFinder DB is stored here: https://bitbucket.org/genomicepidemiology/virulencefinder_db/.

Examining the VirulenceFinder database, you will see that one of the virulence factors mentioned above (papG) is not part of the database. Further, let's assume you have a theory that presence of the dsbD gene will make a strain more virulent. The dsbD gene is also not part of the VirulenceFinder database. As we cannot wait for the curator to update the database, you will need to construct your own database containing these genes. You can find the papG sequence here: papG.txt. And the dsbD sequence here: dsbD.txt.

Your initial task is to convert the papG and dsbD DNA sequences into a FASTA database readable by MyDBFinder. All you need to do is copy the two sequences into a text file (e.g. using Sublime Text 2), add a FASTA header on the line above each sequence (e.g. ">papG" and ">dsbD"), and save the text file to a location where you can find it again.
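If you prefer to build the file with a script instead of a text editor, the same steps can be sketched in Python. The function name and the 60-character line width are choices made for this sketch, and the sequences in the usage note are placeholders; paste in the real contents of papG.txt and dsbD.txt.

```python
def format_fasta(records, width=60):
    """Format a {header: sequence} mapping as FASTA text.

    Whitespace and line breaks inside each sequence are removed, the
    sequence is upper-cased, and it is wrapped at `width` characters.
    """
    lines = []
    for name, sequence in records.items():
        cleaned = "".join(sequence.split()).upper()
        lines.append(f">{name}")
        for start in range(0, len(cleaned), width):
            lines.append(cleaned[start:start + width])
    return "\n".join(lines) + "\n"
```

To write the database, call for example `open("my_db.fasta", "w").write(format_fasta({"papG": papG_seq, "dsbD": dsbD_seq}))`, where `papG_seq` and `dsbD_seq` hold the sequences you copied and `my_db.fasta` is a hypothetical file name.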

When you have prepared your database, run an isolate known to carry two papG genes and one dsbD gene through MyDBFinder to verify that the database has been constructed correctly. The isolate that you should use as a positive control can be downloaded here: Ec4. The database with the papG and dsbD genes should be uploaded under "Upload user database (DNA sequences in FASTA format)". Leave settings as they are.

Once you have confirmed that your database works properly (identifies two papG genes and one dsbD gene in the Ec4 isolate), run the five E. coli isolates through MyDBFinder in a similar way.
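The positive-control check can be written down as a simple count comparison. This sketch assumes you have typed the gene names of the reported hits into a list by hand; it does not parse MyDBFinder output, and the function name is hypothetical.

```python
from collections import Counter

# Expected hits in the Ec4 positive control, as stated in the exercise.
EXPECTED_EC4 = {"papG": 2, "dsbD": 1}


def control_passes(hit_genes, expected=EXPECTED_EC4):
    """Return True if each expected gene was found the expected number of times."""
    counts = Counter(hit_genes)
    return all(counts[gene] == n for gene, n in expected.items())
```

For example, `control_passes(["papG", "papG", "dsbD"])` returns True, whereas a result with only one papG hit would return False and suggest that the database file needs another look.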

Based on your results from VirulenceFinder and MyDBFinder, which pathovar do you suggest the five E. coli isolates belong to? Were you able to find the dsbD gene in the isolates?