Chapter 3.7. Multiple-locus variable-number tandem repeat analysis (MLVA) for bacterial typing


Chan Shiao Ee, Asma Ismail, Kirnpal Kaur Banga Singh

Art work
Uli Reinhardt


The advent of whole genome sequences for multiple isolates from a single species has opened the gateway to an ultimate epidemiological typing platform that is capable of providing the entire genetic blueprint of a pathogen; the genetic variations of the pathogen can be identified, which is useful for elucidating the relatedness of isolates [1]. Analysis of sequenced genomes has revealed an abundance of repetitive DNA motifs, which are either located in one genomic area or distributed throughout the entire genome [2,3]. DNA tandem repeats (TRs) are inter- or intra-genic nucleotide sequences that are direct head-to-tail repeats, and can be categorized according to their repeat unit size [2,3]. DNA repeat motifs are defined as microsatellites, minisatellites, and macrosatellites for repeat unit sizes of 1 to 9, 10 to 100, and more than 100 base pairs (bp), respectively. Moreover, TRs are also classified into identical TRs and degenerated TRs based on the conservation of the repeat nucleotide sequence [3]. Degenerated TRs are repetitive DNA sequences that have variations arising from point mutations. In contrast, homogeneous repetitive DNA sequences are deemed identical TRs.
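The two classification schemes described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the size boundaries are taken directly from the text (1 to 9 bp, 10 to 100 bp, more than 100 bp).

```python
from typing import List


def classify_by_unit_size(unit_length_bp: int) -> str:
    """Name the repeat class from the length of one repeat unit (in bp)."""
    if unit_length_bp < 1:
        raise ValueError("repeat unit length must be at least 1 bp")
    if unit_length_bp <= 9:
        return "microsatellite"
    if unit_length_bp <= 100:
        return "minisatellite"
    return "macrosatellite"


def classify_by_conservation(repeat_units: List[str]) -> str:
    """Identical TRs have one unit sequence; degenerated TRs have variants."""
    unique_units = {unit.upper() for unit in repeat_units}
    return "identical" if len(unique_units) == 1 else "degenerated"
```

For example, an array made of the units `ACGTTG`, `ACGTTG`, `ACGTCG` would be a degenerated microsatellite, because the 6 bp unit carries a point mutation in its third copy.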

Genetic events, such as DNA replication slippage and recombination, can generate TR variability, as observed among different strains of the same species [2,4]. Regions containing TRs are potentially hypermutable owing to DNA replication errors such as base pair substitutions, insertions, deletions, or other mutations in bacteria [5]. Variable repeats abrogate or stimulate gene expression in bacterial lipooligosaccharide biosynthesis and the regulation of virulence genes, thereby affecting genome function in microorganisms [6]. Repetitive DNA loci at which the number of repeat units varies among strains are called variable-number tandem repeats (VNTRs) [6].

Multiple-locus variable-number tandem repeat analysis (MLVA) determines differences in the number of repeat units at multiple VNTR loci in the microbial genome. These differences can be investigated by polymerase chain reaction (PCR) amplification of the tandem repeats and their bordering consensus regions, followed by amplicon separation and length measurement. The number of repeat units can be deduced from the measured amplicon size. Subsequently, genotypes can be assigned according to the gain or loss of discrete repeats, which gives better insight into the genetic relationships between bacterial strains [7,8].

MLVA has emerged as a sequence-based molecular typing tool that was developed to investigate a multitude of bacterial species, such as Acinetobacter baumannii [9], Bacillus anthracis [10], Enterococcus faecium [8], Escherichia coli [7], Listeria monocytogenes [11], methicillin-resistant Staphylococcus aureus [12], Neisseria meningitidis [13], Shigella species [14], and Salmonella enterica serovar Typhimurium [13]. The published schemes have been applied to the epidemiological analysis and surveillance of pathogens. Initially, MLVA was developed using agarose gel electrophoresis to estimate the number of repeat units [14]. More recently, more sophisticated technologies such as capillary electrophoresis and lab-on-a-chip systems have been used to improve MLVA and create a rapid, simple, low-cost, and high-throughput genotyping platform [11,15].

Pulsed-field gel electrophoresis (PFGE) is an electrophoretic technique used to separate large DNA molecules (10 kb to 10 Mb). The procedures involve preparation of bacterial cell plugs, macro-digestion with a rare cutter restriction enzyme, and separation of the digested fragments by PFGE. It has played an important role in molecular epidemiology and is considered the gold standard in many epidemiological studies of bacterial pathogens that cause infectious diseases, especially in the investigation of foodborne outbreaks [16,17].

A gold standard microbial typing technique is characterized by optimal typeability, high reproducibility, adequate stability, unprecedented resolving power, and ease of performance and accessibility [1]. Moreover, an assessment of method feasibility should consider speed, throughput, cost, ease of use, objectivity, versatility, and portability, and should also emphasize the need for a subtyping method that can be used successfully for inter-laboratory surveillance [18]. However, no single typing method excels in all the criteria described above.

PFGE is the most widely used typing method. It is well known for its high discriminatory power, typeability, reproducibility, and versatility, but has medium robustness and low portability, objectivity, and throughput [1,19]. By comparison, MLVA performs well as an overall method for subtyping bacteria. It scores well in discriminatory power, robustness, portability, objectivity, and throughput, but scores poorly in versatility because the developed protocols are species-specific [18,19]. MLVA is therefore used either as a complementary tool to PFGE or as an alternative to it in outbreak investigations and molecular surveillance [11,20].

A rapid, inexpensive typing tool that is simple to use and can provide good inter-laboratory reproducibility with high discriminatory power and meaningful epidemiological inference has obvious implications for effective patient management and the implementation of infection control. MLVA is an ideal DNA fingerprinting method because it is reproducible, simple to use, produces rapid results, and has strong result storing and sharing capabilities compared with other epidemiological typing methods. Furthermore, the MLVA assay does not require a cold chain and is ready to use, which makes bacterial typing more accessible, especially in remote or resource-limited areas. There have been several reports on the development of portable PCR systems that can be used in the field with a battery-operated portable analyzer. These technologies would provide further advantages for the future use of MLVA as a practical typing scheme, especially in resource-limited laboratories, because no expensive instruments are required, the turnaround time for the results is short, and the cost of typing each bacterial strain is low.

Figure 1. Schematic representation of different types of DNA tandem repeats. (1) Definition of DNA tandem repeats based on the length of the repeat unit. (2) Type of repeat unit sequences based on conservation of sequence.

Workflow of MLVA

An MLVA assay is based on the detection of variable tandem repeat units at a predefined set of VNTR loci in microbial genomes. Generally, VNTR-based assays involve DNA sample preparation, followed by VNTR amplification and repeat unit estimation, and subsequent data analysis and storage. Thermolysates or pure DNA extracted using commercially available DNA extraction kits can be used for the experiment. These DNA samples are then subjected to PCR amplification with specific primers designed in the flanking regions of the VNTRs. The length of the amplicons is measured by gel electrophoresis or capillary electrophoresis. With gel electrophoresis, the amplicon sizes can be estimated from a captured gel image using dedicated software, which is usually provided with the image documentation system. Capillary electrophoresis is another approach for separating and assigning sizes to various VNTR loci. The sizes of amplified VNTRs can be measured individually, or a combination of VNTRs can be investigated in a single run of capillary electrophoresis using multi-fluorescent labels on VNTR loci. The results obtained from either approach are imported into analysis software for the assignment of allele numbers. A string of alleles is created from the number of repeat units for each locus, forming the MLVA profile, and is eventually used to assign an MLVA type. The analyzed integer-based results with MLVA codes can be stored in a web-based electronic database, and data can be exchanged between laboratories.

Figure 2. An overview of multiple-locus variable-number tandem repeat analysis (MLVA) workflow in subtyping a pathogen.

Procedures involved in developing an MLVA assay

In silico identification of tandem repeats (TRs)

The availability of complete annotated genome sequences in nucleotide databases enables the location and identification of tandem repeat loci by querying the sequences with dedicated software. Free online resources, including the Tandem Repeats Database [21] and the Microbial Tandem Repeats Database [22], have greatly helped in developing and performing tandem repeat-based genotyping.

The Tandem Repeats Database is a public repository of information on tandem repeats, and contains a variety of tools for repeat analysis, such as the Tandem Repeats Finder program, query and filter capabilities, repeat clustering, polymorphism prediction, PCR primer selection, and data visualization. The Microbial Tandem Repeats Database comprises four sections with different functionalities: the Tandem Repeats Database enables tandem repeat identification across the queried genome sequence; the Strain Comparison Page identifies and compares tandem repeats between different genome sequences from the same species; the Blast in the Tandem Repeats Database facilitates the search for a known tandem repeat and prediction of amplification product sizes, as well as verification of primer specificity; and the Bacterial Genotyping Page is a service for strain genotype comparisons where users can compare a produced genotype against existing genotype data available in the database. In essence, these programs generate an output comprising information such as repeat position, repeat unit length, number of repeat units, and nucleotide composition.

Selection of potential variable-number tandem repeat (VNTR) loci

Short microsatellite-type repeats are hypervariable in copy number, and therefore provide clear, epidemiologically informative DNA polymorphisms [4,23]. However, repeat units shorter than 5 bp should not be selected owing to the limitations of reproducible size determination by either gel-based or capillary electrophoresis platforms [18]. Loci can be selected for evaluation if they fulfill the following criteria: a repeat unit length of 5 bp or longer; repeated at least three times; 80% internal conservation; and 100% conserved sequences flanking the loci [18,24].
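The selection criteria above lend themselves to a simple filter over the output of a tandem repeat search. The sketch below is illustrative only: the `CandidateLocus` record and its field names are hypothetical and do not correspond to the output format of any particular program, and "80% internal conservation" is interpreted here as "at least 80%", which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CandidateLocus:
    """Hypothetical record for one tandem repeat found in silico."""
    name: str
    unit_length_bp: int
    copy_number: float
    internal_conservation_pct: float
    flank_conservation_pct: float


def passes_selection(locus: CandidateLocus) -> bool:
    """Apply the four criteria listed in the text."""
    return (
        locus.unit_length_bp >= 5                   # unit of 5 bp or longer
        and locus.copy_number >= 3                  # repeated at least three times
        and locus.internal_conservation_pct >= 80   # assumed to mean "at least 80%"
        and locus.flank_conservation_pct >= 100     # fully conserved flanks
    )
```

A locus with a 6 bp unit repeated 7.5 times, 92% internal conservation, and fully conserved flanks would pass, whereas a 3 bp unit would be rejected regardless of its other properties.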

Design of primers

Specific primers can be designed with the assistance of primer design software, which enables the analysis of secondary structures (if present) such as hairpins or self- and cross-dimers that may be formed between the designed primers. Primers should be designed in the conserved sequences flanking the selected potential VNTR loci, at least 40 bp away from the first and last repeats [24]. Thus, sequencing of the flanking regions for each locus can be performed using the same primers. This is to ensure that the designed primers are located in the conserved sequences and to confirm the composition of the VNTR nucleotides. Moreover, it is important to ensure that the same annealing temperature can be used for all loci amplifications if the intention is to multiplex the loci in the same PCR reaction.
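Two of the checks mentioned above, primer placement and annealing temperature compatibility, can be sketched as follows. This is a rough illustration and not a substitute for primer design software: the coordinates and function names are hypothetical, the melting temperature uses the simple Wallace rule (2 °C per A/T and 4 °C per G/C), which is only a coarse approximation for short oligonucleotides, and the 5 °C tolerance is an arbitrary assumption.

```python
from typing import List


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def tm_compatible(primers: List[str], max_spread_c: int = 5) -> bool:
    """True if all primers fall within an assumed Tm window."""
    tms = [wallace_tm(p) for p in primers]
    return max(tms) - min(tms) <= max_spread_c


def flank_gap_ok(fwd_primer_end: int, array_start: int,
                 array_end: int, rev_primer_start: int,
                 min_gap_bp: int = 40) -> bool:
    """True if both primers lie at least min_gap_bp from the repeat array.

    All positions are genome coordinates on the same strand (hypothetical).
    """
    upstream_gap = array_start - fwd_primer_end
    downstream_gap = rev_primer_start - array_end
    return upstream_gap >= min_gap_bp and downstream_gap >= min_gap_bp
```

In practice the Tm estimate from the chosen primer design software should take precedence; the point here is only that every primer pair intended for the same multiplex reaction is compared against a common window.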

Assessment of VNTR candidates by PCR

The designed primers for each potential VNTR locus are assessed by monoplex PCR against a panel of representative, diverse genotypes that have been determined by other subtyping methods. The PCR conditions are optimized to ensure specific and reliable amplification for all selected potential VNTR loci. Where a locus amplifies poorly, its primers should be redesigned or the locus should be excluded from the panel. Furthermore, potential VNTR loci that show no or minimal diversity should also be excluded after the initial screening of representative strains. VNTR loci that yield a range of amplicon sizes, and therefore discriminate among the screened representative strains, are then further assessed with a larger panel of strains. The included strains should comprise outbreak-related and epidemiologically unrelated isolates. Thereafter, representative alleles for each VNTR locus should be sequenced to verify the number of repeat units in the repeat array. This is to ensure that size differences observed between different strains are due to differences in the number of repeat units and not because of other genetic events [18].
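The initial screening decision described above can be expressed as a small triage function. The sketch assumes that amplicon sizes for the representative panel have been collected per locus, with `None` marking a strain that failed to amplify; the data layout, the verdict wording, and the threshold of two distinct sizes for "minimal diversity" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def screen_loci(panel_sizes: Dict[str, List[Optional[int]]],
                min_alleles: int = 2) -> Dict[str, str]:
    """Return a verdict per locus from amplicon sizes across the panel."""
    verdicts = {}
    for locus, sizes in panel_sizes.items():
        observed = {s for s in sizes if s is not None}
        if any(s is None for s in sizes):
            # Poor amplification: redesign the primers or drop the locus.
            verdicts[locus] = "redesign or exclude"
        elif len(observed) < min_alleles:
            # Every strain gave the same size: no discrimination.
            verdicts[locus] = "exclude"
        else:
            verdicts[locus] = "keep for larger panel"
    return verdicts
```

Loci marked "keep for larger panel" would then go on to the broader strain collection and to sequencing of representative alleles.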

MLVA PCR amplification

Amplification of VNTR loci can be performed individually with a pair of primers for a single locus, or multiplexed with more than one pair of primers to amplify more than one locus in a single PCR reaction. The amplified product sizes can be determined by ordinary agarose gel or capillary electrophoresis. Multiplexing the final selected loci allows more than one VNTR to be amplified simultaneously in a single PCR, which may be useful in low-resource settings. Various PCR parameters, such as the concentrations of the primers, Taq polymerase, MgCl2, and dNTPs, as well as the PCR cycling conditions, are optimized accordingly to obtain a successful result. However, Nadon et al. [18] have suggested that no more than four or five VNTRs should be amplified simultaneously in a single PCR to ensure the robustness of the assay.

Following electrophoresis, amplified product sizes can be estimated from a digital image of the agarose gel using dedicated software (usually provided with the image documentation system). Besides ordinary agarose gel electrophoresis, capillary electrophoresis, typically run on an automated DNA sequencer, is an alternative for measuring the length of DNA fragments. MLVA amplicons can be typed individually, or several VNTR loci can be investigated using multi-fluorescent labels. At least four different fluorescent labels can be applied to a single run of capillary electrophoresis. This facilitates the analysis of multiple loci in the same run, ultimately reducing the cost of performing the test. VNTRs with overlapping fragment sizes can be differentiated easily with different fluorescent labels. Conversely, VNTRs with well-defined, non-overlapping size ranges can share the same fluorescent label in a single run, provided that the primers are designed carefully to give distinct expected sizes.
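The rule that loci sharing a fluorescent label must have non-overlapping size ranges can be checked mechanically when planning a run. The following sketch assumes each locus is described by a dye name and an expected minimum and maximum amplicon size; the locus names, dye assignments, and size ranges in the usage example are invented for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def find_dye_conflicts(
    loci: Dict[str, Tuple[str, int, int]]
) -> List[Tuple[str, str]]:
    """Return pairs of loci that share a dye and overlap in expected size.

    loci maps a locus name to (dye, min_size_bp, max_size_bp).
    """
    conflicts = []
    for (a, (dye_a, lo_a, hi_a)), (b, (dye_b, lo_b, hi_b)) in combinations(
        sorted(loci.items()), 2
    ):
        # Two closed intervals overlap when each starts before the other ends.
        if dye_a == dye_b and lo_a <= hi_b and lo_b <= hi_a:
            conflicts.append((a, b))
    return conflicts


plan = {
    "VNTR1": ("FAM", 100, 200),
    "VNTR2": ("FAM", 250, 400),
    "VNTR3": ("HEX", 150, 300),
    "VNTR4": ("FAM", 180, 260),
}
print(find_dye_conflicts(plan))  # [('VNTR1', 'VNTR4'), ('VNTR2', 'VNTR4')]
```

In this invented plan, VNTR4 would need either a different label or redesigned primers, whereas VNTR1 and VNTR2 can safely share one label.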

Evaluation of the MLVA assay

Once a prototype MLVA protocol has been developed, its robustness, reproducibility, and discriminatory power must be evaluated [18]. A larger number of isolates from well-characterized outbreak and sporadic strains should be used to assess the robustness of the assay. Furthermore, the reproducibility of the protocol and the in vitro stability of the VNTR loci should also be assessed through a series of passages using the same strains. Moreover, a comparison between the newly developed MLVA scheme and the gold standard method should be carried out. The discriminatory power of the MLVA method with various combinations of VNTR loci can be assessed and compared by calculating Simpson’s Index of Diversity (D) and its 95% confidence intervals (95% CIs) [25,26].
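As a minimal sketch of the discriminatory power calculation, the code below computes Simpson's Index of Diversity in the form D = 1 - [1 / (N(N - 1))] Σ n_j(n_j - 1), where N is the number of strains and n_j is the number of strains of type j. The confidence interval uses the large-sample approximation D ± 2σ with σ² = (4/N)[Σ π_j³ - (Σ π_j²)²] and π_j = n_j/N, which is the form commonly attributed to [26]; it should be checked against the original publications before being relied upon. The function name and the clipping of the interval to the range 0 to 1 are choices made here for illustration.

```python
from collections import Counter
from math import sqrt
from typing import Hashable, Sequence, Tuple


def simpsons_diversity(types: Sequence[Hashable]) -> Tuple[float, float, float]:
    """Return (D, lower 95% CI, upper 95% CI) for a list of assigned types."""
    n = len(types)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = list(Counter(types).values())

    # Probability that two strains drawn without replacement differ in type.
    d = 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

    # Approximate variance from the type frequencies.
    freqs = [c / n for c in counts]
    sum_sq = sum(f ** 2 for f in freqs)
    sum_cube = sum(f ** 3 for f in freqs)
    variance = max((4.0 / n) * (sum_cube - sum_sq ** 2), 0.0)

    half_width = 2.0 * sqrt(variance)
    return d, max(0.0, d - half_width), min(1.0, d + half_width)
```

Calling the function on the MLVA types obtained with different combinations of loci, and on the types assigned to the same strains by the gold standard method, allows the schemes to be compared on the same panel.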

Nomenclatures of MLVA profiles and data analysis

The size of amplicons, the repeat unit lengths, and the number of repetitions can be determined in silico for known genomes that are available at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/), using the PCR primer BLAST tool from the Microbial Tandem Repeats Database. For strains without a genome sequence, the number of repeat units for each VNTR allele can be deduced by subtracting the flanking region size from the amplicon size and then dividing by the repeat unit length [24]. The allelic profile is defined as the number of repeat units at each VNTR locus that is included in the MLVA scheme. A null designation is assigned to strains for which no amplification is observed at a given locus.
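The conversion from amplicon size to repeat number described above is a single arithmetic step. The sketch below shows it with invented sizes; the function name is hypothetical, and rounding to the nearest whole number is an assumption, since schemes differ in how they report partial repeats.

```python
from typing import Optional


def assign_allele(amplicon_bp: Optional[float],
                  flank_bp: int,
                  unit_bp: int) -> Optional[int]:
    """Convert a measured amplicon size into a number of repeat units.

    flank_bp is the combined size of both flanking regions, including the
    primers. None is returned as the null designation when no amplicon
    was obtained.
    """
    if amplicon_bp is None:
        return None
    if unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    copies = (amplicon_bp - flank_bp) / unit_bp
    # Rounding to a whole number is an assumed convention.
    return round(copies)
```

For example, a 250 bp amplicon with 100 bp of combined flanking sequence and a 6 bp repeat unit gives (250 - 100) / 6 = 25 repeat units.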

An allele string is assigned to every strain based on the number of repeat units at each locus, converted from the fragment size. The repeat-unit data are then imported into software such as BioNumerics (Applied Maths, Belgium) for further epidemiologically relevant analysis.
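To illustrate how an allele string is assembled and compared, the following sketch joins the repeat numbers in a fixed locus order and counts the loci at which two profiles differ. The separator, the `NA` symbol for null alleles, and the locus names are arbitrary choices for the example; established schemes and analysis software define their own conventions.

```python
from typing import Dict, List, Optional

Profile = Dict[str, Optional[int]]


def allele_string(profile: Profile, locus_order: List[str],
                  null_symbol: str = "NA", sep: str = "-") -> str:
    """Join repeat numbers in a fixed locus order to form the MLVA profile."""
    return sep.join(
        str(profile[locus]) if profile.get(locus) is not None else null_symbol
        for locus in locus_order
    )


def loci_differing(p1: Profile, p2: Profile, locus_order: List[str]) -> int:
    """Count loci at which two profiles carry different alleles."""
    return sum(1 for locus in locus_order if p1.get(locus) != p2.get(locus))


order = ["locus_a", "locus_b", "locus_c"]
strain_1 = {"locus_a": 3, "locus_b": 12, "locus_c": None}
strain_2 = {"locus_a": 3, "locus_b": 11, "locus_c": None}
print(allele_string(strain_1, order))             # 3-12-NA
print(loci_differing(strain_1, strain_2, order))  # 1
```

Treating each locus as a category in this way, rather than weighting by the size of the repeat-number difference, is only one of several possible choices for comparing profiles.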

Figure 3. Flowchart showing the procedures involved in developing a multiple-locus variable-number tandem repeat analysis (MLVA) scheme for bacterial subtyping.

Advantages of MLVA

MLVA is a sequence-based microbial typing tool that generates definitive genetic data from each VNTR locus. The generated data can be uploaded and stored in a public MLVA database, thereby allowing the exchange of data between laboratories throughout the world. Moreover, the results are easy to interpret because they are presented numerically. The PCR-based MLVA assay is easy to perform, which makes it suitable for complete automation in the current era of technology, and can ultimately reduce any errors that may occur during the test. Furthermore, live bacterial isolates or high-quality genomic DNAs are not necessary to produce reliable and reproducible results. Therefore, the difficulties associated with transport and manipulation of pathogens can be avoided. The use of multi-fluorescent dyes facilitates the multiplexing of VNTRs in a single test followed by high-throughput electrophoresis. Thus, a few VNTRs can be amplified and analyzed simultaneously, which demonstrates a fast-typing, high-resolution, high-throughput, and cost-effective genotyping platform.


The MLVA assay, which detects variable repeat loci in microbial genomes, has emerged as a rapid, reproducible, and simple technique for bacterial genotyping, with discriminatory power that is comparable to that of PFGE. It has been used in epidemiological investigations and surveillance to monitor the distribution and spread of bacterial pathogens, as well as to implement public health interventions, if necessary. The MLVA assay can be used in routine clinical microbiology laboratories with standardized methodologies, nomenclature descriptions, and results interpretation. Moreover, the availability of internationally coordinated online databases greatly facilitates efficient data exchange between laboratories in various regions.


  1. A. Van Belkum et al., “Role of Genomic Typing in Taxonomy, Evolutionary Genetics, and Microbial Epidemiology,” Microbiology, vol. 14, no. 3, pp. 547–560, 2001.
  2. B. A. Lindstedt, “Multiple-locus variable number tandem repeats analysis for genetic fingerprinting of pathogenic bacteria,” Electrophoresis, vol. 26, no. 13, pp. 2567–2582, 2005.
  3. K. Zhou, A. Aertsen, and C. W. Michiels, “The role of variable DNA tandem repeats in bacterial adaptation,” FEMS Microbiol. Rev., vol. 38, no. 1, pp. 119–141, 2014.
  4. R. Gemayel, M. D. Vinces, M. Legendre, and K. J. Verstrepen, “Variable tandem repeats accelerate evolution of coding and regulatory sequences,” Annu. Rev. Genet., vol. 44, pp. 445–477, 2010.
  5. O. J. Rando and K. J. Verstrepen, “Timescales of Genetic and Epigenetic Inheritance,” Cell, vol. 128, no. 4, pp. 655–668, 2007.
  6. A. Van Belkum, “Tracing isolates of bacterial species by multilocus variable number of tandem repeat analysis (MLVA),” FEMS Immunol. Med. Microbiol., vol. 49, no. 1, pp. 22–27, 2007.
  7. A. C. Noller, M. C. Mcellistrem, G. F. Antonio, D. J. Boxrud, L. H. Harrison, and A. G. F. Pacheco, “Multilocus Variable-Number Tandem Repeat Analysis Distinguishes Outbreak and Sporadic Escherichia coli O157:H7 Isolates,” J. Clin. Microbiol., vol. 41, no. 12, pp. 5389–5397, 2003.
  8. J. Top, L. M. Schouls, M. J. M. Bonten, and R. J. L. Willems, “Multiple-Locus Variable-Number Tandem Repeat Analysis, a Novel Typing Scheme To Study the Genetic Relatedness and Epidemiology of Enterococcus faecium Isolates,” J. Clin. Microbiol., vol. 42, no. 10, pp. 4503–4511, 2004.
  9. C. Pourcel et al., “Identification of Variable-Number Tandem-Repeat (VNTR) sequences in Acinetobacter baumannii and interlaboratory validation of an optimized multiple-locus VNTR analysis typing scheme,” J. Clin. Microbiol., vol. 49, no. 2, pp. 539–548, 2011.
  10. P. Keim et al., “Multiple-locus variable-number tandem repeat analysis reveals genetic relationships within Bacillus anthracis,” J. Bacteriol., vol. 182, no. 10, pp. 2928–2936, 2000.
  11. V. Chenal-Francisque et al., “Optimized multilocus variable-number tandem-repeat analysis assay and its complementarity with pulsed-field gel electrophoresis and multilocus sequence typing for Listeria monocytogenes clone identification and surveillance,” J. Clin. Microbiol., vol. 51, no. 6, pp. 1868–1880, 2013.
  12. F. C. Tenover, R. R. Vaughn, L. K. McDougal, G. E. Fosheim, and J. E. McGowan, “Multiple-locus variable-number tandem-repeat assay analysis of methicillin-resistant Staphylococcus aureus strains,” J. Clin. Microbiol., vol. 45, no. 7, pp. 2215–2219, 2007.
  13. J.-C. Liao, C.-C. Li, and C.-S. Chiou, “Use of a multilocus variable-number tandem repeat analysis method for molecular subtyping and phylogenetic analysis of Neisseria meningitidis isolates,” BMC Microbiol., vol. 6, p. 44, 2006.
  14. O. Gorgé, S. Lopez, V. Hilaire, O. Lisanti, V. Ramisse, and G. Vergnaud, “Selection and validation of a multilocus variable-number tandem-repeat analysis panel for typing Shigella spp.,” J. Clin. Microbiol., vol. 46, no. 3, pp. 1026–1036, 2008.
  15. R. De Santis, A. Ciammaruconi, G. Faggioni, R. D’Amelio, C. Marianelli, and F. Lista, “Lab on a chip genotyping for Brucella spp. based on 15-loci multi locus VNTR analysis,” BMC Microbiol., vol. 9, p. 66, 2009.
  16. F. C. Tenover et al., “Interpreting chromosomal DNA restriction patterns produced by pulsed-field gel electrophoresis: Criteria for bacterial strain typing,” J. Clin. Microbiol., vol. 33, no. 9, pp. 2233–2239, 1995.
  17. B. Swaminathan, T. J. Barrett, S. B. Hunter, and R. V. Tauxe, “PulseNet: the molecular subtyping network for foodborne bacterial disease surveillance, United States,” Emerg. Infect. Dis., vol. 7, no. 3, pp. 382–389, 2001.
  18. C. A. Nadon et al., “Development and application of MLVA methods as a tool for inter-laboratory surveillance,” Euro Surveill., vol. 18, no. 35, p. 20565, 2013.
  19. E. Hyytiä-Trees, K. Cooper, E. M. Ribot, and P. Gerner-Smidt, “Recent developments and future prospects in subtyping of foodborne bacterial pathogens,” Future Microbiol., vol. 2, no. 2, pp. 175–185, 2007.
  20. B. A. Lindstedt et al., “Use of multilocus variable-number tandem repeat analysis (MLVA) in eight European countries, 2012,” Euro Surveill., vol. 18, no. 4, p. 20385, 2013.
  21. Y. Gelfand, A. Rodriguez, and G. Benson, “TRDB - The Tandem Repeats Database,” Nucleic Acids Res., vol. 35, no. SUPPL. 1, pp. 80–87, 2007.
  22. F. Denoeud and G. Vergnaud, “Identification of polymorphic tandem repeats by direct comparison of genome sequence from different bacterial strains: a web-based resource,” BMC Bioinformatics, vol. 5, p. 4, 2004.
  23. A. Van Belkum, S. Scherer, and L. Van Alphen, “Short-Sequence DNA Repeats in Prokaryotic Genomes,” vol. 62, no. 2, pp. 275–293, 1998.
  24. G. Vergnaud and C. Pourcel, “Multiple Locus Variable Number of Tandem Repeats Analysis,” Life Sci., vol. 551, no. 1, p. 588, 2009.
  25. P. R. Hunter, “Reproducibility and indices of discriminatory power of microbial typing methods,” J. Clin. Microbiol., vol. 28, no. 9, pp. 1903–1905, 1990.
  26. H. Grundmann, S. Hori, and G. Tanner, “and the Discriminatory Abilities of Typing Methods for Microorganisms,” J. Clin. Microbiol., vol. 39, no. 11, pp. 4190–4192, 2001.