
The ability to capture disparate individual genes and physically link them in arrays suitable for co-expression is a trait unique to this genetic element. The result is an assembly of functionally interacting genes that theoretically facilitates the rapid evolution of new phenotypes. Initially, integrons were thought of as specialized elements mostly involved in the accumulation of gene cassettes encoding antibiotic resistance determinants in pathogenic bacteria. The advent of genomics and the availability of numerous genome sequences from environmental bacteria made it clear that the integron is a more ancient and widespread gene capture system. Despite the fact that this genetic element is found in about 10% of all sequenced genomes and that cassette arrays can be as large as 150 kb, few integrons have been properly identified and annotated as such. Even when the integron integrase gene is annotated due to its sequence similarity to characterized homologs, the gene cassettes associated with it are labeled as simple open reading frames (ORFs). This is because attC sites, the most distinctive feature of gene cassettes, are non-coding regions and are therefore not recognized by standard automated genome annotation pipelines. Integrons are a flexible and fast-evolving part of microbial genomes, and their associated cassette arrays represent a unique and segregated gene pool (the genes they carry are rarely found outside these genetic elements).
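The annotation gap described in this paragraph can be illustrated with a small sketch. It assumes that attC site coordinates have already been detected by some separate step, and it relabels any generic ORF that is closely followed by an attC site as a gene cassette. The `Feature` class, the `relabel_cassettes` function, the `max_gap` threshold and all coordinates are hypothetical and chosen only for illustration; this is not the detection method used by ACID, and strand orientation is ignored for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Feature:
    """A minimal annotated feature (hypothetical structure)."""
    start: int
    end: int
    label: str  # e.g. "ORF" or "integrase"


def relabel_cassettes(features: List[Feature],
                      attc_sites: List[Tuple[int, int]],
                      max_gap: int = 50) -> List[Feature]:
    """Relabel ORFs that are closely followed by an attC site.

    max_gap is an assumed, illustrative distance threshold in base pairs.
    """
    relabeled = []
    for feat in features:
        label = feat.label
        if feat.label == "ORF":
            for site_start, _site_end in attc_sites:
                # The attC site must begin at or shortly after the ORF end.
                if 0 <= site_start - feat.end <= max_gap:
                    label = "gene cassette"
                    break
        relabeled.append(Feature(feat.start, feat.end, label))
    return relabeled


# Made-up coordinates: an integrase gene, two ORFs next to attC sites,
# and one unrelated ORF far from any attC site.
features = [
    Feature(100, 1100, "integrase"),
    Feature(1300, 1900, "ORF"),
    Feature(2000, 2600, "ORF"),
    Feature(9000, 9500, "ORF"),
]
attc_sites = [(1910, 1980), (2620, 2700)]

labels = [f.label for f in relabel_cassettes(features, attc_sites)]
print(labels)
```

In this toy input the two ORFs adjacent to attC sites are relabeled, while the integrase gene and the distant ORF keep their original labels. Without the attC coordinates, all three ORFs would remain indistinguishable, which is the situation the paragraph describes.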

Integrons were discovered about two decades ago as a result of their role in the evolution of multi-drug-resistant bacteria. These genetic elements can perform acquisition, rearrangement and expression of genetic material that is part of gene cassettes. Gene cassettes are one of the simplest known mobile elements. They comprise gene(s) associated with a recombination site most commonly referred to as attC and less commonly as a 59-base element (59-be). The integron captures gene cassettes through site-specific recombination carried out by the encoded tyrosine recombinase (IntI). These captured cassettes are most commonly inserted by this recombination activity at the integron attachment site (attI) (Figure 1). Such capture events can occur repeatedly and, in the case of some chromosomal integrons, this process can lead to the creation of large arrays encoding hundreds of gene cassettes. A promoter, Pc, often located upstream from attI, is thought to enhance expression of proximal cassette-associated genes in some integrons.

ACID also hosts a forum to prompt integron-related discussion, which can hopefully lead to a more universal definition of this genetic element.
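The capture mechanism described above (cassettes inserted at attI by IntI, repeated captures building an array, and Pc favoring expression of attI-proximal cassettes) can be sketched as a toy data structure. All class, method and gene names below are hypothetical, and the model is a deliberate simplification: it treats every capture as an insertion at attI and treats Pc-driven expression as a simple cutoff on position in the array.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneCassette:
    """A gene cassette: one or more genes plus an attC recombination site."""
    genes: List[str]
    attc: str  # label for the cassette's attC site (illustrative only)


@dataclass
class Integron:
    """Toy integron: integrase gene, attI site, optional Pc, cassette array."""
    integrase: str = "intI"
    has_pc: bool = True
    # Index 0 is the position next to attI (most recently captured cassette).
    array: List[GeneCassette] = field(default_factory=list)

    def capture(self, cassette: GeneCassette) -> None:
        """Insert a cassette at attI, pushing older cassettes further away."""
        self.array.insert(0, cassette)

    def proximal_genes(self, n_cassettes: int = 1) -> List[str]:
        """Genes in the n cassettes closest to attI, i.e. nearest to Pc.

        n_cassettes is an assumed cutoff, not a measured property.
        """
        if not self.has_pc:
            return []
        return [g for c in self.array[:n_cassettes] for g in c.genes]


integron = Integron()
integron.capture(GeneCassette(["geneA"], "attC_1"))
integron.capture(GeneCassette(["geneB", "geneC"], "attC_2"))

print([c.genes for c in integron.array])  # [['geneB', 'geneC'], ['geneA']]
print(integron.proximal_genes())          # ['geneB', 'geneC']
```

Each call to `capture` places the new cassette next to attI, so the most recent acquisition is also the one nearest Pc, while earlier cassettes move to more distal positions in the array.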

Conclusion: ACID is a community resource providing easy access to annotations of integrons and making tools available to detect them in novel sequence data. Users can readily annotate their own data and integrate it into ACID using the tools provided.

Although integrons and their associated gene cassettes are present in ~10% of bacteria and can represent up to 3% of the genome in which they are found, very few have been properly identified and annotated in public databases. These genetic elements have been overlooked in comparison to other vectors that facilitate lateral gene transfer between microorganisms.

Description: By automating the identification of integron integrase genes and of the non-coding cassette-associated attC recombination sites, we were able to assemble a database containing all publicly available sequence information regarding these genetic elements. Specialists manually curated the database, and this information was used to improve the automated detection and annotation of integrons and their encoded gene cassettes. ACID (annotation of cassette and integron data) can be searched using a range of queries, and the data can be downloaded in a number of formats.
