IMGT/GENE-DB is part of IMGT®, the international ImMunoGeneTics information system®, created in 1989 by Marie-Paule Lefranc (LIGM, Université Montpellier 2, CNRS). IMGT® is the high-quality integrated knowledge resource specializing in the immunoglobulins (IG) or antibodies, T cell receptors (TR), and major histocompatibility (MH) of human and other vertebrate species, proteins of the immunoglobulin superfamily (IgSF) and MH superfamily (MhSF), related proteins of the immune systems (RPI) of vertebrates and invertebrates, therapeutic monoclonal antibodies (mAb), and fusion proteins for immune applications (FPIA).
IMGT/GENE-DB is the IMGT genome database for IG and TR genes from human, mouse and other vertebrates, on the Web since February 2003.
IMGT/GENE-DB provides a full characterization of the genes and of their alleles: IMGT gene name and definition, chromosomal localization, number of alleles,
and for each allele, the IMGT allele functionality, and the IMGT reference sequences and other sequences from the literature.
IMGT/GENE-DB allele reference sequences are available in FASTA format
(nucleotide and amino acid sequences with IMGT gaps according to the IMGT unique numbering, or without gaps).
IMGT/GENE-DB includes links to the IMGT Repertoire standardized resources (Chromosomal localization, Locus representation,
Tables of alleles, Alignments of alleles, IMGT Protein displays, IMGT Colliers de Perles, etc.), and to the IMGT/LIGM-DB,
IMGT/3Dstructure-DB and IMGT/2Dstructure-DB databases.
IMGT/GENE-DB is the official repository of all of the IG and TR genes and alleles approved by the World Health Organization (WHO)/International Union of Immunological Societies (IUIS) Nomenclature Subcommittee for IG and TR (Lefranc 2007, 2008a). Reciprocal links exist between IMGT/GENE-DB and both the HUGO Gene Nomenclature Committee (HGNC) database and NCBI Gene at the National Center for Biotechnology Information (NCBI).
The IMGT/GENE-DB Query page shows, on the top right, the status of the database (current date, number of genes, number of alleles and number of species).
Searches in IMGT/GENE-DB are performed according to the concepts of IDENTIFICATION, LOCALIZATION and CLASSIFICATION of IMGT-ONTOLOGY.

As Main Locus may contain RPI genes, selecting 'IG' or 'TR' under 'Molecular component' before submitting the request restricts the results to IG or TR genes.
Note that the search is case sensitive and that UPPERcase is the rule.
You can also enter only the first letters of the IMGT gene name: for example the selection of IGHV will list in the next page all genes which have an IMGT gene name beginning with IGHV.
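The prefix behaviour described above can be mimicked on a local list of gene names. This is a minimal sketch that does not query IMGT/GENE-DB: the helper `genes_with_prefix` is hypothetical and the list of names is a small hand-picked sample.

```python
def genes_with_prefix(gene_names, prefix):
    """Return the names that start with `prefix` (case-sensitive match)."""
    return [name for name in gene_names if name.startswith(prefix)]


# Small illustrative list; not an IMGT/GENE-DB export.
names = ["IGHV1-2", "IGHV3-23", "IGHD3-10", "IGKV1-5", "TRBV7-9"]

print(genes_with_prefix(names, "IGHV"))  # ['IGHV1-2', 'IGHV3-23']
print(genes_with_prefix(names, "ighv"))  # [] because matching is case sensitive
```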
You can consult the Correspondence between nomenclatures.
Provides a set of direct links to query IMGT/GENE-DB by IMGT gene name or by IMGT group, or to obtain the links to IMGT/GENE-DB and generalist genomic databases.
Depending on the number of resulting genes, you will see:

At the top of the page, the selected criteria are indicated with the number of resulting genes and the number of resulting alleles.
The list of resulting genes is a table with the following columns:
IMGT gene names
Functionality
IMGT gene definition
Number of alleles
Chromosomal localization
IMGT/LIGM-DB reference sequence for allele *01
Molecular component 
"Complete IMGT/GENE-DB entries" is selected by default. It displays the detailed results
for the selected genes
(see IMGT/GENE-DB DETAILED RESULTS).
"IMGT/GENE-DB reference sequences in FASTA format" for the selected genes corresponds to :
"IMGT label extraction from IMGT/LIGM-DB reference sequences"
extracts, from the IMGT/LIGM-DB reference sequences and for each allele of the selected gene(s), the sequences corresponding to
one or several IMGT labels and/or artificially spliced exons.
The list of IMGT/LIGM-DB labels is available here.
Note that in field #8 (codon start) of the FASTA header:
- '1' indicates that the nucleotide resulting from the splicing with the J gene (first 5' nucleotide of the first codon) is present (cDNA sequence). In that case, artificially spliced exons with CHS and artificially spliced exons with membrane exon(s) correspond to C-REGION.
- '3' indicates that the nucleotide resulting from the splicing with the J gene is absent (gDNA sequence). In that case, artificially spliced exons with CHS and artificially spliced exons with membrane exon(s) correspond to C-REGION minus the first 5' nucleotide of the first codon resulting from the splicing with the J gene.
Note that in field #8 (codon start) of the FASTA header:
- '1' indicates that the nucleotide resulting from the splicing with the J gene (first 5' nucleotide of the first codon) is present (cDNA sequence). In that case, artificially spliced exons correspond to C-REGION.
- '3' indicates that the nucleotide resulting from the splicing with the J gene is absent (gDNA sequence). In that case, artificially spliced exons correspond to C-REGION minus the first 5' nucleotide of the first codon resulting from the splicing with the J gene.
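The effect of the codon start value on the reading frame can be sketched as follows. This assumes the common convention that codon start is the 1-based position of the first nucleotide of the first complete codon. The helper `first_complete_codons` is hypothetical and the sequences are made up for illustration.

```python
def first_complete_codons(nt_seq, codon_start):
    """Split a nucleotide sequence into complete codons.

    codon_start is taken as the 1-based position of the first nucleotide
    of the first complete codon (assumed convention): 1 when the first
    codon is complete, 3 when its first 5' nucleotide is absent.
    """
    if codon_start not in (1, 2, 3):
        raise ValueError("codon_start must be 1, 2 or 3")
    in_frame = nt_seq[codon_start - 1:]
    usable = len(in_frame) - len(in_frame) % 3  # drop a trailing partial codon
    return [in_frame[i:i + 3] for i in range(0, usable, 3)]


cdna = "GCCTCCACC"  # made-up sequence whose first codon is complete
gdna = cdna[1:]     # same sequence without its first 5' nucleotide

print(first_complete_codons(cdna, 1))  # ['GCC', 'TCC', 'ACC']
print(first_complete_codons(gdna, 3))  # ['TCC', 'ACC']
```

With codon start '3', the two leading nucleotides belong to an incomplete first codon and are skipped, so both calls agree from the second codon onward.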
Note that these functionalities are not yet available for conventional genes.
Click here for examples of results.


IMGT gene name and definition
Chromosomal localization
Localizations in genome assemblies
Number of alleles
IMGT reference alleles
Provides a table in which are listed all identified alleles. For each allele are indicated:
Below the IMGT reference alleles table, a second table provides links to display the IMGT/GENE-DB reference sequences in FASTA format.

V-REGION
- F+ORF+all P: provides the nucleotide sequences of V-REGION for functional, ORF and all pseudogene alleles of the gene(s).
- F+ORF+in-frame P: provides the nucleotide and amino acid sequences of V-REGION for functional, ORF and in-frame pseudogene alleles of the gene(s). The nucleotide sequences and the amino acid sequences are provided with IMGT gaps according to the IMGT unique numbering (IMGT Scientific chart).
L-PART1+V-EXON
- F+ORF+all P: provides the nucleotide sequences of the artificially spliced L-PART1 and V-EXON for functional, ORF and all pseudogene alleles of the gene(s).
- F+ORF+in-frame P: provides the amino acid sequences of the artificially spliced L-PART1 and V-EXON for functional, ORF and in-frame pseudogene alleles of the gene(s).

- F+ORF+all P: provides the nucleotide sequences of D-REGION or J-REGION for functional, ORF and all pseudogene alleles of the D or J gene(s) respectively.
- F+ORF+in-frame P: provides the amino acid sequences of D-REGION or J-REGION for functional, ORF and in-frame pseudogene alleles of the D or J gene(s) respectively.
Note that the J-REGION in cDNA and gDNA differ by one nucleotide in 3'.
In FASTA format, this nucleotide is restored if the reference sequence is from cDNA.

Individual constant exon(s)
- F+ORF+in-frame P: provides the nucleotide sequences of individual constant exon(s) for functional, ORF and in-frame pseudogene alleles of the C gene(s).
- F+ORF+in-frame P with IMGT gaps: provides the nucleotide and amino acid sequences with gaps of individual constant exon(s) for functional, ORF and in-frame pseudogene alleles of the C gene(s). Gaps are according to the IMGT unique numbering (IMGT Scientific chart).
Note that:
IMGT gaps
In the IMGT/GENE-DB reference sequences with IMGT gaps, gaps are shown at the positions that are unoccupied according to the IMGT unique numbering 'for C-DOMAIN' (see 'Range of strand, turn and loop lengths in C-DOMAIN and C-LIKE-DOMAIN' https://www.imgt.org/IMGTScientificChart/Numbering/IMGTIGVCsuperfamily.html).
Artificially spliced exon(s)
- F+ORF+in-frame P: provides the nucleotide and amino acid sequences of the artificially spliced exons for functional, ORF and in-frame pseudogene alleles of the C gene(s).
Note that the sequences include one nucleotide from the upstream donor exon,
added in 5' to obtain a complete first codon.
Other sequences from the literature (compiled in IMGT gene tables, IMGT Repertoire)
IMGT Repertoire links
Annotated IMGT/LIGM-DB cDNA sequences 
Annotated IMGT/LIGM-DB rearranged genomic DNA sequences
Annotated IMGT/3Dstructure-DB structures
External links
"IMGT label extraction from IMGT/LIGM-DB reference sequences" is one of the three
choices of Choose your display in RESULTS OF YOUR SEARCH.
It provides, for each allele of the selected gene(s), in FASTA format, the nucleotide sequences or the amino acid sequences
corresponding to the selected label(s) extracted from the IMGT/LIGM-DB reference sequences.
Nucleotide sequences are provided for F+ORF+all P alleles.
Amino acid sequences are provided for F+ORF+in-frame P alleles.
Note that the FASTA header is standardized according to the FASTA format of IMGT/GENE-DB reference sequences.
In addition, when the sequence is extended with nucleotides in 5' and/or in 3', the added nucleotides are indicated in field 6 of the FASTA header
(see example).
Example of extraction of the FR3-IMGT label and the L-PART1+V-EXON artificially spliced label in nucleotides

Example of extraction of the L-PART1 label in nucleotides with extension of 5 nucleotides in 5' and 30 nucleotides in 3' (see Choose your display)
Note that the numbers of nucleotides added in 5' and in 3' are indicated in field 6 of the FASTA header.

Example of extraction of the L-PART1+V-EXON artificially spliced label in amino acids

The genomic localizations of IMGT genes are provided according to the selection: Species, Locus, Assembly, Assembly unit and Designation.
Note that for Mus musculus (mouse) locus, the information provided is for IMGT allele *01.
Note that for Mus musculus (mouse) locus, positions are those provided by NCBI. For V genes, positions correspond to L-PART1+V-INTRON+V-EXON.
The FASTA header of IMGT/GENE-DB reference sequences is standardized. It contains 15 fields separated by '|':
1. IMGT/LIGM-DB accession number(s)
2. IMGT gene and allele name
3. species
4. IMGT allele functionality
5. exon(s), region name(s), or extracted label(s)
6. start and end positions in the IMGT/LIGM-DB accession number(s)
7. number of nucleotides in the IMGT/LIGM-DB accession number(s)
8. codon start, or 'NR' (not relevant) for non coding labels
9. +n: number of nucleotides (nt) added in 5' compared to the corresponding label extracted from IMGT/LIGM-DB
10. +n or -n: number of nucleotides (nt) added or removed in 3' compared to the corresponding label extracted from IMGT/LIGM-DB
11. +n, -n, and/or nS: number of added, deleted, and/or substituted nucleotides to correct sequencing errors, or 'not corrected' if sequencing errors have been left uncorrected
12. number of amino acids (AA): this field indicates that the sequence is in amino acids
13. number of characters in the sequence: nt (or AA)+IMGT gaps=total
14. partial (if it is)
15. reverse complementary (if it is)
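The 15-field layout listed above lends itself to a simple split on '|'. The following is a minimal sketch, not an official IMGT parser: the keys in `FIELD_NAMES` are labels chosen here for readability, the example header is hypothetical (its accession, gene name and positions are invented), and the trailing '|' after field 15 is an assumption about how headers are written.

```python
FIELD_NAMES = [
    "accession", "gene_allele", "species", "functionality", "label",
    "positions", "nt_in_accession", "codon_start", "added_5prime",
    "changed_3prime", "corrections", "aa_count", "length_summary",
    "partial", "reverse_complementary",
]


def parse_header(header):
    """Map the 15 '|'-separated header fields to descriptive keys."""
    parts = [p.strip() for p in header.lstrip(">").split("|")]
    if len(parts) < len(FIELD_NAMES):
        raise ValueError(f"expected 15 fields, got {len(parts)}")
    # Ignore anything after the 15th field, e.g. an empty trailing piece.
    return dict(zip(FIELD_NAMES, parts[:len(FIELD_NAMES)]))


# Hypothetical header written for illustration; values are not from IMGT.
example = (">X00000|IGHV9-99*01|Homo sapiens|F|V-REGION|1..296|296 nt|1"
           "| | | | |296+24=320| | |")
fields = parse_header(example)
print(fields["gene_allele"], fields["functionality"], fields["codon_start"])
```

Empty fields come back as empty strings, so a test such as `if fields["partial"]:` distinguishes partial sequences from complete ones.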
Note that the field 6. may be modified if:
Four examples are displayed below:
- Nucleotide sequences with IMGT gaps
- Amino acid sequences with IMGT gaps
- Nucleotide sequences (without gaps)
- Amino acid sequences (without gaps)
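Field 13 of the header ('nt (or AA)+IMGT gaps=total') can be used to sanity-check a gapped sequence and to derive the ungapped one. The sketch below assumes that IMGT gaps are written as dots ('.'). The helpers `check_length_summary` and `remove_gaps` are hypothetical, and the sequence is made up.

```python
import re


def check_length_summary(summary, sequence, gap_char="."):
    """Compare field 13 ('residues+gaps=total') with a sequence.

    gap_char is an assumption about how IMGT gaps are written.
    """
    match = re.fullmatch(r"(\d+)\+(\d+)=(\d+)", summary.strip())
    if match is None:
        raise ValueError(f"unexpected field 13 value: {summary!r}")
    residues, gaps, total = map(int, match.groups())
    return (
        residues + gaps == total
        and len(sequence) == total
        and sequence.count(gap_char) == gaps
    )


def remove_gaps(sequence, gap_char="."):
    """Return the sequence without gap characters."""
    return sequence.replace(gap_char, "")


gapped = "CAGGTG...CAG"  # made-up 12-character sequence with 3 gaps
print(check_length_summary("9+3=12", gapped))  # True
print(remove_gaps(gapped))                     # CAGGTGCAG
```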
An IMGT/GENE-DB reference sequence for a given IG or TR gene is provided in the 5' > 3' DNA strand orientation corresponding to the 'sense', 'plus' or 'coding strand' of that gene (DNA strand orientation).
The orientation (direct or opposite) of an IG or TR gene in a given IMGT locus is given in Locus Gene order (Genomic orientation):
IMGT Repertoire (IG and TR) > 1. Locus and genes > 3. Locus descriptions > Locus gene order
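For a gene in opposite orientation, the sense-strand sequence is the reverse complement of the corresponding stretch of the locus forward strand. A minimal sketch of that operation, using a hypothetical helper `reverse_complement` and a made-up sequence (characters other than A, C, G and T are left unchanged):

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGCCG"))  # CGGCAT
```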
© Copyright 1995-2025 IMGT®, the international ImMunoGeneTics information system®