To make it easy to compare the characteristic groove domain (G-DOMAIN) of the major histocompatibility complex (MHC) chains of vertebrates with the G-LIKE-DOMAINs of the MHC superfamily (MhcSF) proteins of all species, an IMGT unique numbering has been defined [1]. The IMGT unique numbering for G-DOMAIN is part of the 'NUMEROTATION' concept of IMGT-ONTOLOGY [2].
IMGT poster: MHC and MHC-like domains: standardization based on the IMGT-ONTOLOGY NUMEROTATION concept
This IMGT unique numbering relies on the high conservation of the structure of the G-DOMAIN.
The G-DOMAIN corresponds to:
The IMGT unique numbering for G-DOMAIN also applies to G-LIKE-DOMAIN. A G-LIKE-DOMAIN is a groove domain with a 3D structure similar to that of the G-DOMAIN, found in chains other than MHC.
'IMGT Protein display for G domain' header:
An MHC class I protein comprises one MHC-I-Alpha (I-ALPHA) chain and one Beta2-Microglobulin (B2M) chain. The I-ALPHA chain consists of three domains: [D1] and [D2] are G-DOMAINs, and [D3] is a C-LIKE-DOMAIN. The B2M chain consists of a single C-LIKE-DOMAIN.
An MHC class II protein comprises one MHC-II-Alpha (II-ALPHA) chain and one MHC-II-Beta (II-BETA) chain. Each chain consists of two domains: domain 1 is a G-DOMAIN and domain 2 is a C-LIKE-DOMAIN.
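The chain and domain composition described above can be written down as a small lookup structure. The following is only a sketch: the dictionary layout and the helper name `count_g_domains` are illustrative assumptions, not part of any IMGT tool.

```python
# Domain composition of MHC chains, as stated in the text above.
# The data layout and the function name are hypothetical, for illustration only.
MHC_CHAIN_DOMAINS = {
    "I-ALPHA": ["G-DOMAIN", "G-DOMAIN", "C-LIKE-DOMAIN"],  # [D1], [D2], [D3]
    "B2M": ["C-LIKE-DOMAIN"],
    "II-ALPHA": ["G-DOMAIN", "C-LIKE-DOMAIN"],  # domain 1, domain 2
    "II-BETA": ["G-DOMAIN", "C-LIKE-DOMAIN"],   # domain 1, domain 2
}

MHC_PROTEIN_CHAINS = {
    "MHC-I": ["I-ALPHA", "B2M"],
    "MHC-II": ["II-ALPHA", "II-BETA"],
}


def count_g_domains(protein_class):
    """Count the G-DOMAINs across all chains of an MHC protein class."""
    return sum(
        MHC_CHAIN_DOMAINS[chain].count("G-DOMAIN")
        for chain in MHC_PROTEIN_CHAINS[protein_class]
    )
```

With this layout, both `count_g_domains("MHC-I")` and `count_g_domains("MHC-II")` give two G-DOMAINs per protein: both on the I-ALPHA chain for class I, one on each chain for class II.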
The G-DOMAIN consists of one sheet of four antiparallel beta strands ("floor" of the groove or platform) and one long helical region ("wall" of the groove). This groove is part of the cleft that is the peptide binding site of the classical MHC-Ia and MHC-IIa proteins.
The G-DOMAINs are designated as:
The IMGT numbering for G-DOMAIN was established by extensive sequence alignment comparison of annotated MHC chains from the IMGT Repertoire (MH) and by structural data analysis and alignment of MHC proteins with known 3D structures from IMGT/3Dstructure-DB [1]. As each G-DOMAIN is usually encoded by a single exon, the delimitation of the domains in IMGT takes into account the limits of the exons in the genomic structure of the MHC genes.
For each G-DOMAIN, the positions that contribute to the groove floor comprise IMGT positions 1 to 49, with the A strand from positions 1 to 14, the AB turn positions 15 to 17, the B strand positions 18 to 28, the BC turn positions 29 and 30, the C strand positions 31 to 38, the CD turn positions 39 to 41 and the D strand positions 42 to 49. The additional position 7A represents a bulge in 3D structures and is present in some G-ALPHA domains. Additional positions at the N-terminus of strand A or at the C-terminus of strand D can be added if necessary [2].
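The groove floor ranges above lend themselves to a simple lookup. The sketch below assumes position labels are given as integers or strings such as `"7A"`; the names `FLOOR_REGIONS` and `floor_region` are hypothetical. A lettered additional position is assigned to the region of its base number, and the optional N-terminal and C-terminal extensions are not handled here.

```python
import re

# Groove floor regions and their IMGT position ranges (inclusive), from the text.
FLOOR_REGIONS = [
    ("A strand", 1, 14),
    ("AB turn", 15, 17),
    ("B strand", 18, 28),
    ("BC turn", 29, 30),
    ("C strand", 31, 38),
    ("CD turn", 39, 41),
    ("D strand", 42, 49),
]


def floor_region(position):
    """Return the groove floor region for a label such as 7, '7A' or '30'.

    Returns None when the position lies outside positions 1 to 49.
    """
    match = re.fullmatch(r"(\d+)([A-Z]?)", str(position))
    if match is None:
        raise ValueError(f"unrecognized position label: {position!r}")
    number = int(match.group(1))
    for name, start, end in FLOOR_REGIONS:
        if start <= number <= end:
            return name
    return None
```

For example, `floor_region("7A")` falls in the A strand, and `floor_region(50)` returns `None` because position 50 belongs to the helix.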
For each G-DOMAIN, the positions that contribute to the helix comprise IMGT positions 50 to 92. The numbering of the alpha helix starts at position 50 and ends at position 92, with five additional positions at 54A, 61A, 61B, 72A and 92A. Three of them (61A, 61B, 72A) characterize the G-ALPHA2 and/or G-BETA domains. Position 92A is occupied, for example in human and mouse, in the HLA-DMA and H2-DMA G-ALPHA domains. Notably, position 54A is the only additional position needed to extend the IMGT numbering for G-DOMAINs to the G-LIKE-DOMAINs of the MHC-I-like proteins.
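To illustrate how the additional helix positions slot into the numbering, the following sketch builds the ordered list of helix labels. It assumes that a lettered position sorts immediately after its base number (54, 54A, 55, ...); the names `HELIX_ADDITIONAL`, `helix_sort_key` and `helix_positions` are hypothetical.

```python
import re

# Additional helix positions named in the text.
HELIX_ADDITIONAL = ["54A", "61A", "61B", "72A", "92A"]


def helix_sort_key(label):
    """Sort by number first, then by letter; an empty letter sorts before 'A'."""
    match = re.fullmatch(r"(\d+)([A-Z]?)", label)
    if match is None:
        raise ValueError(f"unrecognized position label: {label!r}")
    return (int(match.group(1)), match.group(2))


def helix_positions():
    """Return all helix labels, 50 to 92 plus the additional positions, in order."""
    base = [str(n) for n in range(50, 93)]  # positions 50 to 92 inclusive
    return sorted(base + HELIX_ADDITIONAL, key=helix_sort_key)
```

The resulting list holds 43 numbered positions plus the 5 additional ones. A given domain occupies only the additional positions it needs, so the full list is an upper bound rather than the length of any one helix.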
| IMGT G-DOMAIN strand, turn and helix | A strand | AB turn | B strand | BC turn | C strand | CD turn | D strand | Helix |
|---|---|---|---|---|---|---|---|---|
| Part of the groove | Floor | Floor | Floor | Floor | Floor | Floor | Floor | Helix |
| Amino acid numbering | 1.9-1.1, 1-14 | 15-17 | 18-28 | 29-30 | 31-38 | 39-41 | 42-49, 49.1-49.5 | 50-92 |
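The table lists dotted labels for the optional extensions of the groove floor: 1.9 to 1.1 before the A strand and 49.1 to 49.5 after the D strand. The sketch below orders such labels under the assumption, suggested by the order in which the table lists them, that the N-terminal labels count down toward position 1 while the C-terminal labels count up from position 49. The function name `extension_sort_key` is hypothetical.

```python
def extension_sort_key(label):
    """Sort key for plain ('12') and dotted ('1.3', '49.2') floor labels."""
    base, _, ext = label.partition(".")
    number = int(base)
    offset = int(ext) if ext else 0
    # Assumption: dotted labels on position 1 precede it, in descending order
    # (1.9, ..., 1.1, 1), so their offset is negated.
    if number == 1:
        offset = -offset
    return (number, offset)


labels = ["49.2", "1", "1.1", "49", "1.9", "49.1", "2"]
ordered = sorted(labels, key=extension_sort_key)
# ordered is ['1.9', '1.1', '1', '2', '49', '49.1', '49.2']
```

This key only covers plain and dotted labels; lettered positions such as 7A would need separate handling.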
The G-LIKE-DOMAINs are designated as:
The IMGT unique numbering for G-LIKE-DOMAIN (proteins other than MHC) follows exactly the same rules as those of the G-DOMAIN. As mentioned above, position 54A is the only additional position needed to extend the IMGT numbering for G-DOMAINs to the G-LIKE-DOMAINs of the MHC-I-like proteins.
The IMGT unique numbering for the G-LIKE-DOMAIN makes it possible, for the first time, to compare any G-DOMAIN and G-LIKE-DOMAIN of any protein of the MHC superfamily (MhcSF).
[1] Lefranc, M.-P. et al., Dev. Comp. Immunol., 29, 917-938 (2005). PMID: 15936075, LIGM: 296

[2] Giudicelli, V. and Lefranc, M.-P., Bioinformatics, 15, 1047-1054 (1999). PMID: 10745995, LIGM: 221