IMGT/HighV-QUEST version, IMGT/V-QUEST version and IMGT/V-QUEST reference directory release are indicated at the top of the IMGT/HighV-QUEST Home page.
This information should always be checked: because of unforeseen delays, the IMGT/HighV-QUEST portal may not use the most recent IMGT/V-QUEST version (see IMGT/V-QUEST program versions) and/or IMGT/V-QUEST reference directory (see IMGT/V-QUEST reference directory releases).

Citing IMGT/HighV-QUEST:
Alamyar, et al. IMGT/HighV-QUEST: The IMGT® web portal for immunoglobulin (IG) or antibody and T cell receptor (TR) analysis from NGS high throughput and deep sequencing. Immunome Res. 8:1:2 (2012). LIGM:400 PMID:22647994 pdf
Alamyar E., et al., Methods Mol. Biol. 882:569-604 (2012). PMID:22665256 LIGM:404
Li S., et al. IMGT/HighV QUEST paradigm for T cell receptor IMGT clonotype, clonal expression evaluation diversity and next generation repertoire immunoprofiling. Nat. Commun. 4:2333 (2013). Open access. PMID:23995877 LIGM:419
Giudicelli V., et al., Autoimmun Infec Dis. 1(1) (2015). doi:10.16966/aidoa.103. Free Article LIGM:448
IMGT/HighV-QUEST
Introduction

IMGT/HighV-QUEST analysis
IMGT/HighV-QUEST statistical analysis
IMGT/HighV-QUEST upgrades and versions

Introduction

IMGT/HighV-QUEST [1] (see also [2-5]) is the web portal of IMGT® [6], the international ImMunoGeneTics information system® (http://www.imgt.org) for the analysis of rearranged nucleotide sequences of the antigen receptors (immunoglobulins (IG) or antibodies and T cell receptors (TR)) [7, 8, 9] obtained from next generation sequencing (NGS), based on IMGT-ONTOLOGY [10] and the immunoinformatics IMGT scientific rules [11].

IMGT/HighV-QUEST [1, 3, 4] is the high-throughput version of IMGT/V-QUEST [12, 13] and can analyse thousands of immunoglobulin (IG) and T cell receptor (TR) rearranged nucleotide sequences (up to 500 000 sequences) per run.

IMGT/HighV-QUEST uses the IMGT/V-QUEST program to analyse user sequences [5] and the IMGT/V-QUEST reference directory [6] (IMGT/V-QUEST documentation). IMGT/HighV-QUEST outputs have a content similar to that of IMGT/V-QUEST [5].

IMGT/HighV-QUEST statistical analysis is performed on IMGT/HighV-QUEST results selected by the user and includes the characterization of the IMGT clonotype (AA) diversity and expression [4] and their comparison in up to one million sequences.

Note that:
  • All IMGT/HighV-QUEST users must be registered in order to use this platform.
  • IMGT/HighV-QUEST works with rearranged V-J and V-D-J sequences and germline V-GENE, but does not work with germline D-GENE or J-GENE.
  • IMGT/HighV-QUEST can analyse sequences with DNA insertions or deletions (which do not respect the IMGT unique numbering). For more information, see 'Search for insertions and deletions'.
  • IMGT/HighV-QUEST can analyse sequences containing two V domains (as scFv) if the option ‘Analysis of single chain Fragment variable (scFv)’ is selected in Advanced functionalities of the IMGT/HighV-QUEST Search page.
  • IMGT/HighV-QUEST does not work with out-of-frame pseudogenes as they cannot be numbered according to the codon (or amino acid) IMGT unique numbering [14, 15]. You may use IMGT/BlastSearch (http://www.imgt.org/BlastSearch) in order to compare out-of-frame sequences against 'F+ORF+inframeP' genes and alleles IMGT reference sequences (select for 'Database' : 'IMGT/GENE-DB reference sequences').
  • IMGT/HighV-QUEST does not work, or will give aberrant results, for partial sequences that are too short, sequences containing a cluster of V-GENE, or sequences with a 5'UTR or 3'UTR that is too long.
    For these sequences, you may use IMGT/BlastSearch (http://www.imgt.org/BlastSearch) in order to identify the closest sequences from IMGT/LIGM-DB (select for 'Database' : 'IMGT/LIGM-DB').

IMGT/HighV-QUEST analysis



IMGT/HighV-QUEST submission

Once their registration has been accepted by the IMGT/HighV-QUEST administrator, users can log in to IMGT/HighV-QUEST using their e-mail address and the password they selected during registration.
The first page displayed after logging in is the IMGT/HighV-QUEST Search page.

IMGT/HighV-QUEST Search page comprises:

- Selection and input
- Display results
- Advanced parameters
    1. Selection and input

      Users must enter:

      - a title for their new analysis (50 characters or less).

      and select:

      - the species
      - the antigen receptor type (IG or TR) or the locus (for example, IGL)
      - the path to their local file to be submitted.

      Since IMGT/HighV-QUEST is designed for large batches of sequences, there is no copy/paste submission. The submitted file should contain IG or TR rearranged nucleotide sequences in FASTA format. The file must be plain text (RTF or DOC formats are not accepted).

      Here are some examples of sequence files in FASTA format:
      Click here to get a FASTA file containing human IG sequences
      Click here to get a FASTA file containing human TR sequences

      Other sets of sequences to test the IMGT/HighV-QUEST functionalities are available here.
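      Before uploading, a file can be checked locally for the plain-text FASTA layout described above. The following is a minimal sketch, not part of IMGT/HighV-QUEST: the function name `count_fasta_records` and the accepted nucleotide alphabet (IUPAC codes) are assumptions made for illustration, and the 500 000 limit is the per-run figure quoted in the Introduction.

```python
MAX_SEQUENCES_PER_RUN = 500_000  # per-run limit quoted in the Introduction
# Assumption: IUPAC nucleotide codes; the portal may accept a different alphabet.
NUCLEOTIDE_CODES = set("ACGTUNRYSWKMBDHV")


def count_fasta_records(text: str) -> int:
    """Count FASTA records; raise ValueError on obviously malformed content."""
    count = 0
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored
        if line.startswith(">"):
            count += 1  # a header line starts a new record
        elif count == 0:
            # Typical for RTF/DOC content, which does not start with '>'
            raise ValueError(f"line {line_no}: data before the first '>' header")
        elif not set(line.upper()) <= NUCLEOTIDE_CODES:
            raise ValueError(f"line {line_no}: non-nucleotide characters")
    return count


def fits_in_one_run(text: str) -> bool:
    """True if the file holds at least one record and respects the run limit."""
    return 0 < count_fasta_records(text) <= MAX_SEQUENCES_PER_RUN


example = ">seq1\nACGTACGT\nacgt\n>seq2\nGGNNTT\n"
print(count_fasta_records(example), fits_in_one_run(example))
```

      Reading the file as text first (for example with `open(path, encoding="ascii")`) is a cheap way to catch binary DOC files, since decoding fails before any FASTA check runs.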

      Users can choose to receive an e-mail notification:
      • when the analysis is queued in the local analysis queue
      • when the analysis is submitted for computer processing
      • when the analysis is completed and the results can be downloaded
      • 5 days before the analysis expires:
        the results are permanently removed from the server 15 days after they become available. The expiration notification is not optional.
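      The retention rule above amounts to simple date arithmetic. The sketch below assumes that both delays are counted in calendar days from the date the results become available; the function name `expiration_schedule` is hypothetical.

```python
from datetime import date, timedelta

RETENTION_DAYS = 15     # results are removed 15 days after they become available
NOTICE_DAYS_BEFORE = 5  # the expiration notice is sent 5 days before removal


def expiration_schedule(available_on: date) -> tuple[date, date]:
    """Return (notification date, removal date) for a completed analysis."""
    removal = available_on + timedelta(days=RETENTION_DAYS)
    notice = removal - timedelta(days=NOTICE_DAYS_BEFORE)
    return notice, removal


notice, removal = expiration_schedule(date(2024, 1, 1))
print(notice.isoformat(), removal.isoformat())
```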



    2. Display results


      The 'Display results' choice is identical to that of IMGT/V-QUEST (click here to see this part in IMGT/V-QUEST documentation).

      'Detailed view' provides individual files (one file for each sequence analysed) as an option for submissions of fewer than 150 000 sequences (click 'yes' to select the option). Individual files are not provided for submissions of more than 150 000 sequences.

      'Files in CSV' will provide outputs with a content similar to that of the IMGT/V-QUEST Excel file sheets [9]. The 11 CSV files are selected by default. They should be kept selected if IMGT/HighV-QUEST statistical analysis is performed.



    3. Advanced parameters


      The 'Advanced parameters' selection is identical to that of IMGT/V-QUEST (click here to see this part in IMGT/V-QUEST documentation).

      Note that: The option "Search for insertions and/or deletions" of Advanced parameters is selected by default.






IMGT/HighV-QUEST reference directory



IMGT/HighV-QUEST uses the IMGT/V-QUEST reference directory sets [2] to analyse user sequences (click here to see this part in IMGT/V-QUEST documentation).


IMGT/HighV-QUEST results




IMGT/HighV-QUEST statistical analysis


      Note that: IMGT/StatClonotype works only with uploaded files (.txt format) from the IMGT/HighV-QUEST statistical analysis output ('stats_xxx' file available in the 'data' directory of the IMGT/HighV-QUEST statistical analysis output, where 'xxx' is the batch name and the locus type), as described in the IMGT/StatClonotype documentation. To get these files, you therefore need to launch the IMGT/HighV-QUEST statistical analysis.


IMGT/HighV-QUEST statistical analysis terminology

    1. 'Single IMGT V gene and allele name' used in IMGT/HighV-QUEST statistical analysis

      Some V genes and alleles are always found with identical results using IMGT/V-QUEST or IMGT/HighV-QUEST. This happens:
      • when the alleles only differ between them by nucleotide (nt) differences in 3' of the last codon of the V-REGION taken into account for the evaluation of the alignment score and closest GENE and allele identification (according to the IMGT unique numbering, codon 104 for IGH and the TR loci, codon 109 for IGK and IGI, and codon 110 for IGL),
      • when the alleles belong to duplicated genes with identical nucleotide sequences in the V-REGION analysed by IMGT/V-QUEST or IMGT/HighV-QUEST.

      In order to avoid the filtering-out of these sequences, the IMGT V gene and allele is assigned to a 'Single IMGT V gene and allele name' used in IMGT/HighV-QUEST statistical analysis (click here for the list).

    2. Filtered-in and filtered-out sequences

      Total: sequences of the IMGT/HighV-QUEST output selected by the user and submitted to the statistical analysis.

      '1 copy': sequences in one copy, and therefore different by their length and/or their sequence, and retained in 'filtered-in' sequences.
      For each set of identical sequences, only one copy is retained in '1 copy' and the other redundant sequences for that copy are put into 'More than 1'.
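      The split between '1 copy' and 'More than 1' can be illustrated as follows. This is a sketch under stated assumptions, not the IMGT/HighV-QUEST implementation: it assumes that the first sequence encountered is the one retained in '1 copy' and that identity is tested on the upper-cased nucleotide string; `split_copies` is a hypothetical name.

```python
def split_copies(records):
    """Split (seq_id, sequence) pairs into '1 copy' and 'More than 1'.

    Returns (one_copy, more_than_one) where one_copy is a list of retained
    (seq_id, sequence) pairs and more_than_one maps each retained seq_id to
    the ids of its redundant copies.
    """
    retained_id = {}    # normalised sequence -> id of its '1 copy'
    one_copy = []
    more_than_one = {}
    for seq_id, seq in records:
        key = seq.upper()
        if key not in retained_id:
            # Assumption: the first occurrence is kept as the '1 copy'
            retained_id[key] = seq_id
            one_copy.append((seq_id, seq))
            more_than_one[seq_id] = []
        else:
            more_than_one[retained_id[key]].append(seq_id)
    return one_copy, more_than_one


reads = [("r1", "ACGT"), ("r2", "ACGTT"), ("r3", "acgt"), ("r4", "ACGT")]
one_copy, more_than_one = split_copies(reads)
print([seq_id for seq_id, _ in one_copy], more_than_one)
```

      Keeping the mapping from each '1 copy' to its redundant copies makes it possible to add the 'More than 1' counts back at the end of the analysis.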

      The following six categories are excluded from statistical analysis (filtered-out sequences):

      'More than 1': identical sequences remaining after one copy of each set of identical sequences has been retained in '1 copy'. The 'More than 1' are excluded from the statistical analysis per se to avoid redundancy, the number of 'More than 1' being added to the corresponding '1 copy' ONLY at the end of the statistical analysis.

      'No J-GENE': sequences for which IMGT/HighV-QUEST did not find any J-GENE, usually these sequences are very short in 3'.

      'No junction': sequences for which a junction could not be identified (e.g. no evidence of anchors).

      'Warnings': sequences with warnings for the V-REGION ('different CDR lengths' and/or 'id<85%').

      In the Warnings files:
      'different CDR lengths' means sequences with different AA lengths for CDR1-IMGT and/or CDR2-IMGT compared to the lengths of the CDR1-IMGT and/or CDR2-IMGT, respectively of the closest identified germline V gene and allele.
      'id<85%' means sequences with a V-REGION having a percent of identity <85% compared to the V-REGION of the closest identified germline V gene and allele.

      'Unknown functionality': sequences for which no functionality was detected.
      IMGT/HighV-QUEST is intended for the analysis of rearranged IG and TR sequences. The functionality identified by IMGT/HighV-QUEST is either 'productive' (no stop codon and in-frame junction) or 'unproductive' (stop codons, out-of-frame junction).
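      The filtered-out categories defined above can be pictured as a sequence of checks on each result. The sketch below is illustrative only: the `VQuestResult` record, its field names and the order in which the checks are applied are assumptions, and the 'More than 1' category is not covered because it depends on the other sequences of the batch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VQuestResult:
    """Hypothetical per-sequence summary; field names are illustrative."""
    j_gene: Optional[str]                  # None if no J-GENE was found
    junction: Optional[str]                # None if no junction was identified
    v_identity_percent: float              # V-REGION identity with the germline
    cdr_lengths: Tuple[int, int]           # CDR1-IMGT, CDR2-IMGT lengths (AA)
    germline_cdr_lengths: Tuple[int, int]  # same, for the closest germline V
    functionality: Optional[str]           # 'productive', 'unproductive' or None


def filter_category(r: VQuestResult) -> str:
    """Return the filtered-out category of a result, or 'filtered-in'."""
    if r.j_gene is None:
        return "No J-GENE"
    if r.junction is None:
        return "No junction"
    # 'Warnings': 'id<85%' and/or 'different CDR lengths'
    if r.v_identity_percent < 85.0 or r.cdr_lengths != r.germline_cdr_lengths:
        return "Warnings"
    if r.functionality not in ("productive", "unproductive"):
        return "Unknown functionality"
    return "filtered-in"


print(filter_category(VQuestResult("IGHJ4*02", "CARDYW", 80.0, (8, 8), (8, 8), "productive")))
```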

      The statistical analysis is performed on the '1 copy' category divided into two sets, depending on the IMGT/HighV-QUEST results:

      'single gene': only one gene identified by IMGT/HighV-QUEST.
      'Single gene' refers to V, D and J analysed separately or in combination.

      'several genes': several genes identified by IMGT/HighV-QUEST.
      'Several genes' refers to V, D and J analysed separately or in combination.

      'single allele': only one gene and allele identified by IMGT/HighV-QUEST.
      'Single allele' refers to V, D and J analysed separately or in combination.

      'several alleles (or genes)': several alleles (or genes) identified by IMGT/HighV-QUEST.
      'Several alleles (or genes)' refers to V, D and J analysed separately or in combination.

    3. Definition of an 'IMGT clonotype (AA)'

      An 'IMGT clonotype (AA)' is defined by a unique V-(D)-J rearrangement (with IMGT gene and allele names determined by IMGT/HighV-QUEST at the nucleotide level) and a unique CDR3-IMGT AA (in-frame) junction sequence.

      Sequences assigned to an IMGT clonotype (AA) comprise:

      1. 'single allele' sequences with the same V and J genes and alleles and same CDR3-IMGT (AA).
      2. 'several alleles (or genes)' sequences with the same V and J genes and alleles among the different identified V and J genes and alleles and the same CDR3-IMGT (AA).
      3. sequences in 'More than 1' which have their '1 copy' among the sequences assigned to a given IMGT clonotype (AA).

      All sequences assigned to IMGT clonotypes (AA) are in-frame and have the conserved two anchors C104 and F/W118 (for example, F for TRB, W for IGH), V and J functional or ORF.

      IMGT clonotype (AA) representative sequence: each IMGT clonotype (AA) has a representative sequence chosen among the assigned sequences, namely the longest one with the highest percentage of identity with the germline V gene and allele.
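      For the 'single allele' case, the definition above can be sketched as a grouping by V allele, J allele and CDR3-IMGT (AA), followed by the choice of a representative. This is not the IMGT/HighV-QUEST code: the `Seq` record is hypothetical, and the sketch assumes that length is compared first and V identity is used to break ties, which the text does not state explicitly.

```python
from collections import defaultdict
from typing import NamedTuple


class Seq(NamedTuple):
    """Hypothetical in-frame sequence with a single V and J allele."""
    seq_id: str
    v: str             # V gene and allele
    j: str             # J gene and allele
    cdr3_aa: str       # CDR3-IMGT (AA)
    length: int        # sequence length (nt)
    v_identity: float  # % identity with the germline V gene and allele


def group_clonotypes_aa(seqs):
    """Group sequences sharing V allele, J allele and CDR3-IMGT (AA)."""
    groups = defaultdict(list)
    for s in seqs:
        groups[(s.v, s.j, s.cdr3_aa)].append(s)
    return dict(groups)


def representative(members):
    """Pick the longest member; ties go to the highest V identity (assumed)."""
    return max(members, key=lambda s: (s.length, s.v_identity))


seqs = [
    Seq("s1", "IGHV3-23*01", "IGHJ4*02", "ARDY", 300, 97.0),
    Seq("s2", "IGHV3-23*01", "IGHJ4*02", "ARDY", 320, 95.0),
    Seq("s3", "IGHV3-23*01", "IGHJ4*02", "ARDY", 320, 99.0),
    Seq("s4", "IGHV1-69*01", "IGHJ4*02", "ARDY", 310, 98.0),
]
for key, members in group_clonotypes_aa(seqs).items():
    print(key, len(members), representative(members).seq_id)
```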



    4. Definition of an 'IMGT clonotype (nt)'

      An IMGT clonotype (nt) is defined by a unique V-(D)-J rearrangement (with IMGT gene and allele names determined by IMGT/HighV-QUEST at the nucleotide level) and a unique CDR3-IMGT nt (in-frame) junction sequence.

      Several IMGT clonotypes (nt) may correspond to one IMGT clonotype (AA) if their CDR3-IMGT differ by one or more nucleotides from the CDR3-IMGT of the representative nucleotide sequence of the IMGT clonotype (AA).
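      The number of nucleotide differences between a CDR3-IMGT (nt) and that of the representative sequence can be counted position by position, since two CDR3-IMGT that encode the same AA sequence have the same nucleotide length. A minimal sketch, with a hypothetical function name:

```python
def nt_differences(cdr3_nt: str, representative_nt: str) -> int:
    """Count positions at which two CDR3-IMGT (nt) of equal length differ."""
    if len(cdr3_nt) != len(representative_nt):
        raise ValueError("CDR3-IMGT (nt) sequences must have the same length")
    return sum(a != b for a, b in zip(cdr3_nt.upper(), representative_nt.upper()))


# TGT and TGC both encode C, so these two CDR3 (nt) share one CDR3 (AA)
print(nt_differences("TGTGCC", "TGCGCC"))
```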

IMGT/HighV-QUEST statistical analysis submission

  1. Prerequisite

    IMGT/HighV-QUEST statistical analysis can only be performed on results that include at least the first 11 (or 12, if scFv) CSV files (they must have been selected in 'Files in CSV' on the IMGT/HighV-QUEST Search page when launching the IMGT/HighV-QUEST analysis). For scFv, IMGT/HighV-QUEST statistics are provided per V-DOMAIN, independently.



  2. Statistical analysis submission

    Users must:

    - enter a title for the statistical analysis,
    - define and select the batch or batches for the IMGT/HighV-QUEST statistical analysis. The maximum number of sequences is 500,000 for a single batch and 1,000,000 for multiple batches,
    - select the analyses of each batch one by one from the 'Analysis history' page (all selected batches must have the same receptor type and locus),
    - enter a batch ID for each analysis selected.



    The statistical analysis can then be launched.
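    The submission constraints (500,000 sequences for a single batch, 1,000,000 for multiple batches, same locus for all batches) can be checked before launching. The sketch below is illustrative; the input layout and the name `validate_batches` are assumptions.

```python
SINGLE_BATCH_MAX = 500_000   # maximum number of sequences in one batch
MULTI_BATCH_MAX = 1_000_000  # maximum over all batches of one analysis


def validate_batches(batches):
    """Check batch selections against the documented limits.

    batches maps a batch ID to a list of (locus, nb_sequences) pairs,
    one pair per selected IMGT/HighV-QUEST analysis (hypothetical layout).
    Returns a list of problems; an empty list means the selection is valid.
    """
    problems = []
    loci = {locus for analyses in batches.values() for locus, _ in analyses}
    if len(loci) > 1:
        problems.append(f"batches span several loci: {sorted(loci)}")
    totals = {bid: sum(n for _, n in analyses) for bid, analyses in batches.items()}
    for bid, total in totals.items():
        if total > SINGLE_BATCH_MAX:
            problems.append(f"batch {bid} has {total} sequences (max {SINGLE_BATCH_MAX})")
    if len(batches) > 1 and sum(totals.values()) > MULTI_BATCH_MAX:
        problems.append(f"all batches together exceed {MULTI_BATCH_MAX} sequences")
    return problems


print(validate_batches({"MID1": [("IGH", 300_000)], "MID2": [("IGH", 250_000)]}))
```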

IMGT/HighV-QUEST statistical analysis output

    The IMGT/HighV-QUEST statistical output is provided as a txz file. After extraction of the archive, open the file "open_to_start.html" with a Web browser.
    The IMGT/HighV-QUEST statistical output is organized in 6 sections:

    1. 'Selected parameters' and 'Batch list table'
    2. Result summary for batches
    3. Result summary for IMGT clonotypes (AA)
    4. Detailed results per batch
    5. IMGT clonotype (AA) results comparison
    6. 'data' directory
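    A txz file is a tar archive compressed with xz, so it can be extracted with standard tools. As a sketch, Python's standard `tarfile` module can be used; the function name and the returned path are illustrative, and only archives from a trusted source should be extracted this way.

```python
import tarfile
from pathlib import Path


def extract_statistics_archive(archive_path, dest_dir):
    """Extract a .txz archive and return the path of open_to_start.html.

    Returns None if the archive does not contain that file.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # 'r:xz' opens a tar archive compressed with xz (LZMA)
    with tarfile.open(archive_path, mode="r:xz") as archive:
        archive.extractall(dest)
    matches = sorted(dest.rglob("open_to_start.html"))
    return matches[0] if matches else None
```

    The returned path can then be opened in a Web browser, for example with `webbrowser.open(start_page.as_uri())`.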



    1. 'Selected parameters' and 'Batch list table'

      The 'Selected parameters' table recapitulates the parameters selected by the user at the statistical analysis submission. It provides the title, the species, the receptor type (or locus), IMGT reference directory set, Search for insertions/deletions ('yes' or 'no'), Nb of sequences (sum of all batches), Batch IMGT clonotype comparison ('yes' or 'no'), User notification (complete and/or expire) (complete: when the analysis is completed, expire: 15 days after the completion date), Analysis date (date and time), and Comments (submitted by the user).

      The 'Batch list table' displays for each batch, the batch ID (in orange) and as many lines as the number of IMGT/HighV-QUEST outputs selected by the user for a given batch. Each line displays the title of the selected IMGT/HighV-QUEST output, with the Nb of sequences, Species, Receptor type (or locus), IMGT/HighV-QUEST version, IMGT/V-QUEST version and IMGT/V-QUEST reference directory release.



    2. Result summary for batches

      The 'Result summary for batches' table recapitulates, for each batch, the following:
      - Batch ID, Total (Nb of sequences in the selected batch) with average length (Avr Len).
      - then the different IMGT/HighV-QUEST sequence categories: '1 copy', '1 copy' with indels, 'More than 1', 'More than 1' with indels, 'No J-GENE', 'No junction', 'Warnings', 'Unknown functionality', 'No results', with, for each category, the number of sequences and the average length.



    3. Result summary for IMGT clonotypes (AA)

      The 'Result summary for IMGT clonotypes (AA)' table recapitulates, for each batch (column 1) and then for each locus (column 2), the following:
      - Nb of IMGT clonotypes (AA) (equal to the Nb of IMGT clonotype (AA) representative sequences),
      - Nb of sequences assigned to IMGT clonotypes (AA) (per definition in-frame), Nb of in-frame sequences not assigned to IMGT clonotypes (AA) (e.g. anchor AA changes), Nb of in-frame sequences,
      - Nb of in-frame productive sequences (no stop codons), Nb of in-frame unproductive sequences (stop codons),
      - Nb of out-of-frame sequences,
      - Nb of sequences '1 copy' + 'More than 1' (= Nb of analysed sequences),
      - Nb of 'single gene' and Nb of 'several genes'
      - then Nb of submitted sequences per batch.
      The percentage of nb of sequences '1 copy' + 'More than 1' is calculated versus the nb of submitted sequences. The percentage of all other columns is calculated versus nb of sequences '1 copy' + 'More than 1'.
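      The two denominators used for the percentages can be made explicit with a short calculation. The sketch below uses hypothetical column names and invented counts purely for illustration.

```python
def summary_percentages(nb_submitted, nb_analysed, columns):
    """Compute percentages with the two denominators described above.

    nb_analysed is the Nb of sequences '1 copy' + 'More than 1'; it is
    expressed versus nb_submitted, and every other column versus nb_analysed.
    """
    percentages = {"analysed": 100.0 * nb_analysed / nb_submitted}
    for name, count in columns.items():
        percentages[name] = 100.0 * count / nb_analysed
    return percentages


# Invented counts, for illustration only
print(summary_percentages(1000, 800, {"in-frame": 600, "out-of-frame": 200}))
```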



    4. Detailed results per batch

        They comprise two types of results:
        'Results categories' and V, D, J genes and alleles for genotype analysis ('1 copy' 'single gene' for V and J)
        IMGT clonotype (AA) and (nt) analysis results

          4.1 'Results categories' and V, D, J genes and alleles for genotype analysis ('1 copy' 'single gene' for V and J)

          4.1.1 Overview

          Results are archived in a single TXZ file called stat_version_1.txz. A TXZ file is provided for each batch. When extracted, the TXZ file of each batch contains:
          • 5 reports in PDF
          • 1 README file
          • 1 'graphics' folder containing separate copies of the graphical elements in PNG format





          4.1.2 Content of the IMGT reports

          The content of the IMGT reports includes 9 sections:

          1. Comments
          2. Analysis list
          3. Summary table
          4. Terminology
          5. Number of '1 copy' with 'single gene' and 'several genes' (for V, D and/or J) tables and histograms
          6. '1 copy' with 'single gene' tables and histograms
          7. '1 copy' with 'several alleles (or genes)' gene and allele tables
          8. Sequences in 'More than 1'
          9. Other filtered-out sequences

          All sections are found in report 1, whereas reports 2 to 5 each contain only part of them.

          1_IMGT_report_all.pdf: sections 1 to 9
          2_IMGT_report_summary.pdf: sections 1 to 5
          3_IMGT_report_1copy_single-gene.pdf: section 6
          4_IMGT_report_1copy_several-genes.pdf: section 7
          5_IMGT_report_filtered-out_sequences.pdf: sections 8 and 9




          1. Comments
            Comments are those added by the user in "Batch comments (optional)". Since PDF documents are not normally editable, this functionality was added to IMGT/HighV-QUEST so that users can include optional comments in the final report and recognize it later.

          2. Analysis list
            'Analysis list' recapitulates the list of IMGT/HighV-QUEST sets analysed with title, Nb of sequences, IMGT/HighV-QUEST reference directory (species and receptor type or locus), IMGT/HighV-QUEST version, IMGT/V-QUEST version and IMGT/V-QUEST reference directory release.

            Note that: IMGT/HighV-QUEST and the IMGT/V-QUEST versions and the IMGT/V-QUEST reference directory release are important information for statistical analysis. To check the details of each version and upgrade:
            IMGT/HighV-QUEST Upgrades and versions, IMGT/V-QUEST reference directory releases, IMGT/V-QUEST program versions pages.




          3. Summary table
            The Summary table shows the chosen parameters for the statistical analysis and the categories of sequences as identified and filtered by IMGT/HighV-QUEST with Nb of sequences and sequence average length (nt) for each category. The IMGT/HighV-QUEST statistical analysis is performed only on the filtered-in sequences ('1 copy').

            Statistical analysis is done on '1 copy'. The 'More than 1' sequences are aggregated to the '1 copy' at the end of the statistical analysis once it has been performed on the '1 copy'. Filtered-out sequences include 'No J-GENE', 'No junction', 'Warnings', 'Unknown functionality' and 'No results'.



          4. Terminology
            Same as above. This section helps users understand the general terminology of the statistical analysis report.

          5. Number of '1 copy' with 'single gene' and 'several genes' (for V, D and/or J) tables and histograms
            These tables and histograms show, for each locus, the number of sequences for which IMGT/HighV-QUEST has found 'single gene' and 'several genes', respectively, either for each gene type separately (V, D, J) or in combination ('V,D', 'D,J', 'V,J', 'V,D,J').




          6. '1 copy' with 'single gene' gene and allele tables and histograms
            The tables show the IMGT gene and allele name, the number of '1 copy' (Total), Sequence average length (in nb of nt), V-REGION average length (in nb of nt), 'id=100%' which represents the number of sequences with an identity percent of 100% by comparison with the germline.
            The colored lines (green: V, red: D, yellow: J) display the results per gene. For each gene, the results are then displayed per allele (white lines), with the functionality of the germline allele and, for 'id=100%', the percentage of these sequences relative to 'Total' (between parentheses).
            The functionality of an allele can be:
            • F: Functional
            • P: Pseudogene
            • ORF: Open Reading Frame
            The functionality is shown between parentheses, (F) and (P), when the corresponding germline gene has not yet been isolated.
            It is shown between brackets, [F] and [P], when it is not known if the sequence is germline or rearranged.





            Histograms display the number of '1 copy' with 'single gene' for each V, (D) and J genes. Genes are shown according to their position from 5' to 3' in the concerned locus. Unmapped genes are located at the top of the histograms.




          7. '1 copy' with 'several genes' gene and allele tables
            These tables have the same header and the same type of results as '6'.
            There are as many different lines as different results, as proposed by IMGT/HighV-QUEST, at the gene (colored lines) and/or allele level (white lines).
            These results are usually obtained for short sequences which do not allow the assignment by IMGT/HighV-QUEST to a single gene or single allele.




          8. Sequences in 'More than 1'
            Sequences in 'More than 1' (violet-blue lines) are shown below each corresponding '1 copy' (green lines). The 'More than 1' are excluded from the per se statistical analysis to avoid redundancy, the number of 'More than 1' being added to the corresponding '1 copy' ONLY at the end of statistical analysis.




          9. Other filtered-out sequences
            All other filtered-out sequences ('No J-GENE', 'No junction', 'Warnings', 'Unknown functionality', and 'No results') are provided in separate similar tables, with the 'Sequence number' and the 'Sequence ID'.




          4.2 IMGT clonotype (AA and nt) results per locus

          Overview

          IMGT clonotype (AA and nt) results per locus are provided in 10 sections (HTML pages):

          1. IMGT clonotypes (AA) per Nb
          2. IMGT clonotypes (AA) per Nb with detailed clonotypes (nt)
          3. IMGT clonotypes (AA) per V gene
          4. IMGT clonotypes (AA) per V gene with detailed clonotypes (nt)
          5. IMGT clonotypes (AA) per CDR3-IMGT length (AA)
          6. IMGT clonotypes (AA) per CDR3-IMGT length (AA) with detailed clonotypes (nt)
          7. IMGT clonotypes (AA) by CDR3-IMGT sequence (AA) alphabetical order with detailed clonotypes (nt)
          8. IMGT clonotype (AA) diversity and expression histograms: per V, (D), J-GENE and per CDR3-IMGT length
          9. IMGT clonotype (AA) diversity and expression tables: per V, (D), J-GENE and per CDR3-IMGT length
          10. V gene and allele table: Rearrangements, Nb of sequences and Nb IMGT clonotypes (AA) per V-GENE and allele


            4.2.1 IMGT clonotypes (AA) per Nb

            The Nb of IMGT clonotypes (AA) is given by the total number of ID (last line in column #).
            The 'IMGT clonotypes (AA) per Nb' provides for each IMGT clonotype (AA):

            • ID, nb (#) and experimental ID (Exp. ID).
            • Nb, Nb of '1 copy', Nb of 'More than 1' and Total (=nb of sequences assigned to that IMGT clonotype (AA)).
            • IMGT clonotype (AA) definition: V, D and J genes and alleles, CDR3-IMGT length (AA), CDR3-IMGT sequences (AA), Anchors 104, 118.
            • IMGT clonotype (AA) representative sequence: V % (percentage of identity of the V-REGION compared to the closest germline V gene), Sequence length, Functionality (as identified by IMGT/HighV-QUEST), Sequence ID with link to the file of the representative sequence.
            • IMGT clonotypes (nt): Sequences file ('1 copy') with a link to each 'Sequences file' containing the '1 copy' sequences in FASTA format assigned to a given IMGT clonotype (AA).

            In this table, the results are sorted by decreasing nb of '1 copy' and then by decreasing nb of 'More than 1'.
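            The sort order of this table corresponds to a two-level descending sort. A minimal sketch, with hypothetical row dictionaries and invented counts:

```python
def sort_per_nb(rows):
    """Sort clonotype rows by decreasing '1 copy', then decreasing 'More than 1'."""
    return sorted(rows, key=lambda r: (-r["one_copy"], -r["more_than_one"]))


rows = [
    {"id": "c1", "one_copy": 4, "more_than_one": 1},
    {"id": "c2", "one_copy": 9, "more_than_one": 0},
    {"id": "c3", "one_copy": 4, "more_than_one": 7},
]
print([r["id"] for r in sort_per_nb(rows)])
```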





            4.2.2 IMGT clonotypes (AA) per Nb with detailed clonotypes (nt)

            The Nb of IMGT clonotypes (AA) is given by the total number of ID (last line in column #).
            The same information, as in IMGT clonotypes (AA) per Nb, is provided for each IMGT clonotype (AA) and displayed in the pink line, with under each clonotype (AA), the corresponding IMGT clonotype(s) (nt) displayed on separate lines.

            The following information is provided:

            • ID nb (the same as that of the IMGT clonotype (AA)) (#),
            • CDR3-IMGT length (nt),
            • nb of different CDR3-IMGT (nt),

            and then, for each CDR3-IMGT (nt):

            • CDR3-IMGT sequence (nt),
            • nb of different nt in the CDR3 (compared to the sequence (nt) of the IMGT clonotype (AA)),
            • V gene and allele, D gene and allele, J gene and allele,
            • Anchors 104,118: 'C,F' or 'C,W',
            • V-REGION identity % mean, V-REGION length mean,
            • J-REGION identity % mean, J-REGION length mean,
            • Sequence length mean,
            • nb of '1 copy', nb of 'More than 1' and Total (= nb of sequences assigned to that IMGT clonotype (nt)), shown in the 3 columns on the right. For a given IMGT clonotype (AA), the sums of '1 copy', 'More than 1' and Total over its IMGT clonotypes (nt) are equal, respectively, to those of the IMGT clonotype (AA) (boxes 3, 4, 5 of the pink line).
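            The relationship between the counts of the pink line and those of the clonotype (nt) lines can be verified column by column. A sketch with a hypothetical function name and invented counts:

```python
def counts_are_consistent(aa_counts, nt_counts):
    """Check that clonotype (nt) counts add up to the clonotype (AA) counts.

    aa_counts is a ('1 copy', 'More than 1', Total) triple for the IMGT
    clonotype (AA); nt_counts is a list of such triples, one per
    IMGT clonotype (nt).
    """
    column_sums = tuple(sum(column) for column in zip(*nt_counts))
    return column_sums == tuple(aa_counts)


print(counts_are_consistent((5, 3, 8), [(3, 1, 4), (2, 2, 4)]))
```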





            4.2.3 IMGT clonotypes (AA) per V gene

            The same information, as IMGT clonotypes (AA) per Nb, is provided but sorted here alphabetically by V gene and allele name.





            4.2.4 IMGT clonotypes (AA) per V gene with detailed clonotypes (nt)

            The same information, as IMGT clonotypes (AA) per Nb with detailed clonotypes (nt), is provided but sorted here alphabetically by V gene and allele name.
            For a given IMGT clonotype (AA), the sums of '1 copy', 'More than 1' and Total over its IMGT clonotypes (nt) are equal, respectively, to the values in the pink boxes that precede the sequence file.





            4.2.5 IMGT clonotypes (AA) per CDR3-IMGT length (AA)

            The same information, as IMGT clonotypes (AA) per Nb, is provided but sorted here by CDR3-IMGT length.
            This display helps identify sequences that have been assigned to different IMGT clonotypes (AA) but most probably represent a single IMGT clonotype (AA).





            4.2.6 IMGT clonotypes (AA) per CDR3-IMGT length (AA) with detailed clonotypes (nt)

            The same information as in 'IMGT clonotypes (AA) per Nb with detailed clonotypes (nt)' is provided, sorted by CDR3-IMGT length (AA) and, for a given length, by CDR3-IMGT sequence (AA) in alphabetical order.





            4.2.7 IMGT clonotypes (AA) by CDR3-IMGT sequence (AA) alphabetical order with detailed clonotypes (nt)

            The same information as in 'IMGT clonotypes (AA) per Nb with detailed clonotypes (nt)' is provided, sorted by decreasing CDR3-IMGT length and then by CDR3-IMGT sequence (AA) in alphabetical order. The format of the table is the same as in 4.2.6 'IMGT clonotypes (AA) per CDR3-IMGT length (AA) with detailed clonotypes (nt)'.





            4.2.8 IMGT clonotype (AA) diversity and expression histograms: per V, (D), J-GENE and per CDR3-IMGT length

            - IMGT clonotype (AA) diversity histograms: Nb of IMGT clonotype (AA) per V-GENE (green color), D-GENE (for IGH, TRB, TRD) (red color) and J-GENE (yellow color) and per CDR3-IMGT length.
            - IMGT clonotype (AA) expression histograms: Nb of sequences assigned to an IMGT clonotype (AA) per V-GENE, D-GENE (for IGH, TRB, TRD) and J-GENE (pink color).
            - IMGT clonotype (AA) histograms per CDR3-IMGT length.



            4.2.9 IMGT clonotype (AA) diversity and expression tables: per V, (D), J-GENE and per CDR3-IMGT length

            - IMGT clonotype (AA) diversity table: Nb of IMGT clonotypes (AA) per V-GENE, D-GENE (for IGH, TRB, TRD), J-GENE.
            - IMGT clonotype (AA) expression table: Nb of sequences assigned to an IMGT clonotype (AA) per V-GENE, D-GENE (for IGH, TRB, TRD), J-GENE.
            - IMGT clonotypes (AA) table per CDR3-IMGT length.
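            The difference between diversity (number of IMGT clonotypes (AA)) and expression (number of sequences assigned to them) can be illustrated by counting both per gene. This is a sketch with a hypothetical input layout and invented counts:

```python
from collections import Counter


def diversity_and_expression(clonotypes):
    """Return (diversity, expression) counters keyed by gene name.

    clonotypes is an iterable of (gene, nb_sequences) pairs, one per IMGT
    clonotype (AA), where nb_sequences is the number of sequences assigned
    to that clonotype.
    """
    diversity = Counter()   # Nb of IMGT clonotypes (AA) per gene
    expression = Counter()  # Nb of sequences assigned to them per gene
    for gene, nb_sequences in clonotypes:
        diversity[gene] += 1
        expression[gene] += nb_sequences
    return diversity, expression


clonotypes = [("IGHV3-23", 10), ("IGHV3-23", 2), ("IGHV1-69", 5)]
diversity, expression = diversity_and_expression(clonotypes)
print(dict(diversity), dict(expression))
```

            The same counting applies to D-GENE, J-GENE or CDR3-IMGT length by changing the key of each pair.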



            4.2.10 V gene and allele table rearrangements

            The table shows for each V gene name and V allele name:

            • Nb of sequences assigned to an IMGT clonotype (AA),
            • Nb of different IMGT clonotypes (AA),
            • Nb of out-of-frame sequences,
            • Nb of other categories sequences.

            Clicking on the red and yellow squares, in the "V gene name" and "V allele name" columns, gives access to the D and J genes and alleles, respectively, involved in the rearrangements of a given V gene or allele.





    5. IMGT clonotype (AA) results comparison

      They include the following results per locus:

      5.1 IMGT clonotypes (AA) comparison: Full results

      The same presentation as 'IMGT clonotypes (AA) per CDR3-IMGT length (AA)' is provided, but sorted here by IMGT clonotypes (AA) present in a single batch ("Present in MID1", light blue and light steel blue lines) and IMGT clonotypes (AA) common to 2 (or more) batches ("Present in MID1 and MID2", light pink and light yellow lines).
      The information is paginated by page size and by batch combination, each with its common IMGT clonotypes (AA).





      5.2 IMGT clonotypes (AA) comparison: Synthesis table

      The Synthesis table indicates the number of IMGT clonotypes (AA) (diversity) and the number of sequences assigned to IMGT clonotypes (AA) (expression) only present ('exclusive') in a single batch or common to two or more batches.
      There is one line for each single batch (Nb of batches: 1) and for each combination of batches (Nb of batches: 2 or >2).
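      The logic of the Synthesis table (clonotypes exclusive to one batch versus common to several) can be sketched with set operations. This is an illustration under stated assumptions, not the IMGT/HighV-QUEST implementation: the input layout is hypothetical, and the expression of a common clonotype is taken here as the sum of its sequences over the batches concerned.

```python
from collections import defaultdict


def synthesis(batches):
    """Summarise exclusive and common clonotypes across batches.

    batches maps a batch ID to a dict {clonotype key: nb of sequences}.
    Returns {frozenset of batch IDs: (diversity, expression)}, where a
    clonotype is counted only under the exact set of batches containing it.
    """
    presence = defaultdict(set)  # clonotype key -> batches containing it
    for batch_id, clonotypes in batches.items():
        for key in clonotypes:
            presence[key].add(batch_id)
    table = defaultdict(lambda: [0, 0])
    for key, batch_ids in presence.items():
        combo = frozenset(batch_ids)
        table[combo][0] += 1  # diversity: one more clonotype
        # Assumption: expression is summed over the batches concerned
        table[combo][1] += sum(batches[b][key] for b in batch_ids)
    return {combo: tuple(values) for combo, values in table.items()}


batches = {"MID1": {"a": 3, "b": 2}, "MID2": {"b": 5, "c": 1}}
for combo, (div, expr) in synthesis(batches).items():
    print(sorted(combo), div, expr)
```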





      5.3 IMGT clonotypes (AA) comparison: Result summary table per V-GENE, D-GENE (for IGH, TRB, TRD), J-GENE

      The table recapitulates per gene of each batch:

      • 'Nb of IMGT clonotypes (AA)'
      • 'Nb of in-frame sequences assigned to IMGT clonotypes (AA)'.

      The order of genes is the same as in the locus, with unmapped genes in the first lines.





    6. 'data' directory

      The 'data' directory contains the 'stats_xxx' (.txt format) file(s) where 'xxx' is the batch name and the locus type.

      At least two 'stats_xxx' files are needed to launch a comparative analysis in IMGT/StatClonotype tool.

      This file contains 26 columns (see Table Content of the stats_xxx file below).









This work was granted access to the HPC resources of CINES under the allocation 2014-036029 made by GENCI (Grand Equipement National de Calcul Intensif).



References:
[1] Alamyar, E. et al., Immunome Res. 8:1:2. (2012) doi:10.4172/1745-7580.1000056. LIGM:400
[2] Alamyar, E. et al., IMGT/HighV-QUEST: A High-Throughput System and Web Portal for the Analysis of Rearranged Nucleotide Sequences of Antigen Receptors, JOBIM2010, Paper 63 (2010).
[3] Alamyar E., et al., Methods Mol. Biol. 882:569-604 (2012). PMID:22665256 LIGM:404
[4] Li S., et al., Nat. Commun. 4:2333 (2013). doi:10.1038/ncomms3333 Open access. PMID:23995877 LIGM:419
[5] Giudicelli V., et al., Autoimmun Infec Dis. 1(1) (2015). doi: 10.16966/aidoa.103. Free Article LIGM:448
[6] Lefranc M-P., et al. IMGT®, the international ImMunoGeneTics information system®. Nucleic Acids Res., 37, 1006-1012 (2009). PMID:18978023 LIGM:349
[7] Lefranc M-P. and Lefranc G. The Immunoglobulin FactsBook (2001), Academic Press, London, UK (458 pages)
[8] Lefranc M-P. and Lefranc G. The T cell receptor FactsBook (2001), Academic Press, London, UK (398 pages)
[9] Giudicelli V., et al. Nucl. Acids Res., 33, D256-D261 (2005). PMID:15608191 LIGM:292
[10] Giudicelli V. and Lefranc M-P. Front. Genet. 3:79 (2012). Online access. PMID: 22654892 LIGM:401
[11] Lefranc, M-P. Front Immunol. 5:22 (2014). doi: 10.3389/fimmu.2014.00022. Online access. PMID: 24600447 LIGM:429
[12] Giudicelli, V. et al. Nucl. Acids Res. 32, W435-440 (2004). PMID: 15215425 LIGM:287
[13] Brochet, X. et al. Nucl. Acids Res. 36, W503-508 (2008). PMID:18503082 LIGM:344
[14] Pommié, C. et al. J. Mol. Recognit., 17, 17-32 (2004) PMID:14872534 LIGM:284
[15] Lefranc, M.-P. et al. Dev. Comp. Immunol., 27, 55-77 (2003). PMID:12477501 with permission from Elsevier
Created:
Monday, 19-Oct-2009
Last updated:
Authors:
Eltaf Alamyar, Arthur Lavoie, Patrice Duroux, Véronique Giudicelli and Marie-Paule Lefranc
Technical Support:
Arthur Lavoie
Contact:
Marie-Paule Lefranc