Welcome to IMGT Knowledge Graph (IMGT-KG): a knowledge graph to integrate immunogenetics data

Immunogenetics entities are closely connected across IMGT® resources, from the sequence databases to the monoclonal antibodies database, which creates a need to integrate immunogenetics data. To cover this need, we built the IMGT Knowledge Graph (IMGT-KG), the first knowledge graph in the immunogenetics field. It bridges the gap between the nucleotide and protein sequences of the IMGT® databases and opens the way for effective queries and integrative immuno-omics analyses. IMGT-KG acquires data from IMGT®, then represents, describes and structures immunogenetics entities and their interrelationships in a knowledge graph using semantic web standards and technologies.

IMGT-KG is built on top of an extended version of IMGT-ONTOLOGY. We prioritize the reuse of existing terms in our knowledge graph, as recommended by semantic web good practices. Many of these terms come from Open Biological and Biomedical Ontology (OBO) Foundry ontologies (including thesauri). IMGT-KG uses a set of rules, based on Allen's interval algebra, to guide inferences on the positions of nucleotide sequences. IMGT-KG aims to describe immunogenetics entities from the nucleotide level to the protein level. In addition, IMGT-KG provides external links to other resources, including the Protein Data Bank (PDB), the Immune Epitope Database (IEDB) and PubMed articles.
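As a rough illustration of the kind of position reasoning mentioned above, the following Python sketch classifies which of the thirteen Allen relations holds between two sequence intervals. It is not the IMGT-KG rule set, which is not reproduced on this page. The Interval type, the allen_relation function and the coordinates are hypothetical, and endpoints are compared as plain numbers, so two features "meet" only when one ends exactly where the other starts.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A sequence feature located by hypothetical start and end positions."""
    start: int
    end: int


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the name of the Allen relation that holds from a to b."""
    if a.start >= a.end or b.start >= b.end:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a.end < b.start:
        return "before"
    if b.end < a.start:
        return "after"
    if a.end == b.start:
        return "meets"
    if b.end == a.start:
        return "met-by"
    # Intervals sharing one or both endpoints.
    if a.start == b.start and a.end == b.end:
        return "equals"
    if a.start == b.start:
        return "starts" if a.end < b.end else "started-by"
    if a.end == b.end:
        return "finishes" if a.start > b.start else "finished-by"
    # Strict containment.
    if b.start < a.start and a.end < b.end:
        return "during"
    if a.start < b.start and b.end < a.end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a.start < b.start else "overlapped-by"


# Hypothetical coordinates, for illustration only.
gene = Interval(100, 500)
exon = Interval(100, 250)
print(allen_relation(exon, gene))  # starts
```

A rule-based reasoner could derive statements of this kind (for example, that one feature lies "during" another) from stored positions, which is the sort of inference the paragraph above describes.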

We make IMGT-KG openly and freely available through YASGUI at this link, which serves the normal graph without inferences. In addition, we provide access to both the normal and the inferred graph at this link by means of Apache Jena Fuseki2.
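Both access points are SPARQL services, so they can also be queried from a script. The sketch below is a minimal, hedged example using only the Python standard library: it builds a SPARQL 1.1 Protocol GET request and flattens a response in the SPARQL JSON results format. The endpoint URL is a placeholder (the real addresses are the links above), the helper names build_request and rows are hypothetical, and the sample response is hand-written rather than real IMGT-KG output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder, NOT the real IMGT-KG endpoint: substitute the address given above.
ENDPOINT = "https://example.org/imgt-kg/sparql"

# A generic query that counts all triples in the default graph.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(payload: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON results document into simple name-to-value rows."""
    doc = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


# Hand-written sample response, for illustration only.
SAMPLE = """
{"head": {"vars": ["triples"]},
 "results": {"bindings": [{"triples": {"type": "literal", "value": "42"}}]}}
"""

request = build_request(ENDPOINT, COUNT_TRIPLES)
print(request.full_url)
print(rows(SAMPLE))
```

Passing the request to urllib.request.urlopen and feeding the decoded body to rows would run the query against a live endpoint; that step is left out here so the example runs without network access.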

Knowledge graphs are emerging as one of the most popular means for data federation, transformation, integration and sharing, promising to improve data visibility and reusability. Immunogenetics is the branch of life sciences that studies the genetics of the immune system. Although the complexity and the connected nature of immunogenetics data make knowledge graphs a prominent choice to represent and describe immunogenetics entities and relations, hence enabling a plethora of applications, little effort has been directed towards building and using such knowledge graphs so far. In this work, we present the IMGT Knowledge Graph (IMGT-KG), the first-of-its-kind FAIR knowledge graph in immunogenetics. IMGT-KG acquires and integrates data from different immunogenetics databases, hence creating links between them. Consequently, IMGT-KG provides access to 79 670 110 triples with 10 430 268 entities, 673 concepts and 173 properties. IMGT-KG reuses many existing terms from domain ontologies or vocabularies and provides external links to other resources of the same domain, as well as a set of rules to guide inference on nucleotide sequence positions by applying Allen's interval algebra. Such inference allows, for example, reasoning about genomic sequence positions. IMGT-KG fills the gap between genomic and protein sequences and opens a perspective on effective queries and integrative immuno-omics analyses. We make IMGT-KG openly and freely available, with detailed documentation and a Web interface for access and exploration.

Sanou, G., Giudicelli, V., Abdollahi, N., Kossida, S., Todorov, K., Duroux, P. (2022). IMGT-KG: A Knowledge Graph for Immunogenetics. In: The Semantic Web – ISWC 2022. Lecture Notes in Computer Science, vol 13489. Springer, Cham.