Computational Immunology and Immunological Computation March 2001 Giuseppe Nicosia Department of Mathematics and Computer Science University of Catania, Italy nicosia@dmi.unict.it www.dmi.unict.it/nicosia
Talk Outline 1. Tour of Immunology. 2. Computational Immunology. 3. Immunological Computation: what are artificial immune systems? 4. Future directions.
Part I: Tour of Immunology “The Immune System is a complex adaptive system of cells and molecules, distributed throughout our body, that provide us with a basic defense against pathogenic organisms”
1. A gentle introduction to the Immune System What problem is it that the IS solves? The IS uses distributed detection to solve the problem of distinguishing between self and nonself, which are elements of the body and foreign elements respectively (actually, the success of the IS depends more on its ability to distinguish between harmful nonself and everything else). Why is this a hard problem? • because there are so many patterns in nonself, on the order of 10^16, that have to be distinguished from 10^6 self patterns; • because the environment is highly distributed; • because the body must continue to function all the time; • because resources are scarce.
How does the IS solve this problem? Most elementary is the skin, which is the first barrier to infection. Another barrier is physiological where conditions such as pH and temperature provide inappropriate living conditions for foreign organisms. Once pathogens have entered the body, they are dealt with by the innate IS and by the acquired immune response system. IS defenses are multi-layered
Lymphocytes • The acquired immune response consists of certain types of white blood cells, called lymphocytes, that cooperate to detect pathogens and assist in the destruction of those pathogens. These lymphocytes can be thought of as detectors which are tiny compared to the body. • Detection and elimination of pathogens is a consequence of millions of small cells - detectors - interacting through simple, very localised rules, to give rise to a truly efficient distributed system.
2. How do lymphocytes detect pathogens? Pathogens are detected when a molecular bond is established between the pathogen and receptors that cover the surface of the lymphocyte. Because of the large size and complexity of most pathogens, only parts of the pathogen, discrete sites called epitopes, get bound to the lymphocyte receptors. Lymphocytes recognize pathogens by binding to them.
Some Definitions Similarity Subset. A lymphocyte has approximately 10^5 receptors on its surface, but because all of these receptors have the same structure (a lymphocyte is monoclonal), a single lymphocyte can only bind to structurally related epitopes. These structurally related epitopes define the similarity subset that the lymphocyte detects. Affinity. The number of receptors that bind to pathogens determines the affinity that the lymphocyte has for a given pathogen. If a bond is very likely to occur, then many receptors will bind to pathogen epitopes, resulting in a high affinity for that pathogen; if a bond is unlikely to occur, then few receptors will bind to epitopes, and the lymphocyte will have a low affinity for that pathogen. Affinity threshold. A lymphocyte can only be activated by a pathogen if its affinity for the pathogen exceeds a certain affinity threshold.
3. Generating Receptor Diversity • Because detection is carried out by binding with non-self, the IS must have a sufficient diversity of lymphocyte receptors to ensure that at least some lymphocytes can bind to any given pathogen. • Generating sufficient diversity is a problem because the human body only makes on the order of 10^6 different proteins, which the IS must use to construct receptors that can recognize potentially 10^16 different proteins or patterns.
How does the IS produce the required diversity of receptors? One source of this diversity: lymphocyte receptors are constructed from inherited gene segments, or libraries. The receptors are made by randomly recombining elements from different libraries, resulting in an exponential number of possible combinations, and hence a huge diversity of receptor structures. This combinatorial explosion allows the IS to make potentially 10^15 different kinds of receptors, although the actual number of distinct receptors present in the body at any given time is far less than this. Estimates place the number at between 10^8 and 10^12 distinct receptors present at any one time.
Receptor diversity is generated by randomly recombining parts of inherited gene segments. This process allows each B cell to create an antibody receptor by choosing among a number of interchangeable parts that are present in the DNA.
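The combinatorial effect of recombining library elements can be illustrated with a short Python sketch. The library names and sizes below are hypothetical placeholders chosen only to make the arithmetic visible; they are not figures from the talk.

```python
import math
import random

# Hypothetical gene-segment libraries (names and sizes are illustrative only).
LIBRARIES = {
    "lib_a": [f"a{i}" for i in range(100)],
    "lib_b": [f"b{i}" for i in range(30)],
    "lib_c": [f"c{i}" for i in range(6)],
}


def count_receptors(libraries):
    """Number of distinct receptors obtainable by picking one element per library."""
    return math.prod(len(segments) for segments in libraries.values())


def random_receptor(libraries, rng):
    """Assemble one receptor by choosing one element at random from each library."""
    return tuple(rng.choice(segments) for segments in libraries.values())
```

With these placeholder sizes, 136 inherited elements yield 100 x 30 x 6 = 18,000 possible receptors, and each additional library multiplies the count again.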
Somatic Mutation • The other source of diversity, which operates after a mature B cell has been stimulated by an antigen, is somatic mutation: “a B cell can introduce point mutations into the genes that code for its previously chosen antibody receptor”. • With somatic mutation, the possible number of Ab receptors may be as high as 10^16.
If the diversity of receptors present is several orders of magnitude less than the diversity of pathogen patterns, how does the IS detect most pathogens? • By using approximate binding: each monoclonal lymphocyte can bind to several variants of a pathogen, i.e. it can recognize a similarity subset of patterns. • By being very dynamic: lymphocytes are very short-lived (2 to 3 days) and are constantly replaced (100 million new lymphocytes are generated every day).
4. Affinity Maturation To cover the space of all possible non-self patterns adequately requires a huge diversity of lymphocyte receptors, with relatively low affinity thresholds. This enables the IS to detect just about any pathogen, but the detection may take some time, during which the pathogen will be replicating and causing harm. The IS needs to be able to both detect and eliminate pathogens as quickly as possible. Generalized lymphocytes will not be very fast at detecting specific pathogens, nor very efficient at capturing them.
How does the IS learn to recognize specific pathogens? • Through a process called affinity maturation, which is essentially a Darwinian process of variation and selection. Affinity maturation involves a subset of lymphocytes, the B-cells. • When a B-cell is activated by binding to pathogens (its affinity threshold is exceeded), it does two things. • Firstly, it secretes a soluble form of its receptors, called antibodies, which bind to pathogens and inactivate them, or identify them to phagocytes and other innate system defenses, which allows the innate system to eliminate them. • Secondly, the B-cell clones itself, but the copies produced by this cloning are not perfect. Cloning is subject to very high mutation rates, called somatic hypermutation, which can result in daughter cells that have somewhat different receptors from the parent.
Activated B-cells produce antibodies and mutated clones, which are subject to selection via pathogen affinities.
The affinity loop These new B-cells will also have the opportunity to bind to pathogens, and if they have a high affinity for the pathogens, they in turn will be activated and cloned. The higher the affinity of a B-cell for pathogens present, the more likely it is that the B-cell will clone; B-cells end up competing for available pathogens, with the highest affinity B-cells being the "fittest" and hence replicating the most. Thus the variation is provided by hypermutation, and the selection is provided by competition for pathogens: Clonal selection.
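The variation-and-selection loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the talk's model: receptors and the antigen are 16-bit integers, the match is the Hamming distance (a larger value is treated as a better match), and competition for pathogens is approximated by keeping only the best-matching cells. The names `match`, `hypermutate` and `clonal_round` are hypothetical.

```python
import random

L = 16  # bit-string length (illustrative assumption)


def match(receptor, antigen):
    """Hamming distance between two L-bit strings stored as integers."""
    return bin(receptor ^ antigen).count("1")


def hypermutate(receptor, rate, rng):
    """Flip each bit independently with probability `rate`."""
    for i in range(L):
        if rng.random() < rate:
            receptor ^= 1 << i
    return receptor


def clonal_round(population, antigen, threshold, rate, rng, clones=2):
    """One round: activated cells add mutated clones, then the fittest survive."""
    next_gen = list(population)
    for cell in population:
        if match(cell, antigen) >= threshold:  # cell is activated
            next_gen.extend(hypermutate(cell, rate, rng) for _ in range(clones))
    # Competition for the antigen, approximated by truncation selection.
    next_gen.sort(key=lambda c: match(c, antigen), reverse=True)
    return next_gen[:len(population)]
```

Because parents are kept alongside their mutated clones, the best match in the population can never decrease from one round to the next; repeated rounds concentrate the population on receptors that match the antigen well.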
5. Immune Memory A successful primary immune response results in a proliferation of B-cells that have high affinities for the pathogens that caused the response. Typically these B-cells are so short-lived that, once the pathogen is eliminated, we would expect them to die out, which means that if the pathogen were encountered again, the IS would have to go through the whole affinity maturation process again. The IS avoids this by retaining a memory of the information encoded in these adapted B-cells: some adapted B-cells become "memory cells" that are very long-lived.
Primary responses to new pathogens are slow; memory of previously seen pathogens allows the IS to mount much faster secondary responses.
Part II Computational Immunology • To use information technology to advance the study of immunology. The main research interests are at the overlap between immunology and computer science. • By combining these two domains, one seeks to identify potential targets for vaccine and immunotherapeutic drug design. • By simulating laboratory experiments, we seek to help find answers to some of the (many) open questions about the IS.
The problem – deciphering the human immune system • large combinatorial space; • degeneracy of interactions; • difficulty performing in-vivo experiments.
Cellular Automata based models of the IS Large complex automata: • The model incorporates a lot of immunological detail; • They allow different investigations using the same model (it is relatively easy to compare results and to integrate hypotheses); • They are more suitable as a real in machina experimental tool; • They could be a good working tool for biologists; • There are several advantages in a CA based model of the IS: - it represents the entities and processes of interest in biological terms, - it is easy to modify the complexity of the interactions without introducing any new qualitative difficulties in solving the model.
The Celada-Seiden model • Among the CA based models, the Celada-Seiden (CS) automaton (1992) is one of the most prominent in terms of biological fidelity. • The CS-model concentrates on the process of clonal selection during the humoral immune response.
The CS automaton 1/2 The CS can be defined as an extended lattice gas: • the dynamics is probabilistic; • entities move from site to site (diffusion process); • the number of entities is not strictly constant; • the CS represents a single lymph node; • the CA is defined on a triangular 2-d lattice with periodic boundary conditions (toroidal geometry), so each site has exactly six nearest neighbours.
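A triangular lattice with toroidal boundaries can be indexed in several ways. The Python sketch below assumes skewed (axial) coordinates on a `width` x `height` grid, which is one common choice and not necessarily the one used in the CS automaton; the function names are hypothetical.

```python
import random

# Offsets to the six nearest neighbours of a site in skewed (axial)
# coordinates; this indexing scheme is an assumption of the sketch.
OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def neighbours(x, y, width, height):
    """Six nearest neighbours of (x, y), wrapping around both edges (torus)."""
    return [((x + dx) % width, (y + dy) % height) for dx, dy in OFFSETS]


def diffuse_step(x, y, width, height, rng):
    """Move an entity to one of its six neighbouring sites, chosen uniformly."""
    return rng.choice(neighbours(x, y, width, height))
```

The modulo operation implements the periodic boundary conditions: a site on the edge of the grid has the same six neighbours as any interior site.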
The CS automaton 2/2 • Each time step corresponds to 8 hours of “real life”; • there is an “external environment” representing the bone marrow and the thymus. These components create new virgin cells during the simulation but they are NOT part of the CA; • the entities defined on the sites are cells, molecules and signal molecules.
Cellular entities • (B) Lymphocyte B • (Th) Lymphocyte T Helper • (PLB) Lymphocyte Plasma B cell • (APC) Macrophage or generic antigen processing cell. Molecular entities • (Ag) Antigen (bacteria, parasites, and viruses) • (Ab) Antibody • (IC) Immuno complex or Ab-Ag binding. Signal • (IL2) Interleukin (implicitly modeled)
Bit String • Each entity has r(e) receptors and s(e) states. • Cell receptors and molecules are modelled as bit strings of length L. • The binding between two strings occurs with a certain probability, which is a function of the match m, computed as the Hamming distance between the two bit strings.
Matching Bits • The figure shows an example of a perfect match, corresponding to zero Hamming distance.
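The Hamming distance between two bit strings is simply the number of positions at which they differ. A minimal Python sketch, assuming the strings are written as text made of '0' and '1' characters (the function name is hypothetical):

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))
```

Identical strings give a distance of 0, and two complementary strings of length L give a distance of L.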
Affinity function The probability to interact, the affinity function, is a simple truncated exponential function with threshold $m_c$ ($L/2 < m_c < L$):
$$V(m) = \begin{cases} v_c^{(L-m)/(L-m_c)} & \text{for } m \ge m_c \\ 0 & \text{for } m < m_c \end{cases}$$
• In the case $m_c = L$ we set $V(L) = 1$. • The parameter $v_c \in (0,1)$ determines the sharpness of the affinity.
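The affinity function can be written as a small Python function. This sketch assumes the exponential form V(m) = v_c^((L-m)/(L-m_c)) for m >= m_c; the name `v_c` for the sharpness parameter in (0,1) and the function name `affinity` are assumptions made for illustration.

```python
def affinity(m: int, L: int, m_c: int, v_c: float) -> float:
    """Truncated exponential affinity for a match m between L-bit strings."""
    if not 0 <= m <= L:
        raise ValueError("match must lie between 0 and L")
    if not 0.0 < v_c < 1.0:
        raise ValueError("v_c must lie strictly between 0 and 1")
    if m < m_c:
        return 0.0  # below the threshold: no interaction
    if m_c == L:
        return 1.0  # special case: only m == L passes the threshold
    return v_c ** ((L - m) / (L - m_c))
```

Under this form the probability is 1 at m = L and falls to v_c at the threshold m = m_c, so a smaller v_c gives a sharper drop between the two.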
Interactions Internal interactions • B – MHCII • APC – MHCII External interactions • B – Ag • B – Th • Ab – Ag • Th – APC • APC – IC • APC – Ag The MHC molecules containing the inserted antigenic peptides (MHC-peptide complexes) are then transported to the surface of the cell.
The main loop of the ARIMSY program
    for (simTime = 0; simTime < 200; simTime++) {
        birth();      /* B cell proliferation */
        produceAb();  /* PLB antibody production */
        iB_Ag();      /* B cell + Ag interaction */
        iAb_Ag();     /* Ab + Ag interaction */
        death();      /* Cell death */
        flow();       /* Flow of naive B cells from bone marrow */
        diffusion();  /* Diffusion within the body lattice */
        injectAg();   /* Check for scheduled injection of Ag */
    }
Immunization: cell population dynamics for Ag (upper left), B (upper right) and PLB (lower left). Owing to the memory of the system, the Ag is removed more quickly after the second injection than after the first.
Immunization: cell population dynamics for T (upper left), IC (upper right) and Ab (lower left).
A Network of CA for the simulation of the IS (G. Nicosia et al., Int. J. of Modern Physics C, Vol. 10 No. 4, 677-686, 1999) • At each point of the CA-node there is only one cell and at most a certain maximum number nmax of molecules.
Some work with the CS-model 1/2 • B. Kohler (NYU) performed a systematic study in IMMSIM indicating that, depending on the characteristics of the virus (speed of growth, size of burst, infectivity), a successful vaccine should sometimes “correct” the balance between humoral and cellular responses. • S. Kleinstein (Princeton) simulated the dynamics of T cells and B cell hypermutation in germinal centres.
Some work with the CS-model 2/2 • F. Castiglione (University of Cologne) studied the physical properties of the humoral immune response (Phys. Rev. E, Vol. 61, issue 2, 2000; Phys. Rev. Lett., Vol. 79, No. 22, 1997). • At the University of Catania we developed a different automaton approach, which reproduces the same behaviour as far as the humoral response is concerned. Moreover, we studied the pattern recognition ability of the IS (G. Nicosia et al., Theory in Bioscience, to appear 2001).
Part II, future directions • To improve visualisation. • To insert CD5 and CD72 receptors. • To implement the apoptosis process. • The final goal of the simulations is to carry out realistic “in machina” experiments.
Part III Immune System metaphor for machine learning: Pattern Recognition with an Artificial Immune System
Why are computer scientists interested in the immune system? It is a unique and fascinating computational system that has evolved to solve a unique problem. We hope that a study of the IS can suggest new solutions to computer science problems, or at least give us new ways of looking at these problems. Some of the properties of the IS that might be of interest to a computer scientist are: • Uniqueness: the IS of each individual is unique and therefore vulnerabilities differ from one system to the next. • Distributed detection: the detectors used by the IS are small and efficient, are highly distributed, and are not subject to centralized control or coordination.
• Imperfect detection: by not requiring absolute detection of every pathogen, the IS is more flexible. • Anomaly detection: the IS can detect and react to pathogens that the body has never before encountered. • Learning and memory (adaptability): the IS can learn the structures of pathogens, and remember those structures, so that future responses to the pathogens can be much faster. These properties result in a system that is scalable, resilient to subversion, robust, very flexible, and that degrades gracefully.
Immunological Computation • From an information-processing perspective, the IS is a remarkable parallel and distributed adaptive system. • It uses learning, memory, and associative retrieval to solve recognition and classification tasks. • Also, the overall behaviour of the system is an emergent property of many local interactions. These remarkable information-processing abilities make the IS relevant to several important aspects of computation.
The Artificial Immune Systems Immunological Computation makes use of immunological concepts in order to create tools and algorithms (the Artificial Immune Systems) for solving machine-learning problems. Other biologically inspired systems: 1. Artificial neural networks; 2. Evolutionary computation; 3. Ants, PSO, DNA computation.
Applications AIS is a new computational paradigm which is used in different areas of science and technology (see D. Dasgupta, 1998, Springer). Among the various applications, we mention: - data analysis (Very Important Application); - control of autonomous mobile robots; - intelligent control applied to aircraft control; - commercial antivirus products (IBM antivirus); - security (V.I.A.); - anomaly detection (V.I.A.); - function optimization.
Metaphors employed • B cells, Antigens; • Clonal selection; • Somatic hypermutation; • Adaptation; • Primary and secondary immune response.
The AIS algorithm (G. Nicosia et al., accepted to Natural and Artificial Intelligent Systems, ENAIS'2001, ICSC Academic Press, March 17-21, 2001, Dubai)
    Input(patterns, mc)
    init(B population)
    for (t = 0; t < steps; t++) {
        interact()           /* cells interact */
        proliferate()        /* clonal expansion */
        if (pm != 0) {
            hypermutation()  /* mutation process */
        }
        age()                /* death */
        bone_marrow()        /* new B cells */
        diffuse()            /* diffusion of patterns and B cells */
    }
    Output(B population)
Learning multiple patterns Lower plot: the antigen population (learning phase). Upper plot: the population of the recognizing clones of B cells (recall phase). All the patterns are recognized.
Recognition Capacity Recognition rate for different initial populations and number of different patterns presented.
Part III, conclusions • The AIS is able to recognize multiple presented patterns and to cover a large repertoire thanks to the mutation process and the clonal selection. • The threshold parameter mc can be used to tune the tolerance in the recognition process. • The bottleneck seems to be the initial number of cells represented in the system.
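The effect of the threshold mc on tolerance can be made concrete with a short calculation. The Python sketch below assumes that patterns are uniformly random bit strings of length L and that a B cell recognizes a pattern when the match m (Hamming distance) is at least mc; both the uniformity assumption and the function name are illustrative, not taken from the talk.

```python
from math import comb


def recognized_fraction(L: int, m_c: int) -> float:
    """Fraction of all L-bit patterns whose match with one receptor is >= m_c."""
    if not 0 <= m_c <= L:
        raise ValueError("m_c must lie between 0 and L")
    # comb(L, m) patterns lie at Hamming distance exactly m from the receptor.
    return sum(comb(L, m) for m in range(m_c, L + 1)) / 2 ** L
```

Lowering mc enlarges the set of patterns a single cell can recognize (more tolerance, less specificity), while mc = L leaves exactly one recognizable pattern per receptor.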