An Introduction to Artificial Immune Systems. ES2001 Cambridge. December 2001. Dr. Jonathan Timmis, Computing Laboratory, University of Kent at Canterbury, CT2 7NF, UK. J.Timmis@ukc.ac.uk http:/www.cs.ukc.ac.uk/people/staff/jt6
Overview of Tutorial • What are we going to do? • First Half: • Describe what an AIS is • Why bother with the immune system? • Be familiar with relevant immunology • Second Half: • Appreciation of where AIS are used • Be familiar with the building blocks of AIS • Resources
Immune metaphors [Figure: ideas drawn from the immune system, and from other areas, feed into artificial immune systems]
Why the Immune System? • Recognition • Anomaly detection • Noise tolerance • Robustness • Feature extraction • Diversity • Reinforcement learning • Memory • Distributed • Multi-layered • Adaptive
Artificial Immune Systems • AIS are computational systems inspired by theoretical immunology and observed immune functions, principles and models, which are applied to complex problem domains (de Castro & Timmis, 2001)
Some History • Developed from the field of theoretical immunology in the mid-1980s • Suggested we ‘might look’ at the IS • 1990 – Bersini: first use of immune algorithms to solve problems • Forrest et al. – Computer security, mid-1990s • Hunt et al., mid-1990s – Machine learning
Scope of AIS • Fault and anomaly detection • Data Mining (machine learning, Pattern recognition) • Agent based systems • Scheduling • Autonomous control • Optimisation • Robotics • Security of information systems
Role of the Immune System • Protect our bodies from infection • Primary immune response • Launch a response to invading pathogens • Secondary immune response • Remember past encounters • Faster response the second time around
Immune Pattern Recognition • Immune recognition is based on the complementarity between the binding region of the receptor and a portion of the antigen called the epitope. • Antibodies present a single type of receptor, whereas antigens might present several epitopes. • This means that different antibodies can recognize a single antigen
Antibodies [Figures: antibody molecule; antibody production]
Main Properties of Clonal Selection (Burnet, 1978) • Elimination of self antigens • Proliferation and differentiation on contact of mature lymphocytes with antigen • Restriction of one pattern to one differentiated cell and retention of that pattern by clonal descendants • Generation of new random genetic changes, subsequently expressed as diverse antibody patterns by a form of accelerated somatic mutation
T-cells • Regulation of other cells • Active in the immune response • Helper T-cells • Killer T-cells
Reinforcement Learning and Immune Memory • Repeated exposure to an antigen throughout a lifetime • Primary, secondary immune responses • Remembers encounters • No need to start from scratch • Memory cells • Associative memory
Immune Network Theory • Idiotypic network (Jerne, 1974) • B cells co-stimulate each other • Treat each other a bit like antigens • Creates an immunological memory
Shape Space Formalism • Repertoire of the immune system is complete (Perelson, 1989) • Extensive regions of complementarity • Some threshold of recognition [Figure: shape space of volume V containing recognition regions Vε, each set by a recognition threshold ε]
Self/Non-Self Recognition • Immune system needs to be able to differentiate between self and non-self cells • Antigenic encounters may result in cell death, therefore • Some kind of positive selection • Some element of negative selection
Summary so far …. • Immune system has some remarkable properties • Pattern recognition • Learning • Memory • So, is it useful?
This Section • General framework for describing and constructing AIS • A short review of where AIS are used today • Cannot cover them all, far too many • I am not an expert in all areas (I would earn more money if I were) • Where are AIS headed?
What do we want from a Framework? • In a computational world we work with representations and processes. Therefore, we need: • To be able to describe immune system components • To be able to describe their interactions • Quite high-level abstractions • To capture general-purpose processes that can be applied to various areas
AIS Framework • De Castro & Timmis, 2002 • Immune Representations • Immune Algorithms • Guidelines for developing AIS
Representation – Shape Space • Describe the general shape of a molecule • Describe interactions between molecules • Degree of binding between molecules • Complement threshold
Representation • Vectors: $Ab = \langle Ab_1, Ab_2, \ldots, Ab_L \rangle$, $Ag = \langle Ag_1, Ag_2, \ldots, Ag_L \rangle$ • Real-valued shape-space • Integer shape-space • Hamming shape-space • Symbolic shape-space
Define their Interaction • Define the term Affinity • Affinity is related to distance • Euclidean • Other distance measures such as Hamming, Manhattan, etc. • Affinity Threshold
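To make the link between distance and affinity concrete, here is a minimal sketch of the three distance measures named above together with a threshold test. It assumes that a smaller distance means a higher affinity (similarity rather than complementarity), and the `binds` helper and its `threshold` parameter are hypothetical names introduced only for illustration.

```python
import math


def euclidean(ab, ag):
    """Euclidean distance between two real-valued attribute strings."""
    return math.sqrt(sum((a - g) ** 2 for a, g in zip(ab, ag)))


def manhattan(ab, ag):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - g) for a, g in zip(ab, ag))


def hamming(ab, ag):
    """Hamming distance: number of positions at which the strings differ."""
    return sum(a != g for a, g in zip(ab, ag))


def binds(ab, ag, distance, threshold):
    """Assumed recognition rule: Ab recognises Ag when they are close enough."""
    return distance(ab, ag) <= threshold
```

For example, `binds([0, 0], [3, 4], euclidean, 5.0)` holds because the distance is exactly 5. In a Hamming shape-space where binding is complementary, the comparison would be reversed so that a larger distance means a stronger match.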
Basic Immune Models and Algorithms • Bone Marrow Models • Negative Selection Algorithms • Clonal Selection Algorithm • Somatic Hypermutation • Immune Network Models
Bone Marrow Models • Gene libraries are used to create antibodies from the bone marrow • Antibody production through a random concatenation from gene libraries • Simple or complex libraries
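A simple bone marrow model can be sketched as picking one gene segment at random from each library and concatenating the picks. The library contents and the `make_antibody` name below are made up for illustration; they are not taken from any published model.

```python
import random


def make_antibody(libraries, rng=random):
    """Build one antibody by joining a random segment from each gene library."""
    return "".join(rng.choice(library) for library in libraries)


# Three hypothetical libraries of 4-bit gene segments.
LIBRARIES = [
    ["0000", "1111"],
    ["0101", "1010", "0011"],
    ["1100", "0110"],
]
```

With these libraries, 2 x 3 x 2 = 12 distinct 12-bit antibodies can be produced, which shows how a small set of stored segments yields a larger repertoire.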
Negative Selection Algorithms • Forrest 1994: Idea taken from the negative selection of T-cells in the thymus • Applied initially to computer security • Split into two parts: • Censoring • Monitoring
Negative Selection Algorithm • Each copy of the algorithm is unique, so that each protected location is provided with a unique set of detectors • Detection is probabilistic, as a consequence of using different sets of detectors to protect each entity • A robust system should detect any foreign activity rather than looking for specific known patterns of intrusion. • No prior knowledge of anomaly (non-self) is required • The size of the detector set does not necessarily increase with the number of strings being protected • The detection probability increases exponentially with the number of independent detection algorithms • There is an exponential cost to generate detectors with relation to the number of strings being protected (self). • Solution to the above in D’haeseleer et al. (1996)
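The censoring and monitoring phases can be sketched as follows. This is a simplified illustration, not Forrest's implementation: it assumes binary strings and a hypothetical matching rule in which two strings match when they agree in at least `threshold` positions.

```python
import random


def matches(a, b, threshold):
    """Assumed matching rule: at least `threshold` positions are equal."""
    return sum(x == y for x, y in zip(a, b)) >= threshold


def censor(self_set, n_detectors, length, threshold, rng, max_tries=100000):
    """Censoring: keep random candidates that match no self string."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = "".join(rng.choice("01") for _ in range(length))
        if not any(matches(candidate, s, threshold) for s in self_set):
            detectors.append(candidate)
    return detectors


def monitor(sample, detectors, threshold):
    """Monitoring: a sample is flagged as non-self if any detector matches it."""
    return any(matches(d, sample, threshold) for d in detectors)
```

Because detectors are generated at random, two runs with different seeds give different detector sets, which is the source of the probabilistic detection described above. Self strings are never flagged by construction; whether a given non-self string is flagged depends on the detector set.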
Clonal Selection Algorithm • de Castro & von Zuben, 2001
Randomly initialise a population (P)
Repeat
  For each pattern in Ag
    Determine affinity to each member of P
    Select the n highest-affinity members of P
    Clone in proportion to affinity with Ag; mutate clones at a rate that falls as affinity rises
    Add new mutants to P
  endFor
  Select the highest-affinity members of P to form part of the memory set M
  Replace n members of P with new random ones
Until stopping criteria
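The loop above can be sketched in a few lines. This is a simplified illustration rather than the published algorithm: it assumes real-valued cells in the unit hypercube, an affinity of 1 / (1 + Euclidean distance), a rank-based clone count, and Gaussian mutation whose step shrinks as affinity rises. It also updates the memory cell per antigen instead of after the inner loop. All parameter names are hypothetical.

```python
import math
import random


def affinity(ab, ag):
    """Assumed affinity: 1 at zero distance, falling towards 0 with distance."""
    return 1.0 / (1.0 + math.dist(ab, ag))


def clonal_selection(antigens, pop_size=20, n_select=5, n_new=2,
                     generations=50, rng=None):
    rng = rng or random.Random()
    dim = len(antigens[0])

    def random_cell():
        return [rng.uniform(0.0, 1.0) for _ in range(dim)]

    population = [random_cell() for _ in range(pop_size)]
    memory = [None] * len(antigens)  # one memory cell per antigen

    for _ in range(generations):
        for i, ag in enumerate(antigens):
            def by_affinity(ab):
                return affinity(ab, ag)

            population.sort(key=by_affinity, reverse=True)
            mutants = []
            for rank, ab in enumerate(population[:n_select]):
                n_clones = n_select - rank      # more clones for better cells
                step = 1.0 - affinity(ab, ag)   # smaller steps for better cells
                for _ in range(n_clones):
                    mutants.append([x + rng.gauss(0.0, step) for x in ab])
            population.extend(mutants)
            population.sort(key=by_affinity, reverse=True)
            best = population[0]
            if memory[i] is None or affinity(best, ag) > affinity(memory[i], ag):
                memory[i] = list(best)
            # Keep the best cells and top up with fresh random ones.
            population = population[:pop_size - n_new]
            population += [random_cell() for _ in range(n_new)]
    return memory
```

Calling `clonal_selection([[0.2, 0.8], [0.9, 0.1]])` returns one memory cell per antigen, each of which should lie close to its antigen after a few dozen generations.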
Immune Network Models • Timmis & Neal, 2000 • Used immune network theory as a basis, proposed the AINE algorithm
Initialize AIN
For each antigen
  Present antigen to each ARB in the AIN
  Calculate ARB stimulation level
Allocate B cells to ARBs, based on stimulation level
Remove weakest ARBs (ones that do not hold any B cells)
If termination condition met
  exit
else
  Clone and mutate remaining ARBs
  Integrate new ARBs into AIN
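The allocation and removal steps can be illustrated with a small sketch. It assumes a fixed pool of B cells, a claim that is linear in the stimulation level, and that shortfalls are taken from the weakest ARBs first. These rules and the names `allocate_b_cells`, `pool` and `claim_factor` are illustrative assumptions, not the published AINE formulas.

```python
def allocate_b_cells(stimulation, pool, claim_factor=10):
    """Share a finite pool of B cells among ARBs.

    stimulation: dict mapping ARB name to a stimulation level in [0, 1].
    Returns a dict of surviving ARBs and the B cells each one holds.
    """
    # Each ARB first claims B cells in proportion to its stimulation level.
    claims = {arb: round(claim_factor * s) for arb, s in stimulation.items()}
    excess = sum(claims.values()) - pool
    # If the pool is over-claimed, take B cells back from the weakest ARBs.
    for arb in sorted(claims, key=stimulation.get):
        if excess <= 0:
            break
        taken = min(claims[arb], excess)
        claims[arb] -= taken
        excess -= taken
    # ARBs that hold no B cells are removed from the network.
    return {arb: n for arb, n in claims.items() if n > 0}
```

With stimulation levels of 0.9, 0.5 and 0.2 and a pool of 12, the weakest ARB loses all of its B cells and is removed, while the strongest keeps its full claim.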
Immune Network Models • De Castro & Von Zuben (2000c) • aiNet, based on similar principles
At each iteration step do
  For each antigen do
    Determine affinity to all network cells
    Select n highest affinity network cells
    Clone these n selected cells
    Increase the affinity of the cells to antigen by reducing the distance between them (greedy search)
    Calculate improved affinity of these n cells
    Re-select a number of improved cells and place into matrix M
    Remove cells from M whose affinity is below a set threshold
    Calculate cell-cell affinity within the network
    Remove cells from network whose affinity is below a certain threshold
    Concatenate original network and M to form new network
  Determine whole network inter-cell affinities and remove all those below the set threshold
  Replace r% of worst individuals by novel randomly generated ones
  Test stopping criterion
Somatic Hypermutation • Mutation rate depends on affinity: the higher the affinity, the lower the mutation rate • Very controlled mutation in the natural immune system • Trade-off between the normalized antibody affinity D* and its mutation rate α
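The slide does not give a formula for this trade-off. One possible shape, assumed here purely for illustration, is an exponential decay in which the mutation rate falls as the normalized affinity D* rises; this reads the relationship between rate and affinity as an inverse one. The decay parameter `rho` is a hypothetical name.

```python
import math


def mutation_rate(d_star, rho=5.0):
    """Assumed trade-off: rate = exp(-rho * D*), with D* normalized to [0, 1]."""
    return math.exp(-rho * d_star)
```

Under this assumption a cell with D* = 0 mutates at the maximum rate of 1, and larger values of `rho` make the rate drop off more sharply as affinity improves.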
Anomaly Detection • The normal behavior of a system is often characterized by a series of observations over time. • The problem of detecting novelties, or anomalies, can be viewed as finding deviations of a characteristic property in the system. • For computer scientists, the identification of computer viruses and network intrusions is considered one of the most important anomaly detection tasks
Virus Detection • Protect the computer from unwanted viruses • Initial work by Kephart 1994 • More of a computer immune system
Virus Detection (2) • Okamoto & Ishida (1999a,b) proposed a distributed approach • Detected viruses by matching self-information (the first few bytes of the head of a file, the file size and path, etc.) against the current host files • Viruses were neutralized by writing the stored self-information over the infected files • Recovery was achieved by copying the same file from other uninfected hosts through the computer network
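The matching and recovery steps can be sketched on a single machine as follows. This is only an illustration of the idea, not Okamoto & Ishida's system: the 16-byte head length is an assumption, and a second local file stands in for the copy held by another uninfected host.

```python
from pathlib import Path

HEAD_BYTES = 16  # assumed length of the recorded file head


def self_info(path):
    """Record the self-information of a file: path, size and first bytes."""
    p = Path(path)
    data = p.read_bytes()
    return {"path": str(p), "size": len(data), "head": data[:HEAD_BYTES]}


def is_infected(path, recorded):
    """A file is suspect when its current self-information has changed."""
    return self_info(path) != recorded


def recover(path, clean_copy):
    """Restore a file from a clean copy (standing in for another host)."""
    Path(path).write_bytes(Path(clean_copy).read_bytes())
```

A host would call `self_info` once while the file is known to be clean, then call `is_infected` periodically and `recover` when a mismatch is found.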
Virus Detection (3) • Other key works include: • A distributed self adaptive architecture for a computer virus immune system (Lamont, 200) • Uses a set of co-operating agents to detect non-self patterns
Immune System | Computational System
Pathogens (antigens) | Computer viruses
B-, T-cells and antibodies | Detectors
Proteins | Strings
Antibody/antigen binding | Pattern matching
Security • Somayaji et al. (1997) outlined mappings between IS and computer systems • A security system needs • Confidentiality • Integrity • Availability • Accountability • Correctness
IS to Security Systems
Immune System | Network Environment
Static Data
Self | Uncorrupted data
Non-self | Any change to self
Active Processes on Single Host
Cell | Active process in a computer
Multicellular organism | Computer running multiple processes
Population of organisms | Set of networked computers
Skin and innate immunity | Security mechanisms, like passwords, groups, file permissions, etc.
Adaptive immunity | Lymphocyte process able to query other processes to seek abnormal behaviors
Autoimmune response | False alarm
Self | Normal behavior
Non-self | Abnormal behavior
Network of Mutually Trusting Computers
Organ in an animal | Each computer in a network environment
Network Security • Hofmeyr & Forrest (1999, 2000): developing an artificial immune system that is distributed, robust, dynamic, diverse and adaptive, with applications to computer network security. • Kim & Bentley (2001). Hybrid approach of clonal selection and negative selection.
Forrest's Model [Figure: AIS for computer network security. (a) Architecture: an external host (ip: 20.20.15.7, port: 22) and an internal host (ip: 31.14.22.87, port: 2000) on a broadcast LAN, linked by the datapath triple (20.20.15.7, 31.14.22.87, ftp); each host holds a detector set, activation threshold, cytokine level and permutation mask. (b) Life cycle of a detector: randomly created, immature, mature and naive if there is no match during tolerization, activated if matches exceed the activation threshold, memory on co-stimulation; a match during tolerization, failing to exceed the activation threshold, or no co-stimulation leads to death.]
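The detector life cycle in panel (b) can be read as a small state machine. The sketch below is one interpretation of the figure labels, not Hofmeyr & Forrest's code: each call to the hypothetical `step` function decides the outcome at the end of one phase, and the default `activation_threshold` is an arbitrary value chosen for illustration.

```python
from enum import Enum


class State(Enum):
    IMMATURE = "immature"
    MATURE = "mature and naive"
    ACTIVATED = "activated"
    MEMORY = "memory"
    DEAD = "dead"


def step(state, *, matched_self=False, match_count=0,
         activation_threshold=3, costimulated=False):
    """Return the next state of a detector at the end of its current phase."""
    if state is State.IMMATURE:
        # Tolerization: any match against self kills the detector.
        return State.DEAD if matched_self else State.MATURE
    if state is State.MATURE:
        # A mature detector must accumulate enough matches to activate.
        if match_count > activation_threshold:
            return State.ACTIVATED
        return State.DEAD
    if state is State.ACTIVATED:
        # Activation must be confirmed by co-stimulation.
        return State.MEMORY if costimulated else State.DEAD
    return state  # memory and dead detectors are left unchanged here
```

A detector that survives tolerization, exceeds its activation threshold and receives co-stimulation therefore passes through immature, mature and activated before becoming a memory detector.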
Novelty Detection • Image Segmentation : McCoy & Devarajan (1997) • Detecting road contours in aerial images • Used a negative selection algorithm