Measurement
• Recap: emergence and self-organisation are associated with complexity
• Can we identify systems that are complex?
• Can we distinguish types or levels of complexity?
• Intuitively, complexity is related to the amount of information needed to describe a system
• Need to measure information
• entropy, information theoretic measures, …
Useful concepts: logarithms
• definition
• properties
• changing base
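For reference, the standard identities behind these bullets (not spelled out in the text here):
• definition: y = log_b x ⟺ b^y = x
• properties: log_b(xy) = log_b x + log_b y ; log_b(x^n) = n log_b x
• changing base: log_b x = log_a x / log_a b, e.g. log2 x = ln x / ln 2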
Useful concepts: probability (1)
• finite set X of size N, with elements xi
• also covers infinite sets (e.g. integers), but gets trickier
• p(x) = probability that an element drawn at random from X will have value x
• coin: X = { H, T }; N = 2; p(H) = p(T) = 1/2
• die: X = { 1, 2, 3, 4, 5, 6 }; N = 6; p(n) = 1/6
• uniform probability: p(x) = 1 / N
• constraints on the function p : 0 ≤ p(x) ≤ 1 ; Σx p(x) = 1
• average value of a function f(x) : ⟨f⟩ = Σx p(x) f(x)
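A minimal Matlab check of the average-value formula for a fair die (an added illustration; the variable names are ours):
    X = 1:6;                 % the six die faces
    p = ones(1,6) / 6;       % uniform probability, p(x) = 1/N
    avg = dot(p, X)          % average value of f(x) = x : 3.5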
Useful concepts: probability (2)
• example: sum of the values of 2 dice:
• X = { 2, …, 12 }; N = 11
• table of sums (rows: first die, columns: second die):
      1   2   3   4   5   6
  1   2   3   4   5   6   7
  2   3   4   5   6   7   8
  3   4   5   6   7   8   9
  4   5   6   7   8   9  10
  5   6   7   8   9  10  11
  6   7   8   9  10  11  12
• the probabilities of the sums are not uniform: e.g. p(2) = 1/36, p(7) = 6/36
Information: Shannon entropy H
• Shannon was concerned with extracting information from noisy channels
• Shannon's entropy measure relates to the number of bits needed to encode elements of X :
  H = − Σx p(x) log2 p(x)
• For a system with uniform (unbiased) probabilities, this evaluates to H = log2 N
• Entropy trivially increases with the number of possible states
• H = the amount of "information"
• Matlab code, where p is a Matlab vector of the probabilities:
  H = - dot(p,log2(p));
C. E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal 27, 1948
Shannon entropy : examples
• coin: N = 2, H = log2 2 = 1
• you can express the information conveyed by tossing a coin in one bit
• die: N = 6, H = log2 6 ≈ 2.585 : less than three bits
• note that H does not have to be an integer number of bits
• 2 dice: N = 11
• for uniform probabilities : log2 11 ≈ 3.459
• can devise an encoding ("compression") that uses fewer bits for the more likely occurrences, more bits for the unlikely ones: H ≈ 3.274
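A short Matlab sketch reproducing the two-dice figure above, using the dot-product form from the previous slide (the loop and variable names are illustrative):
    counts = zeros(1,11);                % sums 2..12
    for a = 1:6
      for b = 1:6
        counts(a+b-1) = counts(a+b-1) + 1;
      end
    end
    p = counts / sum(counts);            % non-uniform probabilities of the 11 sums
    H = -dot(p, log2(p))                 % approx 3.274 bits, below log2(11) = 3.459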
Choosing how to measure entropy
• Any probability distribution can be used to calculate entropy
• Some give more meaningful answers than others
• Consider the entropy of CAs
• Look first at the entropy of single sites, then the average entropy over sites, and then at tiles
CA states : average (1)
• entropy of a "tile" of N sites, k states per site
• entropy of single site i : Hi = − Σs pi(s) log2 pi(s)
• pi(s) = probability that site i is in state s
• use the average over all N sites, H = (1/N) Σi Hi, if tiles of different sizes are to be compared
• e.g. random: H = 1 (for k = 2), independent of N
• this average entropy is measured in "bits per site"
CA states : average (2)
• (a) random : each site randomly on or off
• pi(s) = ½ ; Hi = 1 ; H = 1 : 1 bit per site
• (b) semi-random : upper sites oscillating, lower sites random
• pi(s) = ½ ; Hi = 1 ; H = 1 : 1 bit per site
• (c) oscillating
• pi(s) = ½ ; Hi = 1 ; H = 1 : 1 bit per site
• so the average site entropy is not a measure of any structure: all three cases score the same
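A hedged Matlab sketch of the average single-site entropy for case (a); the `states` history matrix here is random stand-in data, not output from a real CA:
    states = randi([0 1], 1000, 64);     % stand-in history: 1000 timesteps x 64 sites
    p1 = mean(states, 1);                % estimate pi(on) for each site
    p = [p1; 1 - p1];                    % per-site distribution over the k = 2 states
    p(p == 0) = 1;                       % so zero-probability states contribute 0
    Hsite = -sum(p .* log2(p), 1);       % entropy Hi of each site
    Havg = mean(Hsite)                   % approx 1 bit per site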
CA states : tile (1)
• whole tile entropy : H = − Σs p(s) log2 p(s)
• k states per site, k^N states per tile
• p(s) = probability that the whole tile is in state s
• divide by N to get entropy in "bits per site"
• (a) random : each site randomly on or off
• p(s) = 1/16 ; H = 1 : 1 bit per site, or 4 bits for the tile (here N = 4)
N. H. Packard, S. Wolfram. Two-Dimensional Cellular Automata. J. Statistical Physics 38:901-946, 1985
CA states : tile (2)
• (b) semi-random : upper sites oscillating, lower sites random
• 8 states never occur, p = 0
• the other 8 states are equally likely, p = ⅛
• ¾ bit per site, or 3 bits for the tile
• (c) oscillating
• only 2 states occur, with p = ½
• ¼ bit per site, or 1 bit for the tile
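A minimal Matlab sketch of the tile calculation for case (b), with the 16 tile-state probabilities written in by hand (assuming the same 4-site tile as in case (a)):
    p = [zeros(1,8) ones(1,8)/8];   % 8 tile states never occur, 8 are equally likely
    p = p(p > 0);                   % zero-probability states contribute nothing
    N = 4;                          % sites per tile
    H = -dot(p, log2(p)) / N        % 0.75 bit per site (3 bits for the whole tile)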
CA entropy on rules (1)
• N = number of neighbourhoods (entries in the rule table)
• pi(t) = proportion of times rule table entry i is accessed at timestep t
• define the rule lookup entropy at timestep t to be Ht = − Σi pi(t) log2 pi(t)
• initial random state, equal probabilities, maximum entropy
• long term behaviour depends on CA class
A. Wuensche. Classifying cellular automata automatically. Complexity 4(3):47-66, 1999
CA entropy on rules (2)
• ordered : only a few rules accessed : low entropy
• chaotic : all rules accessed uniformly : high entropy
• complex : non-uniform access : medium entropy ; fluctuations : high entropy variance
• use this to detect complex rules automatically
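A hedged Matlab sketch of this detector; the `lookups` count matrix is a hypothetical stand-in for data gathered while running the CA:
    lookups = randi([1 20], 500, 8);     % stand-in: 500 timesteps x 8 rule-table entries
    p = lookups ./ sum(lookups, 2);      % pi(t): proportion of lookups of entry i at step t
    Ht = -sum(p .* log2(p), 2);          % rule lookup entropy at each timestep
    score = var(Ht)                      % complex rules: medium mean(Ht), high variance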
key properties of entropy H
• H = 0 ⟺ one of the p(xi) = 1
• if one value is "certain" (and hence all the other p(xj) are zero)
• this is the minimum value H can take
• can add zero probability states without changing H
• if p is uniform, then H = log2 N
• if all values are equally likely, this is the maximum value H can take
• uniform p ⇒ "random" ⇒ maximum "information"
• so if p is not uniform, H < log2 N
• H is extensive (additive) for independent joint events [next lecture]
• H is the only function (up to a constant) that is maximised by uniform p, extensive, and has fixed bounds
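A small Matlab check of the extensive (additive) property for independent joint events (an added illustration with arbitrary distributions):
    H = @(v) -dot(v(v > 0), log2(v(v > 0)));   % entropy, ignoring zero entries
    p = [0.5 0.25 0.25];
    q = [0.7 0.3];
    joint = p' * q;                            % joint distribution of two independent variables
    [H(joint(:)), H(p) + H(q)]                 % the two values agree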
designing an entropy measure
• design a measure p(x) analogous to a probability
• mathematical properties : non-negative values ; sum to one
• entropic properties :
• p uniform when maximally disordered / random / chaotic
• one p close to one, the others all close to zero, when maximally ordered / uniform
• use this p to define an associated entropy (see the sketch below)
• validate : check it "makes sense" in several scenarios
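One way to package this recipe as a reusable Matlab function (a sketch; the name and interface are illustrative, saved as measure_entropy.m):
    function H = measure_entropy(w)
      % turn any non-negative "importance" vector w into an entropy value
      p = w(:) / sum(w);     % normalise so the values sum to one
      p = p(p > 0);          % zero entries contribute nothing
      H = -dot(p, log2(p));
    end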
Example: Entropy of a flock
• N particles in 2D, indexed by i
• at time t, positions and velocities relative to the flock mean position and velocity
• form a matrix F of these coordinates over T timesteps
• readily extendable to 3 (or more!) dimensions
W. A. Wright, R. E. Smith, M. Danek, P. Greenway. A Generalisable Measure of Self-Organisation and Emergence. ICANN 2001
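A hedged Matlab sketch of building F; the slide does not fix a layout, so the array shapes (one row per timestep) and the random stand-in data are assumptions:
    T = 200; N = 50;                              % hypothetical flock: 50 boids, 200 timesteps
    pos = randn(T, N, 2); vel = randn(T, N, 2);   % stand-in position and velocity data
    relpos = pos - mean(pos, 2);                  % relative to the flock mean position
    relvel = vel - mean(vel, 2);                  % relative to the flock mean velocity
    F = [reshape(relpos, T, []), reshape(relvel, T, [])];   % T x 4N coordinate matrix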
aside: singular value decomposition (SVD)
• singular value decomposition "factorises" a matrix : F = U S Vᵀ
• S is a diagonal matrix; the vector S of singular values is constructed from its diagonal elements
• U and V are unitary matrices – "rotations", an orthogonal basis
• the singular values indicate the "importance" of a special set of orthogonal "directions" in the matrix F
• they are the semi-axes of the hyperellipsoid defined by F (the s1, s2, s3 of the figure)
• if all directions are of equal importance (a hypersphere), all the singular values are the same
• more important directions have larger singular values
• a generalisation of the "eigenvalues" of certain square matrices
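A quick Matlab illustration of "equal" versus "dominant" singular values, using two stand-in point clouds (not from the slides):
    A = randn(1000, 2);              % roughly isotropic cloud: comparable singular values
    B = randn(1000, 2) .* [5 1];     % cloud stretched along one direction
    svd(A)'                          % two values of similar size
    svd(B)'                          % first value roughly 5x the second: one dominant direction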
svd of a flock
• calculate the vector S of the singular values of F
• singular values are non-negative : si ≥ 0
• normalise them to sum to one : Σ si = 1
• use them as the "probability" in an entropy measure
• Matlab code:
  sigma = svd(F);
  sigma = sigma/sum(sigma);
  entropy = - dot(sigma,log2(sigma));
On entropy of flocks
• Many ways to analyse flock entropy : at boid level or at flock level ; over time, over space
• Possible questions:
• Can entropy identify formation of a flock?
• Can entropy distinguish forms of social behaviour?
• Several projects have explored these questions
• Identification of flock formation is hard : the entropy of free boids distorts the results
• Might be able to distinguish ordered behaviours from "random" swarms
P. Nash, MEng project, 2008: http://www.cs.york.ac.uk/library/proj_files/2008/4thyr/pjln100/pjln100_project.pdf
Is entropy always the right measure?
• Unpublished experiments on flock characteristics (YCCSA Summer School, 2010: Trever Pinto, Heather Lacy, Stephen Connor, Susan Stepney, Fiona Polack)
• statistical measures of spatial autocorrelation are at least as good as entropy, at least for identifying turning and changes in flocking characteristics
• e.g. the C measure related to Geary's C ratio
• where rpp is the minimum distance between two distinct boids (nearest neighbour distance)
• and rlp is the minimum distance from a random point to a given boid
• for random locations, these distances are equal, C = 1
• clustering is indicated by C > 1
• spatial autocorrelation can analyse clustering in any 2D space, e.g. clustering in position or clustering in velocity
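The slide does not give the exact formula for C, so the following Matlab sketch uses one plausible reading only: the ratio of the mean nearest-neighbour distance from random points to the mean nearest-neighbour distance between boids; the data are stand-ins:
    boids = randn(100, 2);                       % stand-in boid positions, clustered near the origin
    pts = rand(100, 2) * 8 - 4;                  % random sample points in the same region
    d = @(A,B) sqrt(sum((permute(A,[1 3 2]) - permute(B,[3 1 2])).^2, 3));   % pairwise distances
    dpp = d(boids, boids); dpp(1:size(dpp,1)+1:end) = inf;                   % ignore self-distances
    rpp = mean(min(dpp, [], 2));                 % mean nearest-neighbour distance, boid to boid
    rlp = mean(min(d(pts, boids), [], 2));       % mean distance from a random point to nearest boid
    C = rlp / rpp                                % about 1 for random boids, > 1 for clustered boids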
Entropy as fitness of interesting systems
• Can we evolve CA states for complex behaviour?
• fitness function uses tile entropy of 3×3 and 15×15 tiles
• low entropy = regular behaviour ; high entropy = chaotic behaviour
• want regions of regularity, regions of chaos, and "edge of chaos" regions
• i.e. maximum variation of entropy across the grid
• fitness = variance in the entropy Ht over the NT tiles (mean square deviation from the average)
D. Kazakov, M. Sweet. Evolving the Game of Life. Proc Fourth Symposium on Adaptive Agents and Multi-Agent Systems (AAMAS-4), Leeds, 2004
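A hedged Matlab sketch of such a fitness value; reading each tile's entropy as the entropy of its on/off cell distribution is an assumption, and the 3×3 tiling of a random stand-in grid is illustrative only:
    state = randi([0 1], 30, 30);                % stand-in binary CA state
    n = 3;                                       % tile size (the paper also uses 15x15)
    Ht = [];
    for r = 1:n:size(state,1)-n+1
      for c = 1:n:size(state,2)-n+1
        p = mean(reshape(state(r:r+n-1, c:c+n-1), [], 1));   % fraction of "on" cells in the tile
        p = [p, 1-p]; p = p(p > 0);
        Ht(end+1) = -dot(p, log2(p));            % entropy of this tile
      end
    end
    fitness = var(Ht)                            % large when ordered and chaotic regions coexist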