From swarming to collaborative filtering. http://www.csml.ucl.ac.uk/images/Netflix_Prize.jpg
Informatics: a possible parsing • Computer Science STOP! ;-)
Let’s Observe Nature! What do you see? [Figure: Psilophyta/Psilotum, branch segments labeled a and b] • Plants typically branch out • How can we model that? • Observe the distinct parts • Color them • Assign symbols • Build a model • Initial state: b • b -> a • a -> ab • Doesn’t quite work!
Complex systems approach: looking at nature • A complex system is any system featuring a large number of interacting components (agents, processes, etc.) whose aggregate activity is nonlinear • not derivable from the summation of the activity of individual components • Network identity: Components form aggregate structures or functions that require more explanatory devices than those used to explain the components • Genetic networks, Immune networks, Neural networks, Social insect colonies, Social networks, Distributed Knowledge Systems, Ecological networks • Bottom-up Methodology • Collections of simple units interacting to form a more complex whole • Study of Simple Rules that Produce Complex Behavior • Discovery of Global Patterns of behavior
What about our plant? [Figure: Psilophyta/Psilotum, branch segments labeled a and b] • An accurate model requires • Varying angles • Varying stem lengths • Randomness • The Fibonacci model is similar • Sneezewort
Fibonacci Numbers! • Rewriting production rules • Initial State: A • A -> B • B -> AB • n=0 : A • n=1 : B • n=2 : AB • n=3 : BAB • n=4 : ABBAB • n=5 : BABABBAB • n=6 : ABBABBABABBAB • n=7 : BABABBABABBABBABABBAB • The lengths of the successive strings form the Fibonacci sequence • 1 1 2 3 5 8 13 21 34 55 89 ... • Fibonacci numbers in Nature • Livio (2003) The Golden Ratio: The Story of PHI, the World's Most Astonishing Number
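The rewriting rules on this slide can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function names `rewrite` and `generations` are made up for illustration, and every symbol is assumed to be rewritten in parallel at each step.

```python
def rewrite(state: str, rules: dict) -> str:
    """Apply the production rules to every symbol of the string at once."""
    return "".join(rules.get(symbol, symbol) for symbol in state)


def generations(axiom: str, rules: dict, n: int) -> list:
    """Return the strings for generations 0..n, starting from the axiom."""
    strings = [axiom]
    for _ in range(n):
        strings.append(rewrite(strings[-1], rules))
    return strings


# Rules from the slide: A -> B, B -> AB, initial state A.
RULES = {"A": "B", "B": "AB"}
strings = generations("A", RULES, 7)
lengths = [len(s) for s in strings]

for n, s in enumerate(strings):
    print(f"n={n} : {s} (length {len(s)})")
```

Swapping in the earlier plant rules (initial state b, b -> a, a -> ab) only requires a different `RULES` dictionary and axiom.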
Another example: flocking in nature • Flocking occurs when large groups of animals of the same species form aggregates that behave like a coherent, single entity • Herds, flocks, schools, swarms, humans • Properties: • Collective flight, migration, foraging, “drafting” • Coherence: aggregate has its own distinguishable system behavior and form • Adaptive: behavior of aggregate responds and adapts to external events (predators) • Coordination: behavior of individuals seems to be indicative of central control or symbolic/long-range communication, but isn’t
How to model flocking behavior? • Describing properties of aggregate behavior will only go so far: • Study shapes of aggregate • Situations in which it occurs • Dynamics, features of behavior • Biologists fixing radios? • Lessons from complex systems: • Complex systems behavior: not derivable from the summation of the activity of individual components • Network identity: Components form aggregate structures or functions that require more explanatory devices than those used to explain the components ~ emergence • Bottom-up Methodology: • Collections of simple units interacting to form a more complex whole • Study of Simple Rules that Produce Complex Behavior Parrish (2002) – Self-organized fish schools
Models of flocking behavior • Boids: Craig Reynolds “Flocks, Herds and Schools”, SIGGRAPH 21(4), 1987 • Visual model of bird flocks • Lack of centralized control • Lack of symbolic communication • General approach: Local computation, i.e. each individual follows three rules: • Collision avoidance: steer away from impact • Speed matching: match speed of neighboring birds • Flock centering: steer towards perceived flock center • Flock behavior emerges from the interactions of large groups of individuals construed this way
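The three local rules can be sketched in a few lines. This is an illustrative toy, not Reynolds' implementation: the function name `step`, the weights, the neighborhood radius and the speed cap are all assumptions chosen for readability, and "speed matching" is interpreted here as matching the neighbors' average velocity (speed and heading).

```python
import math
import random


def step(boids, radius=5.0, min_dist=1.0, w_avoid=0.05, w_match=0.05,
         w_center=0.01, max_speed=2.0):
    """Advance every boid one tick. Each boid is a tuple (x, y, vx, vy).

    All boids read the old state, so the update is synchronous and uses only
    local information (neighbors within `radius`).
    """
    new_boids = []
    for i, (x, y, vx, vy) in enumerate(boids):
        neighbors = [b for j, b in enumerate(boids)
                     if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        ax = ay = 0.0
        if neighbors:
            n = len(neighbors)
            # Flock centering: steer towards the perceived (local) center.
            ax += w_center * (sum(b[0] for b in neighbors) / n - x)
            ay += w_center * (sum(b[1] for b in neighbors) / n - y)
            # Speed matching: move towards the neighbors' average velocity.
            ax += w_match * (sum(b[2] for b in neighbors) / n - vx)
            ay += w_match * (sum(b[3] for b in neighbors) / n - vy)
            # Collision avoidance: steer away from neighbors that are too close.
            for bx, by, _, _ in neighbors:
                if math.hypot(bx - x, by - y) < min_dist:
                    ax += w_avoid * (x - bx)
                    ay += w_avoid * (y - by)
        vx, vy = vx + ax, vy + ay
        speed = math.hypot(vx, vy)
        if speed > max_speed:  # cap the speed so the flock stays stable
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        new_boids.append((x + vx, y + vy, vx, vy))
    return new_boids


# Hypothetical demo: 30 boids with random positions and velocities.
rng = random.Random(1)
flock = [(rng.uniform(0, 10), rng.uniform(0, 10),
          rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(30)]
for _ in range(100):
    flock = step(flock)
print(flock[0])
```

Note that no boid knows the global shape of the flock: each one only sees neighbors within `radius`, which is the "lack of centralized control" on the slide.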
Ant trails: emergent organization driven by communication • Problem: optimize location and extraction of food source • Lack of centralized control • Lack of symbolic communication • General modeling approach: • Local computation leads to higher order emergent computation • Walk algorithm probabilistic, but biased by pheromone concentration • Ants leave pheromone trail when food is found • Pheromone evaporates with time • Find shortest path • Note: • ~ greedy algorithm: hill-climbing on trail strength leads to adaptive, collective behavior • Approaches to address traveling salesman problem: BIOS group: S. Kauffman (Santa Fe), see also M. Dorigo (2006) Ant Colony Optimization, IEEE Computational Intelligence Magazine, for an overview
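How pheromone-biased walks plus evaporation end up favoring the shortest path can be illustrated with a two-path toy. This sketch is an assumption-laden illustration, not Dorigo's Ant Colony Optimization algorithm: the function name `double_bridge`, the path lengths, the evaporation rate and the deposit of 1/length per trip are all made up for the example.

```python
import random


def double_bridge(lengths=None, n_ants=20, steps=200, evaporation=0.1, seed=0):
    """Simulate ants choosing between alternative paths to a food source.

    Each ant picks a path with probability proportional to its pheromone level.
    A trip over a path deposits 1/length pheromone (assumption: shorter paths
    are reinforced more per unit time), and all pheromone evaporates a little
    at every step.
    """
    if lengths is None:
        lengths = {"short": 1.0, "long": 2.0}
    rng = random.Random(seed)
    paths = list(lengths)
    pheromone = {path: 1.0 for path in paths}  # start with no preference
    for _ in range(steps):
        weights = [pheromone[p] for p in paths]
        choices = rng.choices(paths, weights=weights, k=n_ants)
        for p in paths:
            pheromone[p] *= 1.0 - evaporation   # pheromone evaporates with time
        for p in choices:
            pheromone[p] += 1.0 / lengths[p]    # ants leave a trail
    return pheromone


trail = double_bridge()
print(trail)
```

No ant compares the two paths; the preference for the shorter one emerges from the positive feedback between trail strength and path choice.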
Abstracted: Stigmergy • Stigma + ergon: sign + work • Indirect communication between various agents through environment, traces in environment • Lack of centralized control • Environment provides substrate for • communication, • information storage • Constrains individual agents • Emergence of complex, collective and goal-directed behavior • Observed in social insects: termites, ants, bees • Increasingly applied to social phenomena and technology/engineering • See: • Heylighen F. (2007). Why is Open Access Development so Successful? Stigmergic organization and the economics of information, in: B. Lutterbeck, M. Baerwolff & R. A. Gehring (eds.), Open Source Jahrbuch 2007, Lehmanns Media, 2007, p. 165-180. • http://www.mitpressjournals.org/doi/abs/10.1162/106454699568692
Probabilistic cleaning: ants • Very simple rules for colony clean up • Pick dead ant. If a dead ant is found, pick it up (with probability inversely proportional to the quantity of dead ants in vicinity) and wander. • Drop dead ant. If dead ants are found, drop the carried ant (with probability proportional to the quantity of dead ants in vicinity) and wander. See also: J. L. Deneubourg, S. Goss, N. Franks, A. Sendova-Franks, C. Detrain, L. Chretien. “The Dynamics of Collective Sorting: Robot-Like Ants and Ant-Like Robots”. From Animals to Animats: Proc. of the 1st Int. Conf. on Simulation of Adaptive Behaviour. 356-363 (1990). Figure by Marco Dorigo in “Real ants inspire ant algorithms”
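The two rules can be tried out on a small ring of cells. This is a hedged sketch rather than the model from Deneubourg et al.: the probability formulas 1/(1+n) and n/max, the neighborhood `reach`, and all names (`pick_probability`, `drop_probability`, `simulate`) are simplifying assumptions.

```python
import random


def pick_probability(n_nearby):
    """Picking up is less likely when many dead ants are nearby (assumed form)."""
    return 1.0 / (1.0 + n_nearby)


def drop_probability(n_nearby, max_nearby):
    """Dropping is more likely when many dead ants are nearby (assumed form)."""
    return n_nearby / max_nearby


def nearby(grid, pos, reach):
    """Count dead ants within `reach` cells of `pos` on a ring, excluding pos."""
    size = len(grid)
    return sum(grid[(pos + d) % size] for d in range(-reach, reach + 1) if d != 0)


def simulate(size=60, n_items=20, n_ants=5, steps=20000, reach=2, seed=0):
    rng = random.Random(seed)
    grid = [0] * size                      # 1 = dead ant lying in this cell
    for cell in rng.sample(range(size), n_items):
        grid[cell] = 1
    ants = [{"pos": rng.randrange(size), "carrying": False} for _ in range(n_ants)]
    for _ in range(steps):
        for ant in ants:
            ant["pos"] = (ant["pos"] + rng.choice((-1, 1))) % size  # wander
            here = ant["pos"]
            n = nearby(grid, here, reach)
            if not ant["carrying"] and grid[here] == 1:
                if rng.random() < pick_probability(n):
                    grid[here], ant["carrying"] = 0, True
            elif ant["carrying"] and grid[here] == 0:
                if rng.random() < drop_probability(n, 2 * reach):
                    grid[here], ant["carrying"] = 1, False
    return grid, ants


grid, ants = simulate()
print("".join("#" if cell else "." for cell in grid))
```

Printing the ring before and after a run is a quick way to see whether the dead ants end up in fewer, larger piles.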
Ant-inspired robots • Rules (Beckers et al., 1994) • Move: with no sensor activated, move in a straight line • Obstacle avoidance: if an obstacle is found, turn by a random angle to avoid it and move • Pick up and drop: robots can pick up a number of objects (up to 3) • If the shovel contains 3 or more objects, the sensor is activated and the objects are dropped. The robot backs up, chooses a new angle and moves. • Results in clustering • The probability of dropping items increases with the quantity of items in the vicinity Figure from R. Beckers, O. E. Holland, and J. L. Deneubourg [1994]. “From local actions to global tasks: Stigmergy and collective robotics”. In Artificial Life IV.
Luc Steels et al: ant algorithms http://www.youtube.com/watch?v=93LwvuxDbfU
Adaptive information systems • Eric Bonabeau: Swarm Smarts. Scientific American, March 2000, 78. • Johan Bollen (1994): adaptive hypertext systems
Recommender systems: general principles • Example: items [Shameboy, Plastic Operator, Figurine,…], Buyer 1 [1, 1, 0, 0, 0,…], Buyer 2 [1, 0, 0, 0, 0,…] • People ~ n-dimensional vectors • Person = { CD/book purchases, DVDs rented, …} • Vector is a representation of the consumer. Entries can be weighted (TF-IDF etc.) • “Vector Space Model” • Calculate similarity of users: • Correlation of user vectors • Cosine similarity • Group consumers according to similarity: clustering • Similar users: discrepancies in vectors are recommendations • Used for all sorts of applications • Similar problem to “bag of words” • Multiple user personalities? • Orthogonality? • Same = better?? [Figure: angle between user vectors = consumer similarity]
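The vector space idea can be made concrete with cosine similarity. The sketch below is illustrative only: the vectors are truncated to the three items named on the slide, `buyer_3` is a hypothetical extra user, and the scoring scheme in `recommend` (sum of similarities of users who own an item) is one simple choice among many.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two user vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(target, others, items):
    """Rank items the target lacks by the similarity of the users who own them."""
    scores = {}
    for other in others:
        sim = cosine(target, other)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, mine, theirs in zip(items, target, other):
            if theirs and not mine:  # a discrepancy is a candidate recommendation
                scores[item] = scores.get(item, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)


items = ["Shameboy", "Plastic Operator", "Figurine"]
buyer_1 = [1, 1, 0]
buyer_2 = [1, 0, 0]
buyer_3 = [0, 0, 1]  # hypothetical user, not on the slide

print(cosine(buyer_1, buyer_2))
print(recommend(buyer_2, [buyer_1, buyer_3], items))
```

Because Buyer 2 shares a purchase with Buyer 1 but none with the hypothetical third buyer, only Buyer 1's extra item is suggested. This also illustrates the "Orthogonality?" question: users with orthogonal vectors contribute nothing.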
Tracking scientists (they are people too!) http://informatics.indiana.edu/jbollen/PLosONEmap André Skupin Borner/Ketan (2004) PNAS 101(1) Highly recommended: http://www.scimaps.org/ Bollen J, Van de Sompel H, Hagberg A, Bettencourt L, Chute R, et al. 2009 Clickstream Data Yields High-Resolution Maps of Science. PLoS ONE 4(3): e4803. doi:10.1371/journal.pone.0004803
We’re all ants now? • User vectors: • Represent individual trail/exploration in n-dimensional information space • Recommender systems: • bias probabilistic exploration paths of users based on others’ actions • Higher probability of following existing trails • Analogy: • Set of user vectors + recommender system ~ ant trails • Solving traveling salesman in n dimensions? ;-) • Modeling fads, hypes, flashcrowds in cyberspace, self-fulfilling prophecies, but also long tail effects, more optimized exploration of information space? • Which features of recommender systems promote either of the above? • Cf. youtube.com: “other users are watching” vs. batch-processed recommendations • Emergence of COMPUTATIONAL SOCIAL SCIENCE • http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2745217/ • Lazer et al (2009). Life in the network – the coming age of computational social science. Science. 2009 February 6; 323(5915): 721–723. [Figure: documents, recommender, interface]
Next week readings • Gouth (2009) Training for Peer Review. Science Signaling 2(85), tr2. [DOI: 10.1126/scisignal.285tr2] • Monastersky (2005) The number that is devouring science. Chronicle of Higher Education, Section: Research & Publishing, Volume 52, Issue 8, Page A12 • Eysenbach G (2006) Citation Advantage of Open Access Articles. PLoS Biol 4(5): e157. doi:10.1371/journal.pbio.0040157 • Lance Fortnow (2009) Time for Computer Science to Grow Up. Communications of the ACM, August, 52(8). doi:10.1145/1536616.1536631
Your proposal assignment From the syllabus: “You have a dream. Write it down in the form of a proposal for the NSF Graduate Research Fellowship Program (CISE field of study). This proposal accounts for 20% of your final grade. 10% for the final presentation in class.” What is the NSF Graduate Research Fellowship Program ? “The program recognizes and supports outstanding graduate students who are pursuing research-based master's and doctoral degrees in fields within NSF's mission. The GRFP provides three years of support for the graduate education of individuals who have demonstrated their potential for significant achievements in science and engineering research. The ranks of NSF Fellows include individuals who have made transformative breakthroughs in science and engineering research and have become leaders in their chosen careers and Nobel laureates.” http://www.nsf.gov/pubs/2010/nsf10604/nsf10604.htm
How does it work? • NSF wants: • Personal Profile, • Education and Work Experience, • Planned Graduate Program, • ** Personal Statement (2p) • ** Previous Research Experience (2p) • ** Proposed Plan of Research and References (2p) • I want: • The items marked with **, i.e. a total of 6 pages + references = 20% of grade • A 15-minute in-class presentation of your work (December 1 and December 8) = 10% of grade
Formatting • ** Personal Statement (2p) • ** Previous Research Experience (2p) • ** Proposed Plan of Research and References (2p) • Maximum length of two pages, including all references, citations, charts, figures, and images. • Standard 8.5" x 11" page size, 12-point, Times New Roman font, 1" margins on all sides, and must be single spaced or greater • No hyperlinks, only citations in References Cited section. Images may be included in the page limits.
Personal statement • Important questions to ask yourself before starting the essay: • Why are you fascinated by your research area? • What examples of leadership skills and unique characteristics do you bring to your chosen field? • What personal and individual strengths do you have that make you a qualified applicant? • How will receiving the fellowship contribute to your career goals? • How do these activities address the Intellectual Merit and Broader Impacts criteria? • Example: • http://www.mitbrandon.com/nsfstatement.shtml
Previous Research Experience • Important questions to ask yourself before starting the essay: • What are all of your applicable experiences? • For each experience, what were the key questions, methodology, findings, and conclusions? • Did you work in a team and/or independently? • How did you assist in the analysis of results? • How did your activities address the Intellectual Merit and Broader Impacts criteria? • Example: • http://rachelcsmith.com/NSF/DisturbancePRE.pdf
Proposed Plan of Research Review criteria: http://www.nsfgrfp.org/how_to_apply/review_criteria http://www.nsf.gov/pubs/2002/nsf022/bicexamples.pdf Intellectual merit: How important is the proposed activity to advancing knowledge and understanding within its own field or across different fields? How well qualified is the proposer (individual or team) to conduct the project? (If appropriate, the reviewer will comment on the quality of prior work.) To what extent does the proposed activity suggest and explore creative, original, or potentially transformative concepts? How well conceived and organized is the proposed activity? Is there sufficient access to resources?
Proposed Plan of Research Review criteria: http://www.nsfgrfp.org/how_to_apply/review_criteria http://www.nsf.gov/pubs/2002/nsf022/bicexamples.pdf Broader Impacts – Activities and projects that: How well does the activity advance discovery and understanding while promoting teaching, training, and learning? How well does the proposed activity broaden the participation of underrepresented groups (e.g., gender, ethnicity, disability, geographic, etc.)? To what extent will it enhance the infrastructure for research and education, such as facilities, instrumentation, networks, and partnerships? Will the results be disseminated broadly to enhance scientific and technological understanding? What may be the benefits of the proposed activity to society?
General words of wisdom • You need to mind the formal criteria, checklists, etc., but what really matters: • Don’t focus so much on the task or burden of writing a proposal, but on the pleasure of outlining an interesting, relevant and successful research agenda. • You are asking for support ($$$). Someone will make a decision to support your research. They need to see a compelling reason to do so. Your essay must make a good scientific and societal case for why one should invest in your idea and professional development. • Make clear that you are qualified and well-positioned to execute what you propose. • Start with the big issues. The limit is 2 pages, so try to be as succinct and to the point as you can. Make it work. Focus on the why, then on the how. • Quality of exposition matters. Don’t annoy reviewers with jargon, crummy grammar, or overly long sentences. • Be mindful of your audience. Your reviewers will be experts, but not to the degree that you may be.
About assignment 2 • Some changes you want to be mindful of: • NEW deadline = December 1st, 4PM (16:00) • Submission: two types • Partial: regular submission of Assignment 2 as planned. Graded on 25 point scale. Grade for assignment 1 is maintained. • Full: submission of 1 single assignment that comprises and integrates both assignments 1 and 2. Graded on 50 point scale. Expectation: SIGNIFICANT improvement in portion relevant to assignment 1. I will grade accordingly. Correct answer to algorithm is NOT sufficient.