Unsupervised Creation of Small World Networks for the Preservation of Digital Objects
Charles L. Cartledge, Michael L. Nelson
Old Dominion University, Department of Computer Science, Norfolk, Virginia
Order of Presentation
• Technology enablers
• Constraints
• Simple rules for complex behavior
• Simulation approach
• Simulation results
• Future work
JCDL Short Paper Presentation
Motivation
[Figure: timeline along a "Time" axis marked 1907, 2007, and 2107]
Technology Enablers
Cost data: http://www.archivebuilders.com/whitepapers/22011p.pdf
Constraints
“… Tomorrow we could see the National Library of Medicine abolished by Congress, Elsevier dismantled by a corporate raider, the Royal Society declared bankrupt, or the University of Michigan Press destroyed by a meteor. All are highly unlikely, but over a long period of time unlikely events will happen. …” (emphasis CLC)
W. Y. Arms, “Preservation of Scientific Serials: Three Current Examples,” Journal of Electronic Publishing, Dec. 1999
[Figure labels: 75; 12 – 101 yrs; 80; "Those that die, do so in avg. 23 yrs."; 5 – 60 yrs]
Expectancy data: http://www.cdc.gov/nchs/data/nvsr/nvsr57/nvsr57_14.pdf
http://www.lbl.gov/Science-Articles/Archive/ssc-and-future.html
http://www.dod.mil/brac/
http://www.hq.nasa.gov/office/pao/97budget/zbr.txt
Picture: Patricia W. and J Douglas Perry Library, Old Dominion University
http://www2.westminster-mo.edu/wc_users/homepages/staff/brownr/ClosedCollegeIndex.htm
Reynolds’s Rules for Flocking
• Collision Avoidance: avoid collisions with nearby flock mates
• Velocity Matching: attempt to match velocity with nearby flock mates
• Flock Centering: attempt to stay close to nearby flock mates
My interpretation
• Namespace collision avoidance
• Following others to available storage locations
• Deleting copies of one’s self to provide room for late arrivers
Images and rules: http://www.red3d.com/cwr/boids/
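The three interpreted rules can be sketched in a few lines of Python. This is only one possible reading, not the simulation's actual code: the data layout (a dict mapping host ids to the set of DO names stored there) and the function names `try_place_replica`, `hosts_to_follow`, and `make_room` are hypothetical, and the idea that a copy is deleted only when its DO holds more than a minimum number of replicas is an assumption.

```python
def try_place_replica(hosts, host_id, do_name, capacity):
    """Namespace collision avoidance: at most one copy of a DO per host.

    Returns True if the copy was stored, False if the name is already
    present on the host or the host is full.
    """
    held = hosts[host_id]
    if do_name in held or len(held) >= capacity:
        return False
    held.add(do_name)
    return True


def hosts_to_follow(hosts, neighbour_names):
    """Following others: hosts where a connected DO already stores a copy."""
    wanted = set(neighbour_names)
    return sorted(h for h, held in hosts.items() if held & wanted)


def make_room(hosts, host_id, replica_counts, min_replicas):
    """Deleting copies: drop one copy whose DO has more than min_replicas.

    Returns the name of the DO that gave up its copy, or None if no DO on
    this host can spare one (assumed policy, see lead-in).
    """
    for name in sorted(hosts[host_id]):
        if replica_counts[name] > min_replicas:
            hosts[host_id].remove(name)
            replica_counts[name] -= 1
            return name
    return None


# Hypothetical two-host example.
hosts = {"h1": {"A"}, "h2": set()}
counts = {"A": 1}
if try_place_replica(hosts, "h2", "A", capacity=2):
    counts["A"] += 1
print(hosts_to_follow(hosts, ["A"]))
print(make_room(hosts, "h1", counts, min_replicas=1))
```

Each function uses only what a DO could learn from the host it is visiting or from the DOs it is connected to, which matches the "locally gleaned data" constraint stated in the conclusions.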
Types of Graphs
(Each graph has 20 vertices and 40 edges.)
[Figure labels: Small World; Shorter; Still high]
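To make the 20-vertex, 40-edge comparison concrete, the sketch below builds a regular ring lattice of that size and a randomly rewired variant, then measures average path length (L) and clustering coefficient (CC) for both. It assumes the "Shorter" and "Still high" labels refer to L and CC respectively, and it uses Watts-Strogatz style rewiring as a stand-in; it is not the graph construction used in the presentation. All function names are illustrative.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Regular graph: each vertex links to its k nearest neighbours per side."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, k + 1):
            w = (v + d) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj


def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs (BFS per vertex)."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def clustering_coefficient(adj):
    """Mean over vertices of existing / possible links among neighbours."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


def rewire(adj, p, rng):
    """With probability p, move the far end of each edge to a random vertex."""
    new = {v: set(nbrs) for v, nbrs in adj.items()}
    n = len(new)
    for v in range(n):
        for w in sorted(adj[v]):
            if v < w and rng.random() < p:
                choices = [x for x in range(n) if x != v and x not in new[v]]
                if choices and w in new[v]:
                    x = rng.choice(choices)
                    new[v].discard(w)
                    new[w].discard(v)
                    new[v].add(x)
                    new[x].add(v)
    return new


regular = ring_lattice(20, 2)  # 20 vertices, 40 edges
rewired = rewire(regular, 0.2, random.Random(1))
for label, g in (("regular", regular), ("rewired", rewired)):
    print(label, edge_count(g), average_path_length(g), clustering_coefficient(g))
```

For the regular lattice the values can be checked by hand: every vertex has four neighbours with three links among them, so CC = 0.5, and L = 55/19, about 2.89. The rewired values depend on the random seed and the rewiring probability, both chosen arbitrarily here.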
Desirable Graph Properties
Unsupervised Small World Graph Creation
• gamma = 0.0
• alpha = 0.99
• gamma = 0.7
• alpha = 0.99
• 0.2 <= beta <= 0.66
• gamma < 0.6
CC (clustering coefficient) is shown as dark lines; L (average path length) is shown as light lines.
Phases/Activities
• Creation (Human or archivist activities)
• Wandering (Autonomous activities)
• Connecting (Autonomous activities)
• Flocking (Autonomous activities)
Creation
[Figure: any DO (digital object)]
Wandering
[Figure: four DOs labeled A, B, C, and D exchanging the query "Who are you connected to?" and replies such as "Connected to: A", "Connected to: B", "Connected to: B, C", and "Connected to: <Nil>"]
Connecting
[Figure: DOs A, B, C, and D with links labeled "Possible connection", "Connection established", and "Connection NOT established"]
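The wandering and connecting phases can be illustrated with a short sketch in which a newly created DO walks the existing graph by asking each DO it visits for its connection list, then decides locally whether to link to that DO. The function `wander_and_connect`, the single probability `p_connect`, the step limit, and the fallback link to the entry DO are all assumptions made for this illustration; the slides do not say how alpha, beta, and gamma enter the decision, so they are not modeled here.

```python
import random


def wander_and_connect(graph, newcomer, entry, p_connect, rng, max_steps=50):
    """Introduce `newcomer` at `entry` and let it wander using local data only.

    `graph` maps each DO name to the set of DO names it is connected to.
    Returns the set of DOs the newcomer ended up connected to.
    """
    graph[newcomer] = set()
    visited = set()
    current = entry
    for _ in range(max_steps):
        visited.add(current)
        # "Who are you connected to?" is the only question the wanderer asks.
        neighbours = sorted(graph[current] - {newcomer})
        # Possible connection: established with probability p_connect.
        if rng.random() < p_connect:
            graph[newcomer].add(current)
            graph[current].add(newcomer)
        unvisited = [name for name in neighbours if name not in visited]
        if not unvisited:
            break
        current = rng.choice(unvisited)
    if not graph[newcomer]:
        # Assumed fallback so the newcomer never ends up isolated.
        graph[newcomer].add(entry)
        graph[entry].add(newcomer)
    return graph[newcomer]


# Hypothetical three-DO chain A - B - C; D is the newcomer.
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
links = wander_and_connect(graph, "D", "A", p_connect=0.5, rng=random.Random(0))
print(sorted(links))
```

No global view of the graph is consulted at any point: the newcomer learns about B only from A's reply, and about C only from B's reply.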
Flocking
[Figure: DOs A, B, C, and D together with their replicas A’, A’’, C’, C’’, D’, and D’’]
Typical Simulation Parameters
• alpha = 0.5
• beta = 0.6
• gamma = 0.1
• Number of DOs = 1000
• Number of hosts = 1000
• Min number desired replicas = 3
• Max number desired replicas = 10
• Max number of replicas per host = 20
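As a convenience for reproducing runs, the listed values could be gathered into one configuration object, as sketched below. The class `SimulationParameters` and its field and method names are hypothetical. The range check on alpha, beta, and gamma assumes they are fractions between 0 and 1, which is consistent with the values shown in the slides but not stated there, and the capacity arithmetic assumes every replica occupies one slot on a host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationParameters:
    """Hypothetical container for the values listed on the slide."""

    alpha: float = 0.5
    beta: float = 0.6
    gamma: float = 0.1
    num_dos: int = 1000
    num_hosts: int = 1000
    min_replicas: int = 3
    max_replicas: int = 10
    max_replicas_per_host: int = 20

    def validate(self) -> None:
        # Assumption: alpha, beta and gamma are fractions in [0, 1].
        for name in ("alpha", "beta", "gamma"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")
        if not 1 <= self.min_replicas <= self.max_replicas:
            raise ValueError("need 1 <= min_replicas <= max_replicas")
        if min(self.num_dos, self.num_hosts, self.max_replicas_per_host) < 1:
            raise ValueError("counts must be positive")

    def total_capacity(self) -> int:
        """Replica slots available across all hosts."""
        return self.num_hosts * self.max_replicas_per_host

    def replica_demand(self) -> tuple[int, int]:
        """(minimum, maximum) number of replicas the DOs want in total."""
        return (self.num_dos * self.min_replicas,
                self.num_dos * self.max_replicas)


params = SimulationParameters()
params.validate()
print(params.total_capacity(), params.replica_demand())
```

With the typical values, the hosts offer 20,000 replica slots while the DOs want between 3,000 and 10,000 replicas in total, so under these assumptions storage is not the limiting resource.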
Simulation Results and Analysis
Future work
• Test the autonomous graphs for resilience to error and attack
• Test what happens when a graph becomes disconnected
• Test what happens when a disconnected graph becomes re-connected
Conclusions
• We have shown that Digital Objects can autonomously create small world graphs based on locally gleaned data
• These graphs can be used for long-term preservation
• We intend to study these graphs, focusing on their tolerance to isolated and widespread failures
And that concludes my presentation.
Backup Information
• Equations for Average Path Length and Clustering Coefficients
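The equations themselves are not preserved in this transcript. For reference only, the definitions commonly used in the small-world literature (following Watts and Strogatz) are given below. The notation is assumed and may differ from the authors': n is the number of vertices, d(v_i, v_j) the shortest-path distance between two vertices, k_i the degree of vertex v_i, and e_i the number of edges among the neighbours of v_i.

```latex
% Average path length: mean shortest-path distance over all ordered vertex pairs
L = \frac{1}{n(n-1)} \sum_{i \neq j} d(v_i, v_j)

% Local clustering coefficient of vertex v_i (taken as 0 when k_i < 2)
C_i = \frac{2\, e_i}{k_i (k_i - 1)}

% Clustering coefficient of the graph: mean of the local coefficients
CC = \frac{1}{n} \sum_{i=1}^{n} C_i
```

A small world graph in this sense combines a small L, close to that of a random graph, with a CC that stays close to that of a regular lattice.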