Trust and Epistemic Communities in Biodiversity Data Sharing. Nancy Van House, SIMS, UC Berkeley. www.sims.berkeley.edu/~vanhouse
Trust and Epistemic Communities in Biodiversity Data Sharing • DLs: ready access to unpublished information by a variety of users - crossing sociotechnical boundaries • Raises issues of trust and credibility • Knowledge is social • What we know, whom we believe is determined by/within epistemic cultures • Biodiversity data • Great variety of information, sources, purposes • CalFlora: an example of a user-oriented DL • Incorporating users’ practices of trust and credibility • Negotiating differences across epistemic cultures • Implications
DLs Facilitate Access • To greater variety of information: • Unpublished (unreviewed) information • “Raw” data such as reports of observations • Information from outside own reference group • Problems: • Which info, sources do we believe? • How do we evaluate info from unfamiliar sources? • Which info do we use for what purposes? • By people from outside own reference group • Inappropriate use of information? • Burden on data owner of making data available, usable, and understandable to reduce misuse
Examples of Risks – Botanical Information • Unreliable Info • Erroneous, duplicative observations >> belief that a species is prevalent >> not preserving a population of a rare species • Chasing after an erroneously reported sighting of a rare species – or discounting a significant sighting as an amateur’s error • Inappropriate Use of Info • Private landowners destroying specimens of a rare plant to avoid legal limits on land development • Collectors (over-)collecting specimens of rare species
Knowledge is Social • What we know comes primarily from others. • Cognitive efficiency: we don’t have time, resources • Expertise: we don’t have sufficient knowledge in all areas • Have to decide whom we trust, what we believe. • What we consider “good” work, whom we believe, and how we decide are determined and learned in epistemic communities • DLs need to support the diverse practices of epistemic communities
Social Nature of Knowledge is of Concern in Many Areas • Science studies • Inquires into the construction of scientific knowledge & authority • Social epistemology • Asks: How should the collective pursuit of knowledge be organized? • Situated action/learning • Posits knowledge, action, identity, and community to be mutually constituted • Knowledge management • Is concerned with how to share knowledge
Cognitive Trust and DLs • For people to use a DL: • Information must be credible • Sources must be trustworthy • DL itself must be perceived to be trustworthy • How can DLs be designed to: • Facilitate users’ assessments of trust and credibility of info and sources? • Demonstrate their own trustworthiness?
Epistemic Cultures • “…those amalgams of arrangements and mechanisms … which, in a given field, make up how we know what we know.” • “Epistemic cultures…create and warrant knowledge, and the premier knowledge institution throughout the world is, still, science.” Karen Knorr-Cetina, Epistemic Cultures
Culture • Context of history and on-going events • Practice: how people actually do their day-to-day work • Artifacts • Info artifacts include documents, images, thesauri, classification systems • Diversity • If all the same, no culture • Including diversity across areas of science
Epistemic Cultures Differ • Practices of work • Practices of trust • Artifacts – e.g. genres • Methods of data collection and analysis • Meanings, interpretations, understandings • Tacit knowledge and understandings • Values • Methods, standards, and information for evaluating other participants’ work and values • Institutional arrangements
Communities and Knowledge • Becoming a member of a community of practice = identity • learning practices, values, orientation to the world • We learn what to believe, whom to believe, how to decide in epistemic communities. • We tend to trust people from within our own epistemic communities. • Similar values, orientation, practices, standards • Ability to assess their credibility
DLs and Epistemic Cultures • DLs enable information to cross epistemic communities. • More easily, more often than before. • Raw data, not just syntheses, analyses – e.g. publications • Crossing communities often undermines our practices of trust. • Who are these people? • How did they collect the data? • What do they know? • What are their goals, values, priorities? • DLs need to be designed to support practices of assessing trustworthiness and credibility.
Biodiversity Data • Biodiversity: studies diversity of life and ecosystems that maintain it • Central question: change over space and time • Uses large quantities of data that vary in: • Precision and accuracy • Methods of data collection, description, storage • Old data particularly valuable • Broad range of datasets: biological, geographical, meteorological, geological… • Created and used by different professions, disciplines, types of institutions…for different purposes • Politically, economically sensitive data
“Citizen Science” • Fine-grained data from observers in the field • Observers with varying levels and types of expertise • E.g., expert on an area, habitat, taxon… • Expert amateurs • Private-public cooperation • Government agencies, environmental action groups, university herbaria, membership organizations, concerned individuals…
CalFlora http://www.calflora.org • Comprehensive web-accessible database of plant distribution information for California • Independent non-profit organization • Designed/managed by people from botanical community, not librarians or technologists • Free • In conjunction with UC Berkeley Digital Library (http://elib.cs.berkeley.edu)
CalFlora Target Users • Researchers & prof’ls in land management • Ready access to data for • Addressing critical issues in plant biodiversity • Analyzing consequences of land use alternatives and environmental change on distribution of native and exotic species • The public: promoting interest in biodiversity • Active engagement in biodiversity issues/work • Wildflowers as “charismatic”
CalFlora Priorities • Focus on people; put technology in the back seat • Pay attention to how the world works for the people who produce and use information • Honor existing traditions of data exchange
Components of Interest Today • CalPhotos • CalFlora Occurrence Database
CalPhotos • In conjunction with the UC Berkeley Digital Library Project http://elib.cs.berkeley.edu • > 28,000 images of California plants • Approx. half of all Calif. species are represented • Sources • Some institutions – e.g. Cal Academy of Sciences • Many from “native plant enthusiasts” • Currently accepting/soliciting contributions from users • Major reported uses • Plant identification • Illustrations
CalFlora Occurrence Database • > 800,000 geo-referenced reports of observations • Specimens in collections • Reports from literature • Reports from field • Checklists • Sources • 19 institutions • About to begin accepting reports from registered contributors via Internet
CalFlora Occurrence Database • Users can • “Click through the map to underlying data” • Download data for own analyses, tools • Uses • Land management decisions • Legally-mandated environmental reports (NEPA, CEQA) • Identify plants (though not designed for this) • Common analyses • Which species are present in an area • Which are common, which are rare • Which species are restricted to a habitat affected by proposed actions • Analyze various species in combination, by geo area
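The first two "common analyses" listed above can be sketched for a user working with downloaded data. This is a minimal illustration, not CalFlora's actual tooling: the `Occurrence` record layout, the function names, the `rare_max` cutoff, and the sample coordinates are all assumptions made for the example.

```python
from collections import Counter
from typing import NamedTuple


class Occurrence(NamedTuple):
    """Hypothetical, simplified geo-referenced report (not CalFlora's schema)."""
    species: str
    lat: float
    lon: float


def species_counts_in_area(records, south, west, north, east):
    """Count reports per species inside a latitude/longitude bounding box."""
    return Counter(
        r.species
        for r in records
        if south <= r.lat <= north and west <= r.lon <= east
    )


def split_common_rare(counts, rare_max=1):
    """Partition species by number of reports.

    rare_max is an arbitrary illustrative cutoff, not a botanical standard.
    """
    common = sorted(s for s, n in counts.items() if n > rare_max)
    rare = sorted(s for s, n in counts.items() if n <= rare_max)
    return common, rare


# Made-up sample records, for illustration only.
SAMPLE = [
    Occurrence("Eschscholzia californica", 37.87, -122.26),
    Occurrence("Eschscholzia californica", 37.88, -122.25),
    Occurrence("Calochortus tiburonensis", 37.89, -122.46),
    Occurrence("Quercus agrifolia", 34.05, -118.24),
]

counts = species_counts_in_area(SAMPLE, 37.0, -123.0, 38.0, -122.0)
common, rare = split_common_rare(counts)
```

A count of reports is only a proxy for abundance: as the earlier slide on risks notes, duplicative observations can make a species look more prevalent than it is.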
CalFlora Occurrence Database: Significance • Most comprehensive source by far (for Calif) • Common as well as rare taxa • Biodiversity beginning to be interested in all populations, not just rare -- requires vastly more data • Data downloadable, manipulable • Easy to use (for professionals, anyway) • Remote access via Internet • E.g. botanist in remote National Forest… • About to accept observations from “the public” • Source of valuable data re rare and esp’ly common species
Dilemmas and Conflicts • Useful place to see tensions, breakdowns, conflicts across epistemic cultures • Not who’s right or wrong, but underlying differences in values, priorities, practices, understandings
CalFlora Dilemmas • Quality filtering: made centrally vs. pushed down to user • Inclusiveness of observations vs. selectivity • Speed of additions vs. review, filtering • Labelling data for quality vs. providing info for users • Access • Benefits vs. dangers of wide access to information • Free vs. fee • Cost recovery • Discourage frivolous use • Who bears the costs? • Externalities
Dilemmas, Cont. • Institutional independence: • Autonomy, ability to be responsive to multiple stake-holder communities vs. security and credibility of institutional sponsorship
How (Some) Experts Assess Occurrence Reports • The evidence: • Type of report (specimen, field observation, list) • Type of search (casual, directed) • The source: • Personal knowledge of contributor’s expertise • Examination of other contributions, same person • Annotations by trusted others • Ancillary conditions: • Likelihood of that species appearing at that time, habitat, geographical location • Other, similar reports
How CalFlora Presents Occurrence Data • Links to data source(s) – personal and institutional • Compliance with institutional source’s requirements • Fuzzed locations • Links to institutional source’s caveats, explanations • Publicly-contributed observations • Info about observer • Info about observation • Annotations by experts
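The slide does not say how locations are "fuzzed". One common approach, shown here purely as an assumption, is to snap each coordinate to the center of a coarse grid cell so that the published point no longer reveals the exact site. The function name `fuzz_location` and the default cell size of 0.1 degree are hypothetical.

```python
import math


def fuzz_location(lat, lon, cell_deg=0.1):
    """Return the center of the grid cell containing (lat, lon).

    Assumed technique for illustration; the source does not describe
    CalFlora's actual fuzzing method or resolution.
    """
    def snap(value):
        # floor() picks the cell; +0.5 moves to the cell center.
        return round((math.floor(value / cell_deg) + 0.5) * cell_deg, 6)

    return snap(lat), snap(lon)


# Two distinct made-up sites in the same cell publish the same point.
site_a = fuzz_location(37.8712, -122.2601)
site_b = fuzz_location(37.8799, -122.2001)
```

A coarser `cell_deg` protects sensitive populations better but makes the record less useful for fine-grained analysis, which is the access dilemma the earlier slides describe.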
Contributor Registration • Biography, credentials (free text) • Expertise/interests (free text) • Affiliation • Contact info/web site • “I will submit only my own observations of wild plants. I realize that this system is only for first-hand reports about plants, native and introduced, that are growing without deliberate planting or cultivation.” • “I will…make sure I have the correct scientific name…I will submit uncertain identifications only if I believe them to be very important and time sensitive, and will label such reports ‘uncertain.’”
Contributor Registration (cont) • Experience level (self-assessment; check one) • I am a professional biologist/botanist, or have professional training in botany. • Although I do not have formal credentials, I am recognized as a peer by professional botanists. • Although I do not consider myself to have professional-level knowledge, I am quite experienced in the use of keys and descriptions, and/or have expertise with the plants for which I will be submitting observations. • I do not have extensive experience or background in botany, but I am confident that I can accurately identify the plants for which I will be submitting observations.
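The registration fields on these two slides could be modeled roughly as follows. This is a sketch under stated assumptions: the class and field names are invented for illustration, and the validation rules (an experience level plus acceptance of both pledges) are a guess at what such a form might require, not a description of CalFlora's implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ExperienceLevel(Enum):
    """The four self-assessment choices, paraphrased from the slide."""
    PROFESSIONAL = "professional biologist/botanist or professional training"
    RECOGNIZED_PEER = "no formal credentials, recognized as a peer"
    EXPERIENCED = "experienced with keys, descriptions, or these plants"
    CONFIDENT = "limited background, confident of identifications"


@dataclass(frozen=True)
class Contributor:
    """Hypothetical registration record; field names are assumptions."""
    biography: str                      # free text
    expertise: str                      # free text
    affiliation: str
    contact: str                        # contact info or web site
    experience: Optional[ExperienceLevel]
    pledges_first_hand_wild_only: bool  # first pledge on the slide
    pledges_correct_names: bool         # second pledge on the slide


def registration_problems(c: Contributor) -> List[str]:
    """Return a list of human-readable problems; empty means acceptable."""
    problems = []
    if c.experience is None:
        problems.append("experience level not selected")
    if not c.pledges_first_hand_wild_only:
        problems.append("first-hand/wild-plant pledge not accepted")
    if not c.pledges_correct_names:
        problems.append("scientific-name pledge not accepted")
    if not c.contact.strip():
        problems.append("contact info missing")
    return problems
```

Keeping the self-assessed level as data on the contributor, rather than using it to reject submissions, matches the slides' stated preference for leaving quality decisions to users.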
Occurrence Form • Species identification, habitat, location, date • Method of identification • “I recognize …from prior determinations and experience” • “I compared this plant with herbarium specimens” • “I keyed this plant in a botanical reference” • “I compared … with published taxonomic descriptions” • “An expert reviewed and confirmed this identification” • Certainty of identification • “I am confident of this identification, and submit this as a positive observation.” • “I am not certain of this identification but believe it to be a significant observation and submit it here as an alert only.”
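The occurrence form can be sketched the same way. The point of the sketch is that the method and certainty of identification travel with the record, so that a user downstream can still see them. All names here are hypothetical and the sample values are made up; the enum members paraphrase the form's choices.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class IdMethod(Enum):
    """Methods of identification, paraphrased from the form."""
    PRIOR_EXPERIENCE = "recognized from prior determinations and experience"
    HERBARIUM_COMPARISON = "compared with herbarium specimens"
    KEYED = "keyed in a botanical reference"
    PUBLISHED_DESCRIPTION = "compared with published taxonomic descriptions"
    EXPERT_CONFIRMED = "reviewed and confirmed by an expert"


class Certainty(Enum):
    POSITIVE = "positive observation"
    ALERT_ONLY = "alert only (uncertain identification)"


@dataclass(frozen=True)
class OccurrenceReport:
    """Hypothetical record layout; not CalFlora's actual schema."""
    species: str
    habitat: str
    location: str
    observed_on: date
    method: IdMethod
    certainty: Certainty


def describe(report: OccurrenceReport) -> str:
    """One-line summary that keeps method and certainty visible."""
    return (
        f"{report.species}, {report.location}, "
        f"{report.observed_on.isoformat()} "
        f"[{report.certainty.value}; {report.method.value}]"
    )


# Made-up example report.
example = OccurrenceReport(
    species="Eschscholzia californica",
    habitat="grassland",
    location="Berkeley Hills",
    observed_on=date(2001, 4, 15),
    method=IdMethod.KEYED,
    certainty=Certainty.POSITIVE,
)
summary = describe(example)
```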
Annotations • Herbarium practice: experts annotate records with corrections, comments. • CalFlora: registered experts can annotate photos and occurrence records. • Annotation by an expert raises the credibility of a record. • Actually – how often?
CalFlora Data and Trust • Trusting data • Every observation trackable to source(s) • Detailed info & contact info for source, observer • Detailed info about observation • Observations categorized by type • Annotation • Trusting users • NOT registering or charging users • Respecting source’s limits, caveats on data • Leaving quality decisions to the users • Trusting CalFlora • Detailed list of contributing organizations, advisors • NOT affiliated with another organization
Concerns • CalFlora relies on record-by-record examination • Looking at methods of classifying records in collections • CalFlora relies on voluntary contributions of data • Experts with lots of data and no time to contribute • Well-meaning volunteers with time but not expertise • Users need to be able to track back to source of each record, each data point • Concern about “modalities,” uncertainties being lost • Archiving • Concern about dynamism of CalFlora • Stability of electronic media • Stability of the organization • Delegating decisions about quality of observations to (inexpert) users
Implications for DLs, Other Info Systems • The social nature of knowledge • We have to decide on whom we will depend • We learn from others whom and what we can depend on • Information must be credible to be used • The importance of culture in constituting knowledge • Practice, values, orientations… • Epistemic cultures differ • Not simply a matter of experts vs. public
Therefore: • DLs need to accommodate practices • Incl. practices of trust and credibility • Users need to know provenance of data • Users differ • and not just experts vs. nonexperts • DLs serve multiple, varied epistemic cultures • Same person, multiple cultures • Users need flexibility to accommodate the DL to their needs, practices • Some users need decisions made for them • >> involvement of users in design
Implications for DL Creation and Management • Different epistemic cultures participate in the design and management of DLs, as well • Librarians • Technologists • Various, differing user groups • Differences in practices, understandings, values >> differences in priorities and decisions • A continual process of negotiation and translation