Science Mapping and Applications: Choices and Trade-offs

Science Mapping and Applications: Choices and Trade-offs. Standards Workshop August 11, 2011. Kevin W. Boyack, SciTech Strategies. Agenda. Background Applications Choices and Trade-offs SciTech Choices SciTech Mapping Process Applications Enabled by SciTech Process


Presentation Transcript


  1. Science Mapping and Applications: Choices and Trade-offs Standards Workshop August 11, 2011 Kevin W. Boyack, SciTech Strategies

  2. Agenda • Background • Applications • Choices and Trade-offs • SciTech Choices • SciTech Mapping Process • Applications Enabled by SciTech Process • Additional Findings Related to Choices • Summary

  3. Science Mapping • Most common definition: visualizing or creating a visual map of scientific information • Documents, journals, authors, words … • This requires • Data acquisition and cleaning • Defining elements and similarity • Partitioning and/or layout of the data • And then one creates the visual

  4. Why Do We Map Science? • Basic understanding • Structure and dynamics of science • High level, all of science • Discipline, specialty • Researchers and groups • Search and retrieval • “More like this”, “related articles” • Metrics • Research • Evaluation • Policy / Decision Making

  5. Why Do We Map Science? Academic View: • Structure and dynamics • High level, all of science • Discipline, specialty • Researchers and groups • Search and retrieval • “More like this”, “related articles” • Metrics • Research • Evaluation • Policy. World View: • Structure and dynamics of science • High level, all of science • Discipline, specialty • Researchers and groups • Search and retrieval • “More like this”, “related articles” • Metrics • Research • Evaluation • Policy

  6. Map of Science & Technology Indicators

  7. Choices • Data source • Unit of analysis • Document, Journal, Author, Word (phrase) • Sample size • Specialty, Discipline, All of science, Single year, Multiple years • Similarity approach • Citation, Text, Thresholds • Partitioning and/or layout • MDS, K-K, F-R, DrL (OpenOrd), Blondel, …

  8. Trade-offs

  9. Mapping Choices • Are often made based on what is available • Data, algorithms, expertise • When they should be made based on • The research question or application • Balancing of the applicable trade-offs • If the application is EVALUATION • Special care must be taken • Partition accuracy is critical • What is a discipline?

  10. Our Choices

  11. Standards?

  12. Possible Standards • Units • Data sources • Data cleanliness • Workflows • Validation • Datasets • Products (e.g. reference maps?) • …

  13. Our Process • Generate annual models, 2000-2008 • Select single year of source data • Apply thresholds to select references • Generate relatedness matrix (k50/cosine) • Filter matrix to top-n values per reference • Cluster references → intellectual base • Assign current papers → research front • Research communities • Link annual models • Link communities from adjacent years using overlaps in their intellectual bases → threads
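
The relatedness and filtering steps on this slide can be illustrated with a minimal sketch. It assumes plain co-citation counts and an ordinary cosine normalization; it does not implement the K50 measure, and the function names, the `min_citations` threshold, and the toy data are hypothetical rather than taken from the talk.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cocitation_cosine(papers, min_citations=2):
    """papers: dict mapping paper id -> list of cited reference ids (one source year)."""
    # Count how many papers cite each reference.
    cites = defaultdict(int)
    for refs in papers.values():
        for ref in set(refs):
            cites[ref] += 1
    # Threshold step (assumed form): keep references cited at least min_citations times.
    kept = {ref for ref, count in cites.items() if count >= min_citations}
    # Count co-citations among the kept references.
    cocite = defaultdict(int)
    for refs in papers.values():
        for a, b in combinations(sorted(set(refs) & kept), 2):
            cocite[(a, b)] += 1
    # Cosine: co-citation count divided by the geometric mean of the citation counts.
    return {(a, b): n / sqrt(cites[a] * cites[b]) for (a, b), n in cocite.items()}


def top_n_per_reference(sims, n=2):
    """Keep a pair if it is among the n strongest links of either of its references."""
    neighbours = defaultdict(list)
    for (a, b), score in sims.items():
        neighbours[a].append((score, b))
        neighbours[b].append((score, a))
    kept = set()
    for ref, links in neighbours.items():
        for score, other in sorted(links, reverse=True)[:n]:
            kept.add((min(ref, other), max(ref, other)))
    return {pair: sims[pair] for pair in kept}


if __name__ == "__main__":
    toy = {"p1": ["A", "B", "C"], "p2": ["A", "B"], "p3": ["B", "C"], "p4": ["A", "D"]}
    sims = cocitation_cosine(toy)
    print(top_n_per_reference(sims, n=1))
```

The filtered pairs would then be passed to a clustering step to form the intellectual base; that step is not sketched here because the slide does not name the clustering algorithm.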

  14. Process-Specific Applications • Highly granular model of science gives detailed structure

  15. Process-Specific Applications • Detailed structure enables metrics for multidisciplinary sets of topics • Competencies is the name we use for these sets of topics

  16. Process-Specific Applications • Research leadership (based on competencies) rather than research evaluation (based on rankings) [Figure panels: UCSD, TAMU, Delhi, UCSF, MIT, Stellenbosch]

  17. Process-Specific Applications • Mapping all of science shows context for target literature

  18. Process-Specific Applications • Linking annual models shows detailed evolution and instability • Consider a topic in a given year. It can be linked (or not) in time in the following ways: [Figure axes: following-year linkage, prior-year linkage]

  19. Process-Specific Applications • Linkage types correlate with various properties (e.g. textual coherence, prolific authors) • These can be used as predictors of future linkage • Identification of emerging topics at the micro-level [Figure: linkage types by prior-year and following-year linkage: Isolate 34%, Birth 14%, Death 11%, Continuation 41% (includes Stasis)]
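
As a rough sketch of how the four linkage types could be assigned, the code below links topics in adjacent years when their intellectual bases overlap, then labels each topic by whether it has a prior-year and a following-year link. The Jaccard overlap, the `min_overlap` threshold, and all names are assumptions made for illustration; the slides do not specify how overlap is measured.

```python
def linked(base_a, base_b, min_overlap=0.3):
    """True if two intellectual bases (sets of reference ids) overlap enough (Jaccard)."""
    union = base_a | base_b
    return bool(union) and len(base_a & base_b) / len(union) >= min_overlap


def classify_topics(prev_year, this_year, next_year, min_overlap=0.3):
    """Each argument maps topic id -> set of reference ids for one annual model."""
    labels = {}
    for topic, base in this_year.items():
        prior = any(linked(base, other, min_overlap) for other in prev_year.values())
        following = any(linked(base, other, min_overlap) for other in next_year.values())
        if prior and following:
            labels[topic] = "continuation"  # linked in both directions
        elif following:
            labels[topic] = "birth"         # no prior-year link, but it carries forward
        elif prior:
            labels[topic] = "death"         # prior-year link, but nothing follows
        else:
            labels[topic] = "isolate"       # no link in either direction
    return labels


if __name__ == "__main__":
    y1 = {"a1": {"r1", "r2", "r3"}, "b1": {"r7", "r8"}}
    y2 = {"a2": {"r1", "r2", "r4"}, "b2": {"r7", "r8", "r9"},
          "c2": {"r10", "r11"}, "d2": {"r20", "r21"}}
    y3 = {"a3": {"r1", "r2", "r4", "r5"}, "c3": {"r10", "r11", "r12"}}
    print(classify_topics(y1, y2, y3))
```

Counting the labels over all topics in a year would give shares comparable in form to the percentages on the slide, although the actual values depend on the linking rule used.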

  20. Process-Specific Applications • Superstar authors (top 1%) are predictive • Superstars avoid isolates • Superstars move out of topics before they die • Superstars are heavily involved in splits and merges (non-stasis parts of continuation) [Figure: Isolate 34%, Birth 14%, Death 11%, Continuation 41% (includes Stasis)]

  21. Accuracy of Similarity Approaches • Identified a large data corpus (2.15M articles) • Textual information (abstracts, MeSH) from Medline • Citation information from Scopus • Compare 13 approaches – 3 citation, 9 text, 1 hybrid • For each approach • Calculate article-article similarities • Filter similarities to reduce the number to a similarity file of reasonable size (2.15M articles, 20M sim pairs) • Cluster the articles • Measure accuracy of the cluster solution using multiple metrics • Coherence (Jensen-Shannon divergence) • Grant-to-article linkage concentration • Analyze results
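
The coherence metric above is based on Jensen-Shannon divergence. The sketch below shows only the divergence between two word-probability distributions, using base-2 logarithms so the value lies between 0 and 1. How the study turns this into a cluster coherence score is not stated on the slide, so the function name and the toy distributions are illustrative assumptions.

```python
from math import log2


def jensen_shannon_divergence(p, q):
    """p, q: dicts mapping word -> probability, each summing to 1."""
    words = set(p) | set(q)
    # Mixture distribution: the average of p and q over the joint vocabulary.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in words}

    def kl_to_mixture(dist):
        # Kullback-Leibler divergence from dist to m; zero-probability words add nothing.
        return sum(prob * log2(prob / m[w]) for w, prob in dist.items() if prob > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


if __name__ == "__main__":
    article = {"gene": 0.5, "protein": 0.5}
    cluster = {"gene": 0.4, "protein": 0.4, "cell": 0.2}
    print(jensen_shannon_divergence(article, cluster))
```

Identical distributions give 0 and distributions with no shared words give 1, so a lower divergence between an article and its cluster indicates a closer textual match.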

  22. Results

  23. Summary • Science mapping involves a great number of choices • There are many trade-offs to consider • Choices constrain or enable analysis types • SciTech has made a set of mapping choices, and developed a methodology that enables • Development of a detailed, multi-year model of science for structure and evolution • Metrics on competencies (sets of topics) that are important at the institutional level • Context to be mapped along with any target literature • Prediction of continuance • Identification of emerging topics • We continue to explore accuracy and usefulness of maps

  24. Relevant Literature • Boyack, K. W., Newman, D., Duhon, R. J., Klavans, R., Patek, M., Biberstine, J. R., Schijvenaars, B., Skupin, A., Ma, N., & Börner, K. (2011). Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches. PLOS One, 6(3): e18029. • Klavans, R., & Boyack, K. W. (2011). Using global mapping to create more accurate document-level maps of research fields. Journal of the American Society for Information Science and Technology, 62(1), 1-18. • Boyack, K. W., & Klavans, R. (2010). Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately? Journal of the American Society for Information Science and Technology, 61(12), 2389-2404. • Klavans, R., & Boyack, K. W. (2010). Toward an objective, reliable, and accurate method for measuring research leadership. Scientometrics, 82(3), 539-553. • Klavans, R., & Boyack, K. W. (2009). Toward a consensus map of science. Journal of the American Society for Information Science and Technology, 60(3), 455-476. • Boyack, K. W. (2009). Using detailed maps of science to identify potential collaborations. Scientometrics, 79(1), 27-44. • Klavans, R., & Boyack, K. W. (2006). Quantitative evaluation of large maps of science. Scientometrics, 68(3), 475-499. • Klavans, R., & Boyack, K. W. (2006). Identifying a better measure of relatedness for mapping science. Journal of the American Society for Information Science and Technology, 57(2), 251-263. • Boyack, K. W., Klavans, R., & Börner, K. (2005). Mapping the backbone of science. Scientometrics, 64(3), 351-374. • Boyack, K. W. (2004). Mapping knowledge domains: Characterizing PNAS. Proceedings of the National Academy of Sciences, 101, 5192-5199. • Börner, K., Chen, C., & Boyack, K. W. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37, 179-255.

  25. Thank you!
