Research Using Data Mined from the Internet -- Regulatory Considerations. Laura Odwazny, Senior Attorney, Office of the General Counsel, U.S. Department of Health and Human Services. DOE CIRB meeting, June 14, 2012.


  1. Research Using Data Mined from the Internet -- Regulatory Considerations. Laura Odwazny, Senior Attorney, Office of the General Counsel, U.S. Department of Health and Human Services. DOE CIRB meeting, June 14, 2012

  2. Disclaimer This presentation does not constitute legal advice. The views expressed are the presenter’s own, and do not bind the U.S. Department of Health and Human Services or its components.

  3. Do Note: • OHRP has no guidance on Internet research specifically • Many boards have separate guidelines and best practices for Internet research

  4. Internet Research Internet research = research that uses the Internet to collect information through an online tool, such as an online survey; studies of how people use the Internet, e.g., by collecting data and/or examining activities in or on any online environment; and/or uses of online datasets, databases, databanks, repositories. • Internet as a TOOL FOR research or… • Internet as a MEDIUM/LOCALE OF research • TOOL = search engines, databases, catalogs, etc. • MEDIUM/LOCALE = chat rooms, newsgroups, home pages, multi-player gaming sites, blogs, Skype, tweeting, online course software, etc.

  5. Forms of Research: Exploring Where Human Subjects Fit • Consider Methodologies, Venues, Types of Data Generated through: • Quantitative Research • Data Aggregation, Scraping, Transaction Log Analysis, Network Analysis, Statistical Analysis, etc. • Qualitative Research • Ethnography, Focus Groups, Observation, Surveys, Content/Discourse Analysis, etc.

  6. Forms of Internet Research Venues • Email, IM, tweets • Listserves, chat rooms • Search engines, other archives • Social network sites, media sharing sites • Blogs and home pages • Virtual worlds • Online marketplaces, online gaming • Databanks, repositories • Venues other than “place-based,” e.g., mobile data collection

  7. E-Data Raises New Ethical Challenges • Trackability • “Dataveillance” = data monitoring + recording • “Greased” • “When information is computerized, it is greased to slide easily and quickly to many ports of call. But legitimate concerns about privacy arise when this speed and convenience lead to the improper exposure of information. Greased information is information that moves like lightning and is hard to hold onto.” • Malleability • Can be utilized in varied ways for multiple purposes • Invisibility Factor • Computer operations usually invisible; can allow for abuse (James Moor, 1985)

  8. Data aggregation/scraping

  9. Online Support Groups

  10. Twitter • Blurs the boundaries between public/private • Tweeter A (private) is followed by Tweeter B (public); Tweeter B retweets A = Tweet A is now visible in Tweeter B’s (public) feed • Track-backability is increased; consider sensitivity, reputation, risk/benefit • Archived Tweet Data fields: • country code: • id: • klout score • link: • location • coord type: • location coords: • location displayname: • location type: • posted time: • real name: • rule match: • tweet url: • user twitter page: • username:
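To make the track-backability point concrete, here is a minimal Python sketch that models one archived tweet record and drops the fields that point directly at an account. The class name, the snake_case field names (adapted from the list on the slide), and the choice of which fields count as direct identifiers are all assumptions for illustration; none of them is an OHRP or Twitter definition.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical choice: fields that point straight at an account or a tweet.
DIRECT_IDENTIFIERS = {"id", "username", "real_name", "user_twitter_page", "tweet_url"}


@dataclass
class ArchivedTweet:
    """Illustrative subset of the archived tweet fields listed on the slide."""
    id: str
    username: str
    real_name: Optional[str]
    user_twitter_page: str
    tweet_url: str
    country_code: Optional[str]
    location_displayname: Optional[str]
    posted_time: str
    klout_score: Optional[int]


def strip_direct_identifiers(tweet: ArchivedTweet) -> dict:
    """Return the record as a dict without the fields in DIRECT_IDENTIFIERS."""
    return {k: v for k, v in asdict(tweet).items() if k not in DIRECT_IDENTIFIERS}


if __name__ == "__main__":
    # Made-up example record, not real data.
    example = ArchivedTweet(
        id="1", username="example_user", real_name="Example Person",
        user_twitter_page="(profile URL)", tweet_url="(tweet URL)",
        country_code="US", location_displayname="Boston, MA",
        posted_time="2012-06-14T12:00:00Z", klout_score=40,
    )
    print(strip_direct_identifiers(example))
```

Even after stripping, the remaining fields (location, posted time) can still help trace a tweet back to its author, which is the concern the slide raises.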

  11. Regulatory considerations

  12. Big regulatory issues… • What is “private”? • What is “identifiable”? • How to protect subjects’ privacy and confidentiality interests? • Minimizing risk when using sensitive online data • Current sensitivity vs. future sensitivity • Informational risks • Data security

  13. OHRP’s Analytic Framework for the Common Rule: Always Start With… • Is the activity subject to regulation? • Conducted or supported by a Common Rule agency? • Covered under an applicable FWA? • Is it research? • Does it involve human subjects? • Is it exempt? Keep in mind regulatory flexibilities: • Can it be expedited? • Waiver of informed consent? • Waiver of documentation of consent?
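The order of the questions in the framework above can be sketched as a small Python function. This is only an illustration of the sequence: each boolean stands for a judgment that an institution or IRB has already made, and the class name, field names, and return strings are hypothetical. It also assumes that either trigger (agency support or FWA coverage) is enough to make the activity subject to regulation.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Answers to the framework questions, supplied by a human reviewer."""
    conducted_or_supported_by_common_rule_agency: bool
    covered_by_applicable_fwa: bool
    is_research: bool
    involves_human_subjects: bool
    is_exempt: bool


def common_rule_screen(activity: Activity) -> str:
    """Walk the questions in the order the slide lists them."""
    subject_to_regulation = (
        activity.conducted_or_supported_by_common_rule_agency
        or activity.covered_by_applicable_fwa
    )
    if not subject_to_regulation:
        return "not subject to the Common Rule"
    if not activity.is_research:
        return "not research"
    if not activity.involves_human_subjects:
        return "no human subjects involved"
    if activity.is_exempt:
        return "exempt"
    # Only now do the regulatory flexibilities come into play.
    return "non-exempt: consider expedited review and consent waivers"
```

The point of writing it this way is that the flexibilities (expedited review, waivers) are reached only after the earlier questions have all been answered.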

  14. Human subject .102(f): “a living individual about whom an investigator conducting research obtains (1) data through intervention or interaction with the individual, or (2) identifiable private information… Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects.” (emphasis added)

  15. Privacy in the Internet age Private • How to interpret “reasonably expect that no observation or recording is taking place” or “reasonably expect will not be made public” • IMs, tweets, email, FB profile, chatroom discussions, listserves • Must information be considered either “public” or “private”? • Members-only forum, community standards • Shifting norms about what information is “private” • What is a “reasonable” expectation of privacy in grid/Internet/e-data? • Expectations of privacy vs. actual privacy

  16. How should the IRB assess privacy? • What expectations of privacy are “reasonable”? • Get information about the environment • Get information about the users • Review Terms of Service • Data security consideration

  17. Human subjects (2) Identifiable • Individually identifiable = subject’s identity readily ascertainable by the investigator or associated with the information • Structure of social network, search terms, purchase habits, movie ratings on Netflix may uniquely identify individual • Zip code + sex + DOB enough for Latanya Sweeney to identify • Given demonstrated ability to reidentify individuals from anonymized or aggregated data, is this a meaningful decision point?
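The zip code + sex + DOB observation can be illustrated with a short Python sketch that measures how many records in a dataset are unique on a chosen set of quasi-identifiers. The function name, field names, and toy records are made up for illustration; this is not Sweeney's method or data, only the counting idea behind it.

```python
from collections import Counter
from typing import Dict, List, Sequence


def unique_fraction(records: List[Dict[str, str]], quasi_identifiers: Sequence[str]) -> float:
    """Fraction of records whose combination of quasi-identifier values is unique."""
    if not records:
        raise ValueError("records must not be empty")
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


# Toy, made-up records.
records = [
    {"zip": "02138", "sex": "F", "dob": "1970-01-01"},
    {"zip": "02138", "sex": "F", "dob": "1970-01-01"},
    {"zip": "02138", "sex": "M", "dob": "1970-01-01"},
    {"zip": "02139", "sex": "F", "dob": "1982-05-17"},
]

print(unique_fraction(records, ["zip"]))                # 0.25
print(unique_fraction(records, ["zip", "sex", "dob"]))  # 0.5
```

Adding fields raises the share of unique records, which is why data with names removed can still single out individuals once it is linked to another source that carries the same fields.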

  18. How should the IRB assess identifiability? • When will the subject’s identity be “readily” ascertainable by the investigator or associated with the information? • Consider the investigator, e.g. Professor Latanya Sweeney vs. Professor Elizabeth Buchanan • Consider the potential identifiers • Consider likelihood of reidentification with triangulation

  19. Exemption .101(b)(4) • Research involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects.
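The logical structure of the exemption text above (existing sources, and then either of two alternative conditions) can be written as a one-line predicate. This Python sketch uses hypothetical parameter names and only mirrors the and/or structure of the quoted language; whether each input is true for a given Internet dataset is exactly the judgment call the later slides discuss.

```python
def may_fit_exemption_4(
    sources_are_existing: bool,
    publicly_available: bool,
    recorded_so_subjects_cannot_be_identified: bool,
) -> bool:
    """Mirror the structure: existing AND (public OR recorded without identifiers)."""
    return sources_are_existing and (
        publicly_available or recorded_so_subjects_cannot_be_identified
    )


# Existing but restricted data, recorded with no identifiers or linked codes.
print(may_fit_exemption_4(True, False, True))   # True
# Data that do not yet exist fall outside the exemption, however public.
print(may_fit_exemption_4(False, True, True))   # False
```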

  20. Exemption .101(b)(4) applied • When is information “recorded in an identifiable manner”? • Is an email address an identifier? • Do tweets contain identifiers? • Does the inclusion of IP address make information identifiable? • When are data, documents, or records publicly available on the internet? • Does “publicly available” include large datasets purchased/obtained from Google or Facebook? • What if data are semi-restricted -- available only to ‘friends’, listserve members?

  21. Key Considerations for IRB Review • What type of venue? • Expectations of privacy? • Consent procedures? • Sensitivity of data? • Harm/Risk? • Age verification? • Authentication of participants? • Identification of participants? • Use of encryption? • Storage/transmission of data?

  22. Other potential issues – international research • “PI is proposing to collect data from publicly accessible social media sites, some of which are hosted by servers outside of the US. The PI will collect all data from his computer in the US. Is the activity international research?” (from IRB Forum) • Consider EU data protection directive, Canadian laws, etc. if applicable!

  23. Stay tuned

  24. ANPRM – Implications for Internet research • Base concept of identifiability under Common Rule on HIPAA Privacy Rule standards of identifiability? • To protect from informational risks (inappropriate use/disclosure of information), mandatory data security measures “modeled on” HIPAA? • Apply Common Rule to all institutions receiving support from CR agency? • No continuing review for most minimal risk research?

  25. ANPRM – Proposals for “excused” research • Additional requirements for “excused” (formerly exempt) research? • Registration • Consent, oral or written, depending, with waiver contemplated • Oral w/o documentation for educational tests, surveys, focus groups, interviews • Data security standards • Retrospective auditing of portion of “excused” submissions

  26. Proposal: Revised scope of existing exemption 4 • Expansion of .101(b)(4) by removing “existing” and de-identified recording? • Keep the requirement that data were collected for purposes other than the research

  27. ANPRM – consent and exempt research • Additional consent requirements for “excused” (formerly exempt) research? • Oral or written consent, depending, with waiver contemplated • Oral w/o documentation for educational tests, surveys, focus groups, interviews (modifying exemption 46.101(b)(2)) • Secondary use of data (modifying exemption 46.101(b)(4)) • originally collected for research purposes, consent required whether or not the researcher obtains identifiers • originally collected for non-research purposes, no change (no consent required unless identifiers are obtained)

  28. Your Experiences, Comments, Questions
