
Data Management and Manipulations: The Good, the Bad and the Fuhgeddaboudit !

16 November 2011, Biologists Meeting. Lisa Reed, Center for Vector Biology, Rutgers University.


Presentation Transcript


  1. 16 November 2011 • Biologists Meeting • Data Management and Manipulations: The Good, the Bad and the Fuhgeddaboudit! • Lisa Reed, Center for Vector Biology, Rutgers University

  2. What this talk is about • Data Management • Why is it important? • How to do it effectively • How to protect your data • Data Manipulations • What is it and what can it do for me? • Pivot and Graphs • Connectivity and Reports

  3. Data Management • Broad definition • “Data Resource Management is the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise.” (Wikipedia) • How you keep data in a clear, concise and safe manner that anyone can retrieve at a later date without confusion or frustration. • Data structure • Meta data • Safe keeping and retrieval

  4. Data Structure • Data structure is a “way of storing and organizing data [in a computer] so that it can be used efficiently.” (Wikipedia) • Should be easy to understand (for example, by date) • How is data stored? • Records: One line of data arranged by “fields” or “variables.” • Arrays: A set of data arranged by position to denote variables • How does Excel do it? • Records
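
The difference between records and arrays can be sketched in a few lines of Python. This is only an illustration: the field names and values below are hypothetical and do not come from the talk. A record names each field, while an array relies on position and needs a separate header to say what each column means.

```python
# Hypothetical trap data; field names and values are made up for illustration.
records = [
    {"date": "2011-06-01", "species": "Aedes vexans", "count": 12},
    {"date": "2011-06-01", "species": "Culex pipiens", "count": 30},
]

# The same data as an array: meaning depends on position, so a header
# is needed to know that the third column is the count.
header = ["date", "species", "count"]
array = [[record[name] for name in header] for record in records]


def total_count_records(rows):
    """Sum the count field by name."""
    return sum(row["count"] for row in rows)


def total_count_array(rows, column_names):
    """Sum the count column by looking up its position in the header."""
    idx = column_names.index("count")
    return sum(row[idx] for row in rows)


print(total_count_records(records))
print(total_count_array(array, header))
```

An Excel sheet with variable names in the first row behaves like the record form: each row is one record and each column is one field.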

  5. Excel Navigation – The Top • Figure labels: Office Button, Column Letter, Ribbon, Formula Bar, Split Pane, Name Box, Active Cell, Row Number, Multiple Tabs

  6. Variables into Columns • The first row holds the variable names • Use names that make sense • Code • Format (for example, dates): right-click a column or cell > Format Cells…

  7. Data Structure • Expandable in both directions, to 16,384 columns and 1,048,576 rows

  8. Things to Consider in Your Dataset • Do I use Zero? • Pro = You fill in each potential place setting with a value. • Pro = Averages calculate correctly! • Con = Can take a long time (but there is a shortcut) • Con = Each entry is a possible data error • Do you have a way to distinguish when the trap is out of commission? • Use of a period or other non-numerical value

  9. Calculations with and without Zero • Blanks are NOT considered zero. • Sums are calculated correctly with or without zeroes (the average can be defined through a pivot table). • Can replace blanks with zeroes: Highlight > Find & Select > Go To Special > select Blanks > type 0 > Control-Enter
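
A small Python sketch shows why the choice between blanks and zeroes matters for averages but not for sums. The counts are invented for illustration, and `None` stands in for a blank cell; neither comes from the talk.

```python
from statistics import mean

# Hypothetical weekly counts for one trap; None marks a blank cell.
counts = [10, None, 20, None, 0]


def total(values):
    """Blanks add nothing, so the sum is the same either way."""
    return sum(v for v in values if v is not None)


def mean_ignoring_blanks(values):
    """Average over filled cells only, which is how blanks are treated."""
    return mean(v for v in values if v is not None)


def mean_blanks_as_zero(values):
    """Average after replacing every blank with zero."""
    return mean(0 if v is None else v for v in values)


print(total(counts))
print(mean_ignoring_blanks(counts))
print(mean_blanks_as_zero(counts))
```

The two averages differ because the divisor changes: three filled cells in one case, five cells in the other. Which one is right depends on whether a blank means "nothing caught" or "trap not checked".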

  10. Meta Data • Meta = “About” • What are the data elements? • Descriptions of how the data is organized • Codes and their meanings • Everything someone needs to know in order to use the dataset • Best place to put this information is in the dataset itself • Excel provides a good format for doing so.
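
One way to keep metadata next to the data is a small data dictionary that describes every column. The sketch below assumes the column names used later in the pivot table slides (Species, Results, Pathogen, Moscount); the descriptions and the extra `TrapID` column are hypothetical and only show the idea.

```python
# Hypothetical data dictionary: column name -> description.
# The descriptions are illustrative guesses, not taken from the talk.
data_dictionary = {
    "Species": "Mosquito species name",
    "Results": "Test outcome for the pool (assumed: Positive or Negative)",
    "Pathogen": "Pathogen the pool was tested for",
    "Moscount": "Number of mosquitoes counted (assumed meaning)",
}


def undocumented_columns(columns, dictionary):
    """Return the columns that have no entry in the data dictionary."""
    return [name for name in columns if name not in dictionary]


# TrapID is a made-up column that has not been documented yet.
sheet_columns = ["Species", "Results", "Pathogen", "Moscount", "TrapID"]
print(undocumented_columns(sheet_columns, data_dictionary))
```

In Excel the same information would normally live on its own tab in the workbook, so the description travels with the data.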

  11. MetaData

  12. Saving Your Data • Save it. Save it Often. Save it with AutoSave. • Back it up. • Keep a copy in a safe place. • Keep a copy in a separate place. • Keep a copy in the cloud. “We have recently seen some really nasty malware on several computers that simply was not removable by numerous sophisticated software tools at our disposal. The only safe remedy was to completely rewrite the machines from scratch.”
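
As a rough sketch of "back it up", the Python below copies a file into a separate folder and adds a timestamp to the name so older copies are not overwritten. The function name `backup_copy`, the file name `traps.csv` and the folder layout are all hypothetical; the demonstration runs in a temporary directory so it touches no real data.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def backup_copy(source, backup_dir, now=None):
    """Copy source into backup_dir with a timestamp in the file name."""
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target


# Demonstration in a throwaway directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "traps.csv"
    original.write_text("species,count\n")
    copy = backup_copy(original, Path(tmp) / "backups",
                       now=datetime(2011, 11, 16, 9, 0, 0))
    demo_name = copy.name
    demo_matches = copy.read_text() == original.read_text()

print(demo_name, demo_matches)
```

A copy in the same folder does not protect against a lost or infected machine, so the backup folder should sit on a different drive or service.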

  13. Data Manipulations • Exploratory Data Analysis vs. Confirmatory Data Analysis • What is your data’s story? • Hand entry – Know your data. • Summarizing data • Tables • Graphs • Pivot Tables • Graphs • Into a Document

  14. Pivot Table 1 • Highlight Data (hold down shift key and arrow down, then arrow across) • INSERT Pivot Table

  15. Pivot 2 • A new sheet is created • Can be modified • Lists variables in your dataset • Has place for column, row, value and filter

  16. Drag Variables to Label, Value and Filters • Drag Species to Row Labels and it will place the species names, one to each row • Drag Results to Column Label • Drag Pathogen to Filter • Drag Moscount to Values
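
What the pivot table does with those four fields can be sketched in plain Python. The column names follow the slide, but the rows are invented and `pivot_sum` is a hypothetical helper, not an Excel or library function. It filters on one field, groups by a row field and a column field, and sums the value field.

```python
from collections import defaultdict

# Hypothetical rows; column names follow the slide, values are made up.
rows = [
    {"Species": "Aedes vexans", "Results": "Negative", "Pathogen": "WNV", "Moscount": 25},
    {"Species": "Culex pipiens", "Results": "Positive", "Pathogen": "WNV", "Moscount": 50},
    {"Species": "Culex pipiens", "Results": "Negative", "Pathogen": "WNV", "Moscount": 40},
    {"Species": "Culex pipiens", "Results": "Negative", "Pathogen": "EEE", "Moscount": 10},
]


def pivot_sum(data, row_field, col_field, value_field,
              filter_field=None, filter_value=None):
    """Sum value_field for each (row, column) pair, with an optional filter."""
    table = defaultdict(lambda: defaultdict(int))
    for record in data:
        if filter_field is not None and record[filter_field] != filter_value:
            continue  # skip rows excluded by the filter
        table[record[row_field]][record[col_field]] += record[value_field]
    return {row: dict(cols) for row, cols in table.items()}


# Species as rows, Results as columns, Moscount as values, Pathogen as filter.
wnv = pivot_sum(rows, "Species", "Results", "Moscount", "Pathogen", "WNV")
print(wnv)
```

Changing the filter value or swapping the row and column fields corresponds to dragging the variables to different boxes in Excel.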

  17. Manipulation of variables

  18. Pivot Example Excel

  19. Graphing Your Data • Highlight your data. • Apply a graph template. • “The greatest value of a picture is when it forces us to notice what we never expected to see.” (Tukey 1977)

  20. Live Links from One Document to Another • With “hot” links, your information will always update. • Saves time • Preserves method • You can also paste a static picture that is not linked to data. • Click on graph > Copy > Go to Document > Paste Special > Paste Link > Excel Object • Graph will be shown and can be updated with a right click > Update.

  21. Graphs into Word

  22. Conclusions • Data Management will provide continuity through data structure, meta data and data preservation. • Exploring data can provide insight and new questions. • You can link these explorations into different documents. Questions?

