1 / 46

Infotools Harmoni

Infotools Harmoni. IHU: Data Sources. August 2017 Presented by: Janine Takle. Data Sources. A data source is the starting point for all you do in Infotools Harmoni A source is a collection of data that is either in an imported file, or in a connected data store

manes
Download Presentation

Infotools Harmoni

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. InfotoolsHarmoni IHU: Data Sources August 2017Presented by: Janine Takle

  2. Data Sources • A data source is the starting point for all you do in Infotools Harmoni • A source is a collection of data that is either in an imported file, or in a connected data store • Each record in a source corresponds to a count in Infotools Harmoni

  3. Sources vision • Our vision is for any appropriate data source to be analysed, have insights visualised, then distributed, all within Infotools Harmoni • Connections that facilitate real time updates are expected to be the way of the future • We have a continuous development stream dedicated to adding connections and supporting new sources • We are strongly influenced by the needs of our clients when identifying additional sources to support

  4. Source Types

  5. Imported files • Imported files capture data for a moment in time • Files that can be imported into Infotools Harmoni include: • SAV files • X-files • MDD/DDF – Dimensions files • XLSX – Excel files • TXT files – tab delimited text • CSV files – comma delimited text

  6. Direct connections • Direct connections allow updated data to flow into Infotools Harmoni in real time • Direct connections are achieved using APIs (Application Programming Interface). • If an API is available in a client’s data collection system we can potentially develop a direct connection • We currently support direct connections to: • Tables in SQL Server • Views in SQL Server • Projects in Decipher

  7. Source Structures

  8. Source content • Sources can be divided into 2 types – record level and summary level • Record level sources are sources where analysing the count of records is relevant, e.g. where a record represents a respondent, a log entry, a sales transaction, etc. • Summary level sources are sources where analysing the count of records is irrelevant, and analysis only has meaning when a measure is applied, e.g. scores, GRPs, Spend, etc. • Files in SAV and Dimensions format typically contain record level data with a record representing a respondent

  9. Creating sources • SAV files can be generated from a variety of data collection applications, but not all make full use of available features • X-files is Infotools’ proprietary format read by our legacy Desktop applications • Dimensions files are typically created by Dimensions, although some other market research data collection applications can also create them • Xlsx files are primarily created from Excel in Office 2007+ • Text files, delimited with a tab or comma, can be generated from a variety of systems, and do not have market research specific structures

  10. SAV Structure • Coded responses have labels • Looped questions have a separate variable for each loop combination • All respondents asked the question have a response • Respondents not asked the question don’t have any response • Multiple response sets capture yes responses • Cleaned data, with only valid respondents

  11. X-File Structure • X-files are typically already designed • Vaxes must be in indexless format (achieved using SUP -Clean) • Each item within a common type needs a unique label • Headers need to be new style, i.e. version 4+ • Matrices, yngrids and vaxes with predefines are not supported

  12. Dimensions MDD/DDF Structure • The Dimensions converter is currently undergoing an overhaul • Supported structures will be confirmed within the next few months

  13. XLSX Structure • First row contains headers with unique, non-blank descriptions • Fields contain labels, not codes. If codes are used they will need to be renamed to labels after importing into Infotools Harmoni (IH) • Key identifiers can be added to the header labels to ensure required Infotools Harmoni types are applied • No blank rows between valid rows of data • Numeric values don’t contain text, e.g. $, commas, % • No fields contain |

  14. TXT Structure • First row contains headers with unique, non-blank, descriptions that don’t include tabs • Fields are delimited with a tab • Records contain labels, not codes. If codes are used they will need to be renamed to labels after importing into IH • Key identifiers can be added to the header labels to ensure required IH types are applied • No blank rows between valid rows of data • Text containing tabs is enclosed in double-quotes • Time/Date fields contain descriptions, not numbers • Numeric values don’t contain text, e.g. $, tabs, %

  15. CSV Structure • First row contains headers with unique, non-blank, descriptions that don’t include commas • Fields are delimited with a comma • Records contain labels, not codes (if codes are used they will need to be renamed to labels after importing into IH) • Key identifiers can be added to the header labels to ensure required IH types are applied • No blank rows between valid rows of data • Text containing commas is enclosed in double-quotes • Time/Date fields contain descriptions, not numbers • Numeric values don’t contain text, e.g. $, commas, %

  16. SQL Structure • All relevant data exists in a single view or table • All data associated with a unique record is reported in a single table or view • Key identifiers can be added to field names to advise IH on the default type to import • Fields contain labels, not codes (if codes are used they will need to be renamed to labels after importing into IH) • Keep in mind that field names in SQL are limited to 128 characters, and cannot contain |

  17. Direct Connection Structure • Each Direct Connection is developed to map the data types in the data collection system with the data types available in Infotools Harmoni • Direct Connections allow market research specific structures, such as grids and multiple responses, to retain their internal relationships

  18. Source Dictionaries

  19. Source Dictionaries • A Dictionary contains meta-data to guide interpretation • When loading sources with a dictionary, Infotools Harmoni automatically maps items into Infotools Harmoni item types • When loading sources that d • on’t contain a dictionary, the user is given the option to override the automated mapping of source variables into Infotools Harmoni types • Sources with inherent dictionaries are SAV, X-files, MDD/DDF and Decipher

  20. SAV Dictionary • Variables that are a Numeric Type, and have defined Value Labels become standard axes in IH, e.g. • Variables that are a Numeric or Date Type, and have no defined Value Labels become measures in IH, e.g. • Variables that are a String Type, and don’t have defined Value Labels become text items in IH, e.g.

  21. Dictionary Tricks • If the file doesn’t have a dictionary, key identifiers can be used to pre-determine the data type in text file imports • The key identifiers are: • $ any field starting with $ becomes a measure in IH • $weight any field starting with $weight becomes a weight in IH • & any field starting with & becomes a text item in IH

  22. Tips & Tricks

  23. Source Tips & Tricks • You can see the sources in a Project that’s not loaded by selecting the project and clicking Sources • You can see the sources that exist in a loaded Project by clicking • Once in the view/add sources menu, you can add more sources to a project by clicking

  24. Source Tips & Tricks • Projects can contain multiple sources - harmonization of sources is, of course, one of our key selling points! • Infotools Harmoni is source agnostic, so sources in a project can be a mix of imported files and direct connections • Sources are specific to a Project

  25. Update Tips & Tricks • If a source is updated, or a new source is added, dictionary mappings applied to the previous source are applied to the new one • When adding new sources, items that don’t already exist in the project are added to the end • If an item does not appear with the expected label or type, Infotools Harmoni provides functionality to change these things within your Project … although sometimes it’s easier to change the source!

  26. Filetype Tips & Tricks • When using an x-files source you need to upload: • Xbf • Xdf • Xef • Xlk (optional) • Excel files must be .xlsx • Infotools Harmoni can use sources in any language, including languages requiring multi-byte storage, e.g. traditional Chinese

  27. Tips & Tricks • When importing files that do not have an associated dictionary, the elements of standard axes are written into the axis in the order that they appear within the file • Multiple response sets in SAV files appear at the bottom of the project • Data imported into SQL from Excel, Access or Text, has field names truncated at 64 characters

  28. Helpful resources • Infotools Harmoni Knowledge Base https://support.infotools.com/hc/en-us • Sample files https://support.infotools.com/hc/en-us/articles/360015447734

  29. Try it yourself

  30. Try it yourself…? • Sample files can be found in https://support.infotools.com/hc/en-us/articles/360015447734 • Try importing each of the different sources and see if you can answer the following questions

  31. Can you…? • See what’s new in the 201502 SAV file that wasn’t in 201501? • Work out why the sample size is so high in the x-files? • Import Experience feedback in the XLSX to be a verbatim in Infotools Harmoni • Add the TXT file into the same project you imported the SAV files and watch the magic happen • Take a copy of the CSV and change the source so NumBevs Per Week is recognised as a measure in the import screen?

  32. What is this data…? • The following slides show some screenshots of sources. For each data source, consider the following questions: • Is this data record level or summary level? • Is this data suitably formatted to import into Infotools Harmoni? • If not, why not?

  33. What is this data…Source 1

  34. What is this data…Source 2

  35. What is this data…Source 3

  36. What is this data…Source 4

  37. What is this data…Source 5

  38. Answers/Hints • Try it yourself • Hint: use the NEW filter at the top of the project view • Hint: a population projection has been set in the xlk • Hint: Insert a & into the start of the Experience feedback header • Hint: the colouring indicates the common and non-common items • Hint: Insert a $ into the start of the NumBevs Per Week header • What is this data • Source 1 – record level, suitably formatted • Source 2 – summary level, not suitably formatted – there are blank lines, merged cells in the headers, headings among the entries in column A, etc. • Source 3 – summary level, suitably formatted • Source 4 – record level, suitably formatted • Source 5 – summary level, suitably formatted

  39. Questions

More Related