170 likes | 185 Views
Creating Datasets & Using Data Flows. Presented by Andrea Vega, UI/UX Designer and Engineer. What is a Data set?. Query result that sticks around Refresh any time, just as fast as a query Doesn’t matter where the data came from Can be shared
E N D
Creating Datasets & Using Data Flows Presented by Andrea Vega, UI/UX Designer and Engineer
What is a Dataset? • Query result that sticks around • Refresh any time, just as fast as a query • Doesn’t matter where the data came from • Can be shared • FAST filtering, without hitting the production database again • Use them in Reports, Dashboards, and Jobs
What is a Data Flow? A Data Flow is a series of steps that do really cool things to your data on its way into a Dataset.
It’s Fast • Data streams into the Dataset • First record is added before the last one leaves the Datasource • No need to load the whole thing at once • Size of Dataset does not change the rate (records per second)
Examples • Add fields from another Dataset or Datasource • Merge duplicates • Calculate the days between two dates • Set a default value for a column • Make your own calculations • Remove rows and columns
Demo start • Start with a new Dataset Query Designer lets you build a Dataset that is backed by a Datasource Start by picking Northwind Datasource and the “Orders” Mapping Use the Field Chooser to pick out fields Easily navigate related Mappings using the link tree
Add some flow steps • Concatenate Combine first and last name to make “Salesperson” • Remove fields Get rid of first and last name now that we don’t need them anymore • Calculated field Get the total order amount • Fields from another Dataset Product category
Order matters • Flow steps run one after the other • You can drag and drop them to change the order
Transformations • Merge duplicates One row per person instead of one row per order. Multiple orders get turned into arrays (list of values) • Normalize Opposite of merge duplicates Array of orders turns into one row per order Person information is duplicated on each row
A few more • Time Between Find the days between order and shipment dates • Calculated field Distinguish between late and on time orders • Add fields from another Datasource Get continent and region for the destination country from the World Datasource • Find/Replace Get the country name the World Datasource expects for USA Now change the order of the flow so this step happens before the remote join!
Fun with locations • Zip2Geo Get latitude and longitude estimate from a zip code Now look at discover, and see the map! • Template field Show a fully formatted address • Geocodio Get a precise location from the address Now the map is no longer approximate, you can see street view!
Flushing • Sometimes all rows have to be seen before the row can be finished • For example • Two-pass calculations (e.g. percent of total) • Merge duplicates • This can slow down the query, but is sometimes necessary • Coming soon: “Ordered merge duplicates” that assumes the records are already sorted. Does not need to flush.
Power Script • Full access to each record, and across records • Create multiple fields at once • Modify field definitions • Omit records based on some condition • Access input parameters • Access to utilities (lodash, moment) • Access to your own custom functions
Power Script $omit() : Omit row $record : Values in row $local : Persists across rows $fields : Field info $inputs : Inputs info $query : Query info $datasource: Datasource info $index : Row number _ : Collection manipulation moment : Date manipulation See a link to the help center article about these options in Informer
Catch these sessions for next steps • Teams and User Access – Governing Your Data • Share Datasets • Interact, Analyze and Discover Your Data • Use filters • Discover and save visuals