1 / 26

Using the Download Tool

Using the Download Tool. Dallas Alteryx Users Group Brian Purcell March 2019. Download Tool. Connectors Pallet Use Cases FTP/SFTP Transfers APPLICATION PROGRAMMING INTERFACES Web Scraping Programming Languages Not Required Then Leverage Other Alteryx Processes.

moniquem
Download Presentation

Using the Download Tool

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Using the Download Tool Dallas Alteryx Users Group Brian Purcell March 2019

  2. Download Tool • Connectors Pallet • Use Cases • FTP/SFTP Transfers • APPLICATION PROGRAMMING INTERFACES • Web Scraping • Programming Languages Not Required • Then Leverage Other Alteryx Processes

  3. Anatomy of a Download Workflow • Text Input (Or Formula) – URL String • Other Fields Could be Contained—Depending on Use Case • Download Tool • Check Configuration • Parsing / Shaping Tools • JSON Parse, RegEX, XML, Text to Columns, etc.

  4. FtP Download Workflow • External Vendor Provides Program Information • The File is Uploaded on a Nightly Basis • Filename has An Appended Date • A Destination Path is Defined and Configured • The Job Runs Daily on Our Company’s Alteryx Server

  5. FTP Download Configuration

  6. Show Census FTPAlteryx Job

  7. Purpose – Webscraping / API • Enrich Existing Data • Easily Acquire External Data • Obtain Competitive Information • Automate Mundane Tasks

  8. Differences API Webscraping Intended for Browser Traffic More Information Available Than API’s Most sites don’t offer an API Unstructured Text Output Gray Area – Legally Consult Your Legal Resource • Intended for Data Extraction • Often Requires An Application, Approval, and Token • May Require A Subscription or Per Use Fee • Default Outputs – JSON/XML

  9. API—Application Programming interface • Websites Optimized For Data Transfer • The “under the hood” of Connectivity • Think of the Apps that Use Google Maps Realtor.com Likely Uses Google’s API within their Mobile App

  10. Webscraping Browser View Alteryx View / PageSource Right Click in Browser, view Source

  11. Hardest Part of WebScraping???

  12. Show March MadnessAlteryx Job

  13. API’SApplication Programming Interface • Commonly Used for Mobile Applications • Methods – Get, Post, Put, Delete • URL EndPoint • Parameters • Key / Token

  14. The process • Find the Registration Portal • Submit an Application, Store your Key • Review API Documentation • Build A Basic Query • Parse • Pagenate (if needed)

  15. NOAA • https://www.ncdc.noaa.gov/cdo-web/webservices/v2 • 1527 Different Data Report Categories • Temperature, Wind, Precip • Soil Temp, Degree Days, Cloud Coverage • 9,563 Stations • 6,340 in the US • 361 in Texas

  16. GHCN - Global Historical Climatology Network report Example of Data However, not usable for data analysis– embedded in a pdf We prefer a structured data table. Which will allow us to enrich our data by appending weather conditions to internal data (examples) -- Weather driven industries – HVAC, lawn care, utilities -- Logistics – snow/heavy rain on shipping

  17. Pick One

  18. “Parameter Name”=“Value” • Join Multiple Parameters with “&” • Some are Required, Others are Optional • Authentication could be handled through a Parameter

  19. Authentication • Nearly All API’s Require An Account • Commonly Seen Methods • Header – Documentation should identify the Header Name • Parameter Value – Inserted Along with Other Parameters • Basic – Use the User Name & Password Tab on Download Configuration • Very Hard – Two Step, Authentication API Call, Then 2ndApi Call • Twitter, Tableau

  20. Show NOAA Alteryx Example

  21. Pagenation • API’s Typically Limit Data Results To Maintain Bandwidth • Alter Your URL Request • Offset, RowCount– Common Parameters To Return The Next Page of Results • How to Handle • Most API’s will Return a Max Count • Reconfigure your API to send a API Request for Each Page (using offset) • Advanced--Create An Iterative Macro

  22. Show Twitter Alteryx Example

  23. Sometimes You’ll Get Lucky • API Explorer/Debugger • Azure / Qualtrics • Simulates HTML Request with Endpoints, Parameters, and Headers • https://dev.applicationinsights.io/apiexplorer/metrics?appId=DEMO_APP&apiKey=DEMO_KEY • Plug it into Alteryx

  24. Tips • Check URLEncode • Limit URL Requests • Cache & Run • Save Output to YXDB • Use the Throttle Tool • Test Your RegEx First, Then Scape • Review HTML Response Codes • Often Will Indicate Error • Token • Exceeded Allotted Records • Improper Query

  25. Possible Use Cases • Competitive Analysis • Location, Product Changes, Pricing • Automated Form Collection • Create a Survey, Alteryx Workflow to Transform The Data • Google API’s • Geocoding, Street View Images, Turn By Turn Directions

  26. Resources • FTP Knowledge Bases Articles • https://community.alteryx.com/t5/Alteryx-Knowledge-Base/File-Transfer-Protocol-FTP-Download/ta-p/10599 • https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Automating-File-Transfer-Protocol-FTP-Downloads/ta-p/19186 • https://community.alteryx.com/t5/Alteryx-Knowledge-Base/Upload-to-SFTP-FTP/ta-p/20845 (Upload) • https://pages.alteryx.com/Javelin-Webinar-Series-Master.html (Great Webscraping & API Examples) • List Of API’s • https://www.programmableweb.com/ • RegExTesters • https://regexr.com/ • The RegEx Coach – Free Program • Free Software • Postman – Test API Calls • Fiddler – • Useful for Webscraping • Track Your Computer’s HTML Requests, Incl. URL’s, Headers, HTML Request Type • Find Me on LinkedIn • https://www.linkedin.com/in/brian-purcell-2000/

More Related