410 likes | 465 Views
Explore advanced data integration strategies in Excel to optimize data handling, access external sources, and generate insightful reports. Enhance productivity through efficient data capture and management within Excel.
E N D
Data Integration in Excel Dr. Nitin Paranjape MVP (Office System) Chairman and MD, Maestros
Version information • Excel 2007 for all demos • Many features exist in 2003 as well • Keep Excel open to understand better • But, don’t try to do what I am showing • Using Beta version • Complicated process of broadcasting voice • Please bear with any crashes, disconnections
Objectives • Understand available integration options • Mapping business scenarios to available features • Understanding strengths and limitations of data handling
POLL • How many of you have used Excel for data handling in your applications?
Data integration areas • Data capture within Excel • Getting external data into Excel • Accessing Excel data from outside • By Exporting it to another format • By keeping it within Excel • Processing data within Excel • Using Excel as a report writer
Primary purpose of spreadsheet is to analyze data. NOT to store data.
Data sources • ODBC / OLEDB • SPS list • XML • OLAP • Text • Query files
Import tools • Wizard (only ODBC) • MS Query (only ODBC) • Text import • Text import VBA code • Data connection wizard
Query files and connections • This facility has been available for years • It works on ODBC sources only • Uses either a wizard or MS Query • Demo (Wizard and MS Query)
Text Import • Highly complex text import possible • Fixed width or Delimited • Multiple delimiters supported • Decide data types while importing • Manage multiple delimiters • Manage preceding negative signs • Demo
Text import programmatically! With ActiveSheet.QueryTables.Add(Connection:="TEXT;C:\temp\test.txt", _ Destination:=Range("$A$1")) .Name = "test" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .TextFilePromptOnRefresh = False .TextFilePlatform = 437 .TextFileStartRow = 1 .TextFileParseType = xlDelimited .TextFileTextQualifier = xlTextQualifierDoubleQuote .TextFileConsecutiveDelimiter = False .TextFileTabDelimiter = True .TextFileSemicolonDelimiter = False .TextFileCommaDelimiter = True .TextFileSpaceDelimiter = False .TextFileColumnDataTypes = Array(1, 4, 3, 9, 1) .TextFileTrailingMinusNumbers = True .Refresh BackgroundQuery:=False End With
Importing ODBC vs OLEDB • Import External Data = OLEDB • New database query = ODBC • Web query = import of HTML tables from web sites • More options added in 2007 • SQL Server direct • XML • Access • Analysis services
Connections dialog in Excel 2007 • This is a major improvement • You can see all available Existing connections in a single place • You can filter local, network and file based connections in one dialog • This helps you reuse any past connections quickly • Demo
Deploying connections across an organization • You can store Excel files in a special library • Connection library in SharePoint • Users can point to a SPS site and ask for connection information • This ELIMINATES the need to deploy / send query files to multiple desktops • It also provides centralized access control
Capturing data within Excel • Very commonly used by end users • Creates lots of problems • Accuracy, validation, structure is missing • Users send files to each other creating confusion and multiple copies
Native features of Excel for capturing data • Linear data to auto-form : Demo (Ccard.xls) • Validations : Demo • Forms & Control Toolbox controls • User based editing : Demo
Understanding the importance of Excel List • We work on blocks of data in Excel • When more data is added, formulas DO NOT refresh automatically • This can lead to lot of errors • List corrects that problem • In addition, it integrates with SPS • In 2003, it is two way sync • In 2007, it is one way (from SPS to Excel) • Demo
Capturing data in SPS list • Better alternative • Keeps data central • Users can update individually • Shows only their own data • No Manual consolidation required • DEMO
Exporting data • Supported formats • Delimited • XML spreadsheet • XML • Sounds like a limitation? • How do you pick up data from Excel and put it directly into your database? • How do you use it with ETL tools like DTS?
Excel as ODBC source • Excel ODBC driver has been available for years • Create a named range • Create ODBC source • The Excel file is considered the database • And Named ranges become tables • Now it can be used in any tool which supports ODBC. Including Excel! • DEMO
Report generation using Excel • Connect to data source • Process the data • Analyze data • Render as report • Finally let user manipulate the read only snapshot of report in Excel
Report generation options • Raw data with formulas and custom formatting • You make the report yourself by adding data from a recordset into Excel • Use SQL Reporting services to render it as Excel • Pivot table : Demo • Create a base view of report • Let users do further analysis in a flexible way
Eliminating the row limit of Excel • The 65000+ limit of Excel is a misconception • There is no need to get data in Excel • Most want to do analysis using Pivot Table • Pivot table can be created DIRECTLY • In this case, there is no practical limit on the number of rows • The Pivot Cache is available OFFLINE automatically. • Pivot drag drop does NOT fire database queries • Only REFRESH button fires database level re-query
Pivot Table and dashboards • Multiple copies of Pivot tables showing different views of data • Multiple pivot tables from different sources • Pivot Tables and Pivot Graphs combined
Importance of learning Pivot Table • Extremely powerful • Eliminates complex SQL code • Users can change format anytime – eliminating further custom code • Data connection libraries + AD security + pivot table is a very secure, zero code solution for reporting
GetPivotData function • Initially it is confusing • For Pivot based calculations outside the table • Very useful when you are managing Pivot Tales programmatically • In this case you cant’ see the Row, Column coordinates to put formulas • How to use it?: Demo
OLAP reporting in Excel • 2003 was primitive support • 2007 provides better support • Many more cube functions added • The User interface is native and fast • Office web components are also more responsive even if the data is large
Consolidation • Very powerful when data comes with Row and Column headings (Cross tab format) • Can be used programmatically • Saves lots and lots of coding • Results in Pivot Table: Demo
XML • XML based, open, Excel file format • Server side data crunching without Excel possible • For Excel functions, use Excel Services • Custom schema based editing, validation and import export • Custom actions and element sensitive Task Pane UI using .NET code attached to XML schema (Smart Documents)
Finally Excel Services based reporting and data capture • Renders Excel data as pure web page in a secure manner • Create Excel sheet. Name required ranges. • Post it in SPS library • Configure library to use Excel services • Now users can view Excel as a simple web page • Interactivity is maintained • In-line editing is NOT possible • Specific, named cell values can be captured • Demo
Usage scenarios • Interactive • Programmatic
Interactive • Complex delimited text file imports • Consolidation
Programmatic • Data capture forms / surveys • Reporting from OLTP / OLAP with end user level report format editing capabilities • Dashboards • SQL reporting services – render as Excel by default • SPS Lists based reporting • SPS List – Business Data Catalog – Excel reporting • Enterprise wide data connections in SPS lists • Office web component – PivotTable to deliver reports on Web
Summary • Learn Excel first • There are many ways in which Excel can be used for data capture, processing and reporting • Map the application business needs to the appropriate data handling method • This will make users happy and will eliminate lots of complex code we write everyday
References • Old interface to new interface http://go.microsoft.com/?linkid=5174798 • Excel cell referencing (very useful while coding in VBA)http://www.expresscomputeronline.com/20021216/techspace1.shtmlhttp://www.expresscomputeronline.com/20021223/techspace1.shtml • Excel Help!
Thank you nitin@maestros.net www.nitinparanjape.com/blog Learn and Grow
Quiz • This is not a POLL • You have to send your answers to contact@erfolgcs.com • Format should be • 1-A, 2-B and so on
Question 1: What is the number of rows available in Excel 2007? • 65365 • 1.5 million • 1.1 million • 1.04 million
Question 2: In Office 2007, the data How does the data synchronization between Excel 2007 Table and SPS 2007 List work? • Changes in SPS change data in Excel table • Changes made in Excel change data in SPS • Changes can be bi-directional • No changes possible. It is a snapshot
Question 3:Technology used to expose business data in external databases in SharePoint 2007 is called… • Excel Services • Business Data Catalog • Data Connection Library • None of the above
Question 4:Which type of data Excel CAN NOT import? • DBF • SYLK • Visio • MDB