180 likes | 281 Views
ICOLC Use Data Webinar Episode 2. Tansy Matthews July 2010. Text Files: Formatting the files. Download Combine files into one Excel workbook. Recommended: Excel Import Multiple Text Files from SobolSoft ( http://www.sobolsoft.com/exceltext/ )
E N D
ICOLC Use Data WebinarEpisode 2 Tansy Matthews July 2010
Text Files: Formatting the files • Download • Combine files into one Excel workbook. • Recommended: Excel Import Multiple Text Files from SobolSoft (http://www.sobolsoft.com/exceltext/) • Run appropriate macro from counter_processing.xlsm. There are different macros for JR1 and DB1 reports. The macro for the JR1 is also used for the e-book reports.
Text files: Customizing the file Delete all months except for the month you are processing. IMPORTANT!!! Data must be processed one month at a time. The date has to appear on every line, otherwise the data cannot be concatenated over time. Add a column for Date. Enter the date in this column. Add custom columns. For example, institution name, institution type, etc. Using lookup tables will make this easier.
Text files: Saving the data Once the report has been reformatted into columnar data, it can be imported directly into a database table or stored in a “Master” spreadsheet. I recommend using a table in a database – even the million rows in Office 2007 fill up very rapidly.
XML files (so much easier) • Options • CR1 • Individual JR1s • Making the choice • CR1s are very large. They can take a long time to download, if they are emailed they can be blocked by attachment size limits, and trying to load them into a spreadsheet can overwhelm the CPU (I have a pretty fast computer and the CR1s crash it on a regular basis. • Individual JR1s can be a significant effort to download.
XML files (a glimpse at the future) We’ll talk more about this in the last session, but COUNTER is working on an alternative to the CR1. We’ve been testing a tool that will automate the downloading of all of a consortium’s JR1s.
XML files (back to our regularly scheduled program) • Setting up the loading file • In a perfect world, this would only have to be done once. This is not a perfect world, so you may have to make two versions. Or possibly even three. Which is definitely NOT the end of the imperfect world. • Download an example XML file. • Open a new Excel file. • Click Developer. In the XML area, click Import. Select the example file. • Select the fields you want to keep. • Delete everything else. • Save the file.
Setting up the loading file – an alternative • Use mine: • counter_xml_loading.xlsx (for JR1) • cr1_loading_name.xlsx (for most CR1s) • cr1_loading.xlsx (for a few CR1s where the fields are a bit odd, Nature, for example) • All of these files can be downloaded at http://www.vivalib.org/stats_training/index.html
XML files continued Right-click on the XML table. Select XML>XML Map Properties. Notice that the default under “When refreshing or importing data” is “Overwrite”. This is a good thing, but it’s a little scary the first time you see all of your data disappear. Right-click again. Select XML>XML Source This shows the structure that underlies the table.
XML files continued Right-click again. Select XML>XML Source This shows the structure that underlies the table. You can always add fields back in. One of the beauties of the XML table is that you CANNOT loose data. It’s just showing what’s in the underlying files.
XML files continued. Loading the data. Right click on the XML table. Select XML>Import Select all of the files that you want to import. They don’t all have to be from the same vendor. Click OK. If you get an error message, that means that one of the vendors is defining their XML schema slightly differently and they will need a different loading file. I have two slight variations for the CR1.
XML files – customizing the data It works in just the same way as the text files. Add the designations you need.
Getting the data into a database • Question: • Should we go over how to set up a data table in Access?
Getting the data into a database Since we load quarterly, I use a PivotTable to organize the data for loading. You can set up your own but it’s probably easier to use mine as a model: (http://www.vivalib.org/stats_training/index.html). Click on the PivotTable. Under PivotTable Tools, select Change Data Source. Browse to the file from which you wish to load the data.
Getting the data into a database Once the data is loaded, select the PivotTable. On the second page in the worksheet, right-click and use “Paste Special” to paste the values. Change the “Total” heading to “Requests” Go down to the end of the data. Type END in each of the first three columns. Select Developer>Macros and run the macro that’s included in the workbook. Voila! Exactly the data you want to load.
Getting the data into a database Open Access. Select External Data>Import>Excel. Specify the source. Select the table where the data is stored. The data will be imported.