360 likes | 491 Views
IT 456-01 System Administration Lesson 2. Dr Jeffrey A Robinson. Overview of Course. Week 1 – Introduction Week 2 – Installation of SQL and management Tools Week 3 - Creating and Using a Database Week 4 - Using the SQL Server Agent Service Week 5 - Disaster Recovery
E N D
IT 456-01System AdministrationLesson 2 Dr Jeffrey A Robinson
Overview of Course • Week 1 – Introduction • Week 2 – Installation of SQL and management Tools • Week 3 - Creating and Using a Database • Week 4 - Using the SQL Server Agent Service • Week 5 - Disaster Recovery • Week 6 - SQL Server Security • Week 7 - Performance Monitoring • Week 8 - Performance Optimization • Week 9 - High Availability
Unit 2 • Chapter 1 introduces SQL Server 2008, listing the components, services, and editions of SQL Server 2008. It also describes how to plan and perform a SQL Server 2008 installation. Hardware and software prerequisites, server and security considerations, and the installation process are described
SQL Server major components • The five SQL Server 2008 editions available for use in production environment are • SQL Server 2008 Enterprise edition • SQL Server 2008 Standard edition • SQL Server 2008 Mobile edition • SQL Server 2008 Workgroup edition • SQL Server 2008 Web edition
SQL Server major components • Instructions in DocShare folder • IT456_-_Installing_a_Default_Instance_of_SQL_Server_20058 • SQL Server 2008 • SQL Server Management Studio (SSMS) • SQL Server Business Intelligence Development Studio !!! • Databases (AdventureWorksNorthwnd Pubs NW Traders) • Services • Analysis Services • Notification Services • Integration Services (need XP Pro or Vista Ultimate) • NOT Home or Business versions • Books online !!! • DM databases and Excel DM extentions
Unit 2 assignment • 200–300 word - Describe the requirements of MS SQL Server 2008 and the process you must follow prior to the installation of MS SQL Server 2008. (Hint - Before installing MS SQL Server 2008 an installation plan should be created…. )
Unit 2 assignment (cont) • Project – Install a default instance of SQL Server 2008. • a. For directions, see Chapter 1. • b. Create the NWTraders Database using instructions in the file in Doc Sharing called "Create NWTraders Database • c. Run the scripts from the file in DocSharing called “Create NWTraders Objects and Data.”
Unit 3 assignment • Write a 200–300 word response to each of the following topics. • 1. 10 points. Define five options used when creating a database and identify the importance of each option. • 2. 10 points. Define the options for the recovery method available when creating a database. Identify the strengths and weaknesses of each option.
Unit 3 assignment (cont) • For this Unit's Project you will: • Develop a stored procedure within the NWTraders database to evaluate characteristics of a customer and his/her purchases or employee and the purchases he/she have been involved in the procurement. Examples include the evaluation of purchases by customer by region, zip code with items or evaluation of employees by title, birthdate, region or zipcode.
Next week • Week 3 - SQL Server Configuration and Internals • Important stuff on TLOGs
Excel - Add ins • To enable the full resources of Excel you need to be sure that the appropriate Add-in are enables in excel • In Office 2007 this is done by clicking on the • Office icon in the upper left of the Excel window • Then select the Excel Options • button on the bottom of the • Popup menu • Click n Add-Ins on the left • Navigation bar and OK on the • Next pop-up menu
Excel – Add-ins • You can then select which add-ins to select or enable. Make sure you enable the Analysis ToolPak and the Solver Add-in in particular. • Depending on additional packages or applications you have purchased, there may be additional add-ins that you can enable and use. • (Some add-ins require you to restart Excel or even reboot depending on your version and operating system)
Excel Statistical Add-ins • Once installed these new utilities will show up on the Excel toolbar ribbon under the Data Tab • Third party add-ins may show up as new tabs on the toolbar ribbon or be listed under the Add-In tab
Excel Add-Ins • Once installed a much wider suite of statistical tools will be enabled in Excel, including: • Anova: Single Factor, Two-Factor with or without Replication • Correlation and Covariance • Descriptive Statistics • Summary statistics Select if you want Excel to produce one field for each of the following statistics in the output table: Mean, Standard Error (of the mean), Median, Mode, Standard Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum, Count, Largest (#), Smallest (#), and Confidence Level. • Exponential Smoothing • F-Test Two-Sample for Variances • Fourier Analysis • Histogram • Moving Average • Random Number Generation • Rank and Percentile • Regression (for linear fits and normal probability plots) • Sampling • t-Test: Paired Two Sample For Means • t-Test: Two-Sample Assuming Equal Variances • t-Test: Two-Sample Assuming Unequal Variances • z-Test
Excel 2007 changes • Excel has undergone a number of changes in the 2007 Office version. While we have discussed a few, here is a partial list of some more important changes.
Excel Analyze tools Add-in • The Excel Add-in comes with a number of excellent analysis features. Coupled with the Analysis Services add-in Excel becomes a powerful entry level data analysis tool. • Additionally, under the ANALYZE tab, there are features such as: - Analyze Key Influencers - Detect categories - Fill from Example - Forecast - Highlight Exceptions - and Scenario Analysis
Excel Analyze • Once you are connected, the following features will be enabled in the ANALYZE tab of your Excel workbook
Excel Analyze Add-in • The Analyze Key Influencers tool lets you select a column that contains a desired outcome or target value, and then analyze the patterns in a data set to determine which factors had the strongest influence on the outcome. For example, if you have a customer list that includes a column that shows the total purchases for each customer over the past year, you could analyze the table to determine the customer demographics for your top purchasers. • The tool also lets you select a pair of possible outcomes and compare them. If the column that you are predicting contains discrete values such as "Purchase = Yes", "Purchase = No", or "Purchase = Yes but not now", you can compare the consumers who purchase immediately against those who purchase later, and those who purchase later against those who never purchase. • This helps determine possible decision-making factors.
Excel Analyze Add-in • The Detect Categories tool automatically finds rows in a table that have similar characteristics. • When the tool finishes, it creates a report that lists the categories it found, together with their distinguishing characteristics. By default it adds a new column to the data table that contains the proposed category. You can then review the categories and rename them.
Excel DM Add-in • The features of the Excel DM add-in suite include: - Explore data - Partition Data - Clean data - Estimate - Cluster - Associate - Forecast • We will examine some of these in more detail in labs following this lecture.
Excel DM Add-in Explore data • The Explore Data wizard lets you preview your data, and gather statistical information that is useful in cleanup or planning the data modeling phase Partition data • The Partition Data wizard is a fast, easy way to separate data into training and testing sets. It includes sampling options such as random sampling and oversampling.
Excel DM Add-in Clean data • There are two Clean Data wizards, the Outliers wizard and the Relabel wizard. These wizards help simplify data cleaning prior to data mining. • Another tool that is useful in data cleaning is the Fill From Example tool, in the Table Analysis Tools for Excel. This tool detects missing values and lets you either remove them or replace them with a mean, null, or other value. • The Remove Outliers wizard graphs the distribution of values in your data and lets you remove extreme values. • The Relabel Data wizard lets you work with the data values to make them easier to read and understand. For example, you can replace numeric values with text labels, or assign a group label to continuous data ranges.
Excel DM Add-in • Classify • The Classify Data wizard builds a classification model that predicts the values of one column based on values in other columns in the model. It uses the Microsoft Decision Trees algorithm, which is a classification and regression algorithm used for predictive modeling of both discrete and continuous attributes. • Estimate • The Estimate wizard builds an estimation model that extracts data patterns and uses the patterns to predict continuous numeric, date, or time values. It uses the Microsoft Decision Trees algorithm. • Cluster • The Cluster wizard builds a clustering model that detects groups of rows that share similar characteristics.
Excel DM Add-in Associate • The Associate wizard builds an association model that detects associations between items that appear in multiple transactions: for example, in market basket analysis. Forecast • The Forecast wizard builds a forecasting model that detects patterns in a series of cells, and then forecasts additional values. Advanced • Create mining Model • Add Model to Structure
Accuracy and Validation Accuracy Chart • The Accuracy Chart wizard helps you create a prediction query and assess the performance of a data mining model by creating a lift chart or scatter plot chart. The lift chart is important because it helps distinguish between models in a structure that are almost the same, to help you determine which model provides the best predictions.
Accuracy and Validation Classification Matrix • The Classification Matrix wizard helps you create a prediction query to assess the performance of a classification model. The output is a chart that summarizes both accurate and inaccurate predictions made by the model. The matrix is a valuable tool because it not only shows how frequently the model correctly predicted a value, but also shows which values the model most frequently predicted incorrectly. Profit Chart • The Profit Chart wizard helps you determine the costs and benefits of using a data mining model. It assesses the performance of a prediction, but also incorporates the cost of taking action based on the prediction, and estimates the return on investment
Enabling the Excel DM Suite • First install the tool bar (the same way the Stats Add-in was installed) • Then enable the add-in by clicking START > ALL PROGRAMS > SQL Server 2005 DM Addins > Sever Configuration Utility • Enable Temporary Session mining Models (these are objects that are created by SQL Server and Excel and are deleted after use. • Allow users to create new databases (unless you want to used an existing SQL Server database; finally enable permissions for users to create, modify and delete the objects in this DB. (The default DB name is DMAddinsDB)
Excel DM Suite • Setting up Data Analysis tables • Once SQL server is configured to enable Excel DM add-ins, open your Excel workbook. • Select a worksheet and highlight the data you want to convert to an Excel table (as shown on the previous slides on Excel tables; HOME > STYLES > FORMAT AS TABLE) • Next, connect to the server and database that you previously enabled. (On the ribbon: ANALYZE > CONNECTION > Add New Connection) Enter the server name and the local database that you specified when running the configuration utility. The connection to the SQL Server Analysis Services Utilities will be displayed in the Excel Ribbon
Excel DM - Addins • However, with DM features enabled, you also get the following new tools, as well. (Under the Data Mining Tab) • Notice: there are now many tools are your disposal. • However, you must first indicate the data you want to analyze. The method for selecting data is the same for all the DM tools, but I will use the classify tool as an example.
Excel DM - Addins • Click on the classify data icon to access the Classify data Wizard. The defaults are the Excel tables in your workbook or you and enter a range of data within a table. You can, however link directly to existing SQL Server databases to access tables, columns and rows of data in existing relational databases
Excel DM - Addins • To link to an analysis services data source you need to enter the Data Source name, Server name and catalog name. Once entered, you will be able to access all of the data in that data source by simply clicking on the field or columns that you want to import to Excel.
Excel DM - Addins • The Data Source Editor will automatically generate an SQL query to pull the specified data from the SQL Server database for use. You can accept the default SQL Query or edit it manually, if you wish. Once loaded you can use the various DM tools, but these tools will only work if the data has already been imported to excel.
Excel DM – Classification Tree • By simply selecting the default table in the sample database provided with the DM Addins (DMAddins_SampleData.xlsx) Source Data tab, the classification tool completes the creation of a classification tree (shown here)
Excel DM – Explore Data • Another useful initial tool in the DM suite is EXPLORE DATA. Again selecting data you get immediate results. • You can automatically • Generate distribution • histograms, with • a variable number of • buckets for numeric data • and a sorted Pareto chart • for non-numeric data. • This is orders of magnitude • easier than pivot tables • and histogram generation • using the Analysis add-in • tool