600 likes | 941 Views
Welcome!!!. OPERATIONS RESEARCH SYSTEMS ANALYSIS TECHNOLOGIES AND APPLICATIONS COURSE (ORSA TAC). Course Director: John Zauner (804) 765-4256, DSN 539 john.zauner@us.army.mil
E N D
Welcome!!! OPERATIONS RESEARCH SYSTEMS ANALYSIS TECHNOLOGIES AND APPLICATIONS COURSE (ORSA TAC) Course Director: John Zauner (804) 765-4256, DSN 539 john.zauner@us.army.mil US Army Logistics University (ALU)College of Professional & Continuing EducationDepartment of Systems Engineering (DSE)Fort Lee, VA ORSA Courses offered by ALU/CPCE/DSE
ORSA TAC Course Objectives …to provide previously trained ORSA analysts with an exposure to some of the emerging technologies and applications of operations research related to military problem solving
Why focus on spreadsheet technology? • Availability • General knowledge by most analysts • Greater understanding by decision makers • Enhanced capability with add-in packages • Great for quick analysis/small to mid-size models
Course Materials • Course based on material from textbooks Practical Management Science and Managerial Statistics (Winston, Albright, Zappe) • Example based learning • EXCEL workbooks included • Covers more than we can cover in a one week class • Use of Decision Tools Suite software • @Risk, Stat Tools, PrecisionTree, TopRank plus others • http://www.palisade.com/decisiontools_suite/default.asp • Student CD contains: • All course slides and spreadsheet examples used in class • StatPro and Solver Table add-in packages plus others
ORSA TAC Schedule • Contains daily schedule • Contains instructor contact information • Contains subject breakout, textbook reference and examples • Contains links to all slides and spreadsheet examples
ORSA PROCESS“Strategy for a successful study” Define the problem Develop the analysis plan Gather / review data Crunch the numbers Analyze the results Prove the answer Package the results STUDY PLANNING (40%) ORSA TOOLS (20%) STUDY EXECUTION (40%)
Spreadsheet Modeling Fundamentals Finding a Breakeven Pointat Great Threads
Background InformationBreakeven Analysis at Great Threads • The Great Threads Company sells hand-knit sweaters. They want to print a product brochure for a direct mail campaign. • The costs of printing, mailing, and materials are provided. The average size of a customer order is $40, and the company’s variable cost per order averages around 80% of the order’s value. • The company plans to mail 100,000 catalogs. It wants to develop a spreadsheet model to answer the following questions: • How does a change in the response rate affect profit? (answer) • For what response rate does a company break even? (answer) • If the company estimates a response rate of 3%, should it proceed with the mailing? (answer) • How does the presence of uncertainty affect the usefulness of the model? (answer)
Basic Modeling Tenants Breakeven Analysis at Great Threads • Note the clear layout of the model • The input cells are outlined and shaded and separated from the outputs • Headings clearly define cells (bold, indented) • Formulas use cell or range name reference (very few numbers in formulas) • Appropriate numeric formats • Range names listed in text boxes to the right
Creating the ModelBreakeven Analysis at Great Threads • To create this model, proceed through the following steps: • Enter heading and range names • Enter input values • Model the responses (enter an estimate in the Response Rate cell – we’ll use 8%) • Model the revenues, costs and profits
Answering the Questions Breakeven Analysis at Great Threads • Now that a basic model has been created, we can answer the questions posed by the company. • For question 1, we form a data table to show how profit varies with the response rate. The table is shown here.
Creating a Data TableBreakeven Analysis at Great Threads • Enter a series of response rates (1% to 10%) in col A, and “link” them to the profit in cell B18 (=Profit) • Highlight the entire range (A18:B28) and select the Data Table option from the What If Analysis Icon (Data Tools Group on Data Ribbon) • Put cell $E$4 (ResponseRate) as the column input cell • The data table reports the change in profit as the response rate changes • Create a scatterplot (XY chart) of the values in the data table to provide a visual of the results
Answering the Questions Breakeven Analysis at Great Threads • Profit increases in a linear manner as response rate varies. A 1% increase in the response rate always increased profit by $7800. • Profit goes from negative to positive when the response rate is between 5% and 6%. • Use GoalSeek to answer Question 2 about the exact breakeven point • Goal Seek is useful for solving a single equation in a single unknown
Answering the Questions Breakeven Analysis at Great Threads • To implement Goal Seek, select Goal Seek from the What If Analysis Icon (set Profit to 0 by changing the Response Rate) • If the response rate is 5.77%, Great Threads breaks even • Question 3 asks if the company should proceed with the mailing if the response rate is only 3%. The short term view says “no”, but we may want to go ahead anyways. • Many customers who respond to direct mail will reorder in the future. If each of the respondents ordered two more times, the company would earn (3000)($7.80)(2) = $46,800 more than appears in the model, and profit would then be positive. • The moral is to look at the long-term impact of our decisions (which means a more complex model)
Answering the Questions Breakeven Analysis at Great Threads • Finally, question 4 asks about the impact of uncertainty in the model (all inputs are not known with certainty) • For example, a high degree of uncertainty about model inputs makes it difficult to talk about the profit level or the breakeven response rate. • It makes more sense to talk about the probability that profit will have a certain value or the probability that the company will break even.
Cleaning Input Data Cleaning Customer Data
Data CleaningCleaning Customer Data • Lists data on 1500 customers of a company • How much of this data set is usable? How much needs to be cleaned? • It is the analyst’s responsibility to correct data problems before doing any serious analysis (data cleaning)
Cleaning the DataCleaning Customer Data • All of the problems in this data are typical of real world data sets. • In cleaning this data we will look at the following variables: • Social Security Number(SSN) • Birthdate/Age • Region • CredCardUser • Income • Purchases and AmtSpent
Finding Duplicate Entries Cleaning Customer Data • For SSN, all 1500 customers are distinct people, so all 1500 SSNs should be different. • One of the simplest ways to check this is to sort on the SSN column • To check whether two adjacent SSN are identical, add formula IF(B5=B4,1,0) in cell J5 and copy down • Then enter the formula =SUM(J5:J1503) in cell J4 (you should find two duplicates) • Use Edit/Find menu item (set Look in: to “Values”) to find the exact records with duplicate SSNs
Cleaning Up DatesCleaning Customer Data • Sorting on birth date finds that the first 18 customers all have birthdate 05/17/27 • Excel’s dates are stored internally as integers • Select the 18 birth dates and use the Format/Cells menu item with the number option(and zero decimals) • The dates format to 9999, the “code” many analysts use for missing values • If the birth date is important for further analysis, these 18 customers should probably be deleted form the data set.
Cleaning Up DatesCleaning Customer Data • Sorting on age shows the first twelve customers with negative ages (all were born before 1930) • EXCEL assumes that any two-digit year from 00 to 29 is the 21st century; years from 30 to 99 are the 20th century • Fix by selecting the 12 birth dates , select Edit/Replace, set find what: to /202, set replace with: to /192 , click on “Replace All”
Fixing Text LabelsCleaning Customer Data • Sort on Region does not seem to work correctly • After a close look we will find that the top labels in these cells begin with a space (Fix with Edit/Replace) • For the CredCard user column (1 means user, 0 is non-user), taking an average gives a #DIV/0! Error • The problem is that the numbers were somehow input or transformed into text labels • Two ways to convert to numbers: • Put a 1 in an empty cell (make sure entered as number), select Edit/Copy, select cells to convert, select Edit/Paste Special and choose multiply operation (prior to Excel 2002) • Highlight range, click on Error Indicator (exclamation point in upper left corner of range), select convert to number
Examining Numeric Data Cleaning Customer Data • Sorting the Income column shows most values from $20,000 to $160,000 • Some values are much smaller (may have entered without 000) • Some values are 9999 (many times a code for a missing value) • Sorting by Purchases also shows 9999 for the records at the bottom for Purchases and Amount Spent (again probably a missing value code) • Put a blank row between records where 9999 starts for Purchases to separate from other records
Examining Data Relationships Cleaning Customer Data • Presumably there is a relationship between Purchases and Amount Spent • A scatterplot shows an upward trend with some outliers at the bottom of the plot • The outliers could represent the average amount spent per purchase instead of the total amount spent • If this is true, sort on AmtSpent and Purchases and multiply the AmtSpent by the number of purchases for the outlier values
Using EXCEL’s AutoFilter and Advanced Filter Querying Customer Recordsat Hytex
Background InformationQuerying Customer Records at HyTex • 1000 customers records of HyTex (direct marketing firm) • Data collected: • Age: (1 = 30 or younger, 2 = 31 to 55, 3 = 56 and older) • Gender: (1= males, 2 = females) • OwnHome: (1 = customer owns a home, 2 = don’t own home) • Married: (1 = customer currently married, 2 = not married) • Close: (1 = lives near shops that sells similar items, 2 = not close) • Salary: combined annual salary of customer and spouse (if any) • Children: number of children living with customer • History: (“NA” = no dealings with company in prior years, 1 = low spending customer, 2 = medium-spending, 3 = high-spending) • Catalogs: Number of catalogs sent to the customer this year • AmountSpent: Total amount of purchases by customer this year
Excel’s AutoFilter Tool • Excel’s autofilter tool performs simple queries on an existing Excel database easily • Terms that we will be used: • Database: a set of related data (each column is a field, each row is a record) • Top row contains field names • No empty rows • Query: a command that asks for the subset of a database that satisfies specified conditions
Using AutoFilterQuerying Customer Records at HyTex • To use the autofilter tool, make sure the cursor is anywhere within the database, and select the Filter Icon (on Data and Home Ribbons) • Click on the dropdown arrows next to a field name to view a list of choices • Choices are available to sort and filter (text or number) • Click on any value in any fields’ dropdown list to see only the records where that field equals the selected value
Using AutofilterQuerying Customer Records at HyTex • Records are temporarily hidden (not deleted) • Setting filter on multiple lists implies “and” condition • Select “All” from list to remove filter for that field • Select “Top Ten” to see top/bottom of items or top/bottom percent • To see the entire database: select clear option in Sort and Filter group
Complex Queries with AutofilterQuerying Customer Records at HyTex • The “Custom” item allows us to enter up to two conditions for any field (can be “and” or “or” conditions) • Choose from a number of types of conditions such as “greater than some value,” “contains some value,” “does not equal some value,” and others. • Use “wildcard” characters ? and * (useful for text data) • ? stands for any single character • * stands for a series of characters • Example: perform a query where Salary must be between $40,000 and $80,000, History must not be “NA” and Catalogs can either be 6 or 24
Performing Analysis on Query Results Querying Customer Records at HyTex • Once we obtain the results of a query we might want to summarize these results in some way. • If you use functions/add-ins, when you select the result cells from a query, all hidden rows are included • The trick is to copy and paste the results of the query to a new location.
Excel’s Advanced Filter Tool • Excel’s advanced filter tool can be used only after creating a “criteria range” • The “criteria range” consists of a top row of field names and one or more rows of conditions. • To use the advanced filter tool: • Copy the row of field names to any unused area (this is the top row of the criteria range) • Manually enter conditions in the cells just below those field names • Conditions in a given row are “and” conditions • Conditions across rows are treated in an “or” manner
Advanced Filter Example Querying Customer Records at HyTex • First row: all customers who are married and have salary at least $80,000 and have at least 2 children. • Second row: all customers who have salary $100,000 and received at least 12 catalogs. • Return all records that meet the first row or second row conditions
Works like Autofilter Entire database Include field names and all conditions Can specify upper left corner of range Advanced Filter Example Querying Customer Records at HyTex • Once the criteria range is created, we run the query by using the Data/Filter/Advanced Filter menu item. • The associated dialog box is completed as follows.
Another Advanced Filter ExampleQuerying Customer Records at HyTex • Suppose we want the customers who are either (1) male with salary between $40,000 and $50,000, or (2) female with salary over $70,000 • How should this condition be entered in the criteria range? It is tempting to enter the label ”>40000,<50000” in a cell under Salary. However, this doesn’t work. • One solution is to enter two Salary fields in the criteria range as shown on the next slide
Another ExampleQuerying Customer Records at HyTex • There is no rule in Excel that every field name must be included in the criteria range. Only those involved in the query are required. • Also the same field name can be included more than once, evidently to deal with the situation we have posed. • The criteria range above will return exactly the records we seek.
Importing External Data from Access Importing customer orders from a database for the Fine Shirt Company
Background InformationCustomer Orders for the Fine Shirt Company • The Fine Shirt Company has a database of historical customer retail shirt sales information that we want to analyze • There are three related tables in this database: Customers, Orders, and Products • Customer table (info on seven customers): CustomerNum, Name, Street, City, State, Zip, Phone • Products table (info on ten shirt types): ProductNum, Description, Gender, UnitPrice • Orders table (info on 2245 orders): OrderNum, CustomerNum, ProductNum, OrderDate, UnitsOrdered, Discount
Introduction to Relational Databases • Excel “databases” are flat files or lists • They are also called single-table databases, where table is the usual database terminology for a rectangular range of data • Flat files are fine for relatively simple database applications, but are not powerful enough for more complex applications • A relational database is a set of tables, where each table is a rectangular arrangement of fields and records, and the tables are linked explicitly. • Microsoft Query (EXCEL add-in) allows import of data from many database packages with very little work
Single Table Query Example Customer Orders for the Fine Shirt Company • Find all of the records from the Orders Table in the Shirtorders database that meet the following conditions: • Order date is during the years 1997 or 1998 • Product number is 3 or 5 • Number of units ordered is at least 100 • Return to Excel all fields in the Orders Table for these records
Importing Data with Microsoft QueryCustomer Orders for the Fine Shirt Company • Import is a three step process done in Excel and Query, not Access • Step 1: Define a “data source” • Select Data/Import External Data/New Database Query menu item, then select “<New Data Source>” from Database tab of Data Source dialog box • In the Create New Data Source dialog box, give a name to the data source, select appropriate driver (Microsoft Access), click Connect, then Select and choose the database file & click OK a couple of times to get back to the Data Source dialog box • Make sure that the data source you defined is highlighted in the Data Source dialog box, then click OK (should see Add Tables dialog box)
Importing Data with Microsoft QueryCustomer Orders for the Fine Shirt Company • Step 2: Specify the data you want • Select the appropriate table and click Add (click Close when completed) • The table should appear in the top part of the window • Double click on fields you want returned to EXCEL (click on * to return all fields) • The data should appear on the bottom part of the window • Click on the “Show/Hide Criteria” button on the toolbar to view the criteria section in the middle of the window
Importing Data with Microsoft QueryCustomer Orders for the Fine Shirt Company • Step 3: Specify criteria and return data to EXCEL • Entering criteria is similar to the Advanced Filter Tool • Can type in criteria or use Criteria/Add Criteria menu item • For Add Criteria dialog box, use pull-downs to select field and operator, click on Values to select values from records (or can type in values box) • If you type anything in, be careful with syntax • Can select more than one value in value box • Records in bottom part of window are filtered each time a new criteria is added • Select File/Return Data to Microsoft Excel and select location of data
Working with Imported DataCustomer Orders for the Fine Shirt Company • You can now analyze the data using any EXCEL features • The imported data are still linked to the query • With the cursor anywhere in the data, select the Data/Get External Data/Edit Query menu item • This takes you back into Microsoft Query, with your criteria showing • Any changes in criteria will be reflected in the Excel data • You can also select File/Save in Microsoft Query to save the query for later use (select Data/Get External Data/Run Saved Query)
Creating a Complex Query Customer Orders for the Fine Shirt Company • Find all records that meet the following conditions: • All orders for at least 80 units • Order made by the customer “Shirts R Us” (customer number 3) • Product is Long sleeve Tunic (product number 6) • Return the order date and units ordered for these orders. • The main difference is that we now have to base the query on all three tables in the database
Creating a Complex Query Customer Orders for the Fine Shirt Company • The trick is to use the links between the tables • Create a new query by clicking on the data source created from the previous example • Inside the Query screen, first add all three tables by using the Tables/Add Table menu item • Double click the fields you want to return to EXCEL • Create the appropriate criteria (fields will be displayed as table.field)
Exploring Data with Pivot Tables Examining effects on elevator rail diameters for Otis Elevator
Background InformationRail diameters for Otis Elevator • Spreadsheet contains the diameters (in inches) of elevator rails produced by Otis Elevator’s two machines and two operators • Each diameter corresponds to a particular machine/ operator combination. • What effects, if any, do the operator and machine have on elevator rail diameters?
Pivot Tables • Pivot tables provide an incredible amount of of useful information about a data set • Pivot tables allow us to break the data down into subpopulations • Statisticians often refer to the resulting tables as contingency tables or crosstabs
Creating a Pivot Table • Position the cursor in the data range • Select PivotTable from Insert Ribbon • Select the data source and type of report (table or chart/table) • Specify the range of the data set (should be set properly if you placed the cursor in the data range) • Drag appropriate fields to the PivotTable Field List on right side of workbook