Data Mining & Business Planning of Engineering/Research Projects, Practice 4. Dr. Gábor Pauler, Associate Professor, Faculty of Sciences, University of Pécs. Tel: 30/9015-488. E-mail: pauler@t-online.hu


  3. Content Design of Questionnaires: Sections structure 1
• We use the CarSculpturers Questionnaire, with 9 sections on 4 pages, as an example of how to build your own questionnaire.
• Section 1 introduces the respondent to the topic. It contains 8 attitude questions on a 6-degree agreement scale. They are statements that were hotly debated in the focus group (FG).
  • Avoid trivial attitude questions: everybody will agree (or disagree) with them, so they yield no real information: 46. Government should lower taxes
  • A good attitude question always sets the benefits of an action against its costs, in a way that affects the targeted customers personally: 46. Government should lower taxes even if it may increase my tuition fee
  • Don't "push forward" your favorite attitude with several one-sided questions, because it will distort the analysis: 46. Grandmothers should be cool 47. Grandmothers should bungee-jump 48. Grandmothers should roller-blade
  • Ask about all sides of the problem in a balanced way: 47. Grandma should keep up with children 48. Grandma has serious health problems. Factor Analysis (Faktor Analízis), used to process the attitude questions, will show a "cool grandmahood" factor if it is really there!
• Section 2 asks for the importance of 13 product features collected by the FG. They will be used in Cluster Analysis (Klaszter Analízis) to set up market segments. You should not group customers by how much they love you (your product can be bad), but by their general product feature preferences!

  4. Content Design of Questionnaires: Sections structure 2
• Section 3 asks for evaluation of 10 competitor products × 7 features = 70 fields. This is the most exhausting part, so we placed it right after the introduction, and we let respondents evaluate features on the 5-grade school scale (don't forget to clarify its meaning!)
  • You should also state that an evaluation should be filled in only if there is real experience. This can have a "boomerang" effect: every youngster tries to fill in everything to show off experience with cars
• Section 4 is a relaxation between the more difficult parts. It asks about media usage in 3 multi-response questions, forming 28 binary fields in the database in total.
  • When media usage is more important than it is here (e.g. TV market research), there can be a binary variable for each channel × topic × time period combination!
• Section 5 is the most sensitive part, asking about intention to buy a car and Buying Power (Vásárlóerő). Therefore we start with a very neutral section header and gradually narrow down the topic (see 37, 38, 39) to avoid rejection or overstating of buying power (in the FG, most youngsters intended to buy a Volvo S80 or BMW X5…)
  • To keep respondents in the real world, 38 asks for the car budget and 40 asks about knowledge of the base price of the desired car
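The multi-response questions of Section 4 can be stored as one binary field per alternative. The following is a minimal sketch of that encoding; the option names (`MedTV`, `MedRadio`, ...) are hypothetical and do not come from the CarSculpturers questionnaire.

```python
def encode_multi_response(options, ticked):
    """Turn one multi-response answer into one binary (0/1) field per option."""
    unknown = set(ticked) - set(options)
    if unknown:
        raise ValueError(f"Unknown options ticked: {sorted(unknown)}")
    return {option: int(option in ticked) for option in options}


# Hypothetical option list for a single media-usage question
MEDIA_OPTIONS = ["MedTV", "MedRadio", "MedPrint", "MedWeb"]

# A respondent who ticked two of the four boxes
row = encode_multi_response(MEDIA_OPTIONS, ticked={"MedTV", "MedWeb"})
print(row)  # {'MedTV': 1, 'MedRadio': 0, 'MedPrint': 0, 'MedWeb': 1}
```

Every option gets a field even when it is not ticked, so all respondents share the same table structure.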

  5. Content Design of Questionnaires: Sections structure 3
• Section 6 asks about intended spending on extras, grouped in 4 groups (Comfort, Safety, Security, Design), in 5 categories to prevent under- or overstating. The financial consistency of the respondent can be measured as the ratio of the gap between car budget and base price to the extra spending:
  \[ \text{Consistency\%} = \frac{\text{Budget} - \text{Base}}{\text{Comf} + \text{Safe} + \text{Secur} + \text{Design}} \tag{4.8} \]
  • This way we avoid using control questions, and consistency is measured from 6 questions, which is harder for fake respondents to cross-check and re-compute
• Section 7 asks for evaluation of innovative product ideas. Be careful not to push your ideas! A questionnaire is NOT a trade presentation; otherwise you will get distorted results. Each idea should be described in 1 objective sentence. Most of them are evaluated financially:
  • There is always a "Not buy" option first, to leave the possibility of rejection
  • Then ask whether he/she would buy at a low, medium, or high unit price (prices and a suitable price unit come from the FG)
  • This will be matched with the planned period (<1 yr, 1..3 yr, >3 yr) of car buying
• As this section is pretty difficult to fill in, in 58 we give the respondent an opportunity to write his/her own ideas in an open question. This is purely for motivation, and we do not process it (most of the time respondents wrote a re-worded version of the ideas already there)
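As a worked example of equation (4.8), the sketch below computes the consistency ratio for one respondent. The function name and the sample amounts are illustrative assumptions; the slide does not say how the ratio should be thresholded, so no cut-off is applied here.

```python
def consistency_ratio(budget, base_price, comfort, safety, security, design):
    """Equation (4.8): (Budget - Base) / (Comf + Safe + Secur + Design)."""
    extras = comfort + safety + security + design
    if extras == 0:
        return None  # ratio is undefined when no extras are planned
    return (budget - base_price) / extras


# Hypothetical respondent, all amounts in the same currency unit
ratio = consistency_ratio(budget=6000, base_price=5000,
                          comfort=400, safety=300, security=200, design=100)
print(f"Consistency: {ratio:.0%}")  # Consistency: 100%
```

In this made-up case the planned extras exactly fill the gap between budget and base price, so the ratio is 1 (100%).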

  6. Content Design of Questionnaires: Sections structure 4
• Section 8 asks about car ads. Open questions 59-62 are not processed; they just help to refresh the respondent's memory for answering the multi-response question 63, which contains 6 binary fields
• Section 9 asks for the most sensitive socio-demographic data, where the danger of refusing to answer is the biggest. Therefore we used the following techniques:
  • The section is placed at the end of the questionnaire, when the respondent has already invested time in filling it in, and may also be tied to the topic by his/her own ideas given in 58
  • At the beginning of the section, the respondent is assured once again that the data will be processed in a personally non-trackable format, and that we do not pass it on to anyone
  • Moreover, the interviewer cites the legal rule that protects respondents (in Hungary: Chapter II of Act LXVI of 1992, "A polgárok személyi adatainak és lakcímének nyilvántartásáról" (On the registration of citizens' personal data and addresses), which sounds good but is not really serious: everything is forbidden, but there are no serious sanctions enforcing those limits)
• The questionnaire was tested on paper, on a 10-person sample of people who did not take part in the design, to avoid the "thinking inside the box" effect. Based on the test query:
  • The number of features queried in Section 3 was decreased from 10 to 7
  • Education levels in Section 9 were reworded to be more understandable.

  8. PaulerSoft™ BusinessPlanner 2.0: Required questions 1
• To make demand estimation work, the questionnaire has to include the following set of questions.
• In the introduction of the "Product ideas" chapter of the questionnaire, "how valuable they are" is the key phrase. It is followed by a short and OBJECTIVE introduction of the product idea. The questionnaire IS NOT a promotion tool. Even slight pressure on respondents will totally distort the results!
• The first question is about the unit price of the product, using the product unit given by the focus group. The first price is always 0 (don't buy), followed by the low, medium, and high prices set up in the focus group.
• The second question is about the product lifecycle, presented in simple English (you should not bombard respondents with professional terms in the questionnaire). The first alternative here is also "don't buy". Then respondents can select the lifecycle model that best fits their behavior.

  9. PaulerSoft™ BusinessPlanner 2.0: Required questions 2
• The third question is about the quantity of product units planned to be purchased in 5 consecutive future consumption periods (Fogyasztási Periódus). A consumption period is the shortest time in which a consumer on the targeted market is able to make autonomous purchasing decisions. Its length depends highly on the type of product (e.g. bread: 1 day … buildings: 5-25 years). It can be identical to the financial design period (Pénzügyi Tervezési Periódus) of the client, or a multiple of it.
• The length of the consumption periods should be selected so that 5 periods cover the possible major turning points in the product lifecycle (take-off, peak, decay, exit)
• Important: we always offer the respondent the possibility of giving zero quantities in case he/she will not purchase, to avoid putting pressure on him/her
• Consistency of the respondent can be checked by comparing the future quantities with the lifecycle pattern selected (e.g. if increasing quantities AND "buy once" are selected, we should discard the respondent)
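The consistency check between the selected lifecycle pattern and the planned quantities can be sketched as follows. The pattern labels (`no_buy`, `buy_once`, `increasing`) and the exact rules are assumptions made for illustration; the slide only gives the "increasing quantities AND buy once" case as an example of an inconsistent respondent.

```python
def is_consistent(pattern, quantities):
    """Compare 5 planned period quantities with the selected lifecycle pattern."""
    bought = [q for q in quantities if q > 0]
    if pattern == "no_buy":
        return not bought                      # no purchase in any period
    if pattern == "buy_once":
        return len(bought) == 1                # exactly one period with purchase
    if pattern == "increasing":
        never_drops = all(a <= b for a, b in zip(quantities, quantities[1:]))
        return never_drops and quantities[-1] > quantities[0]
    return bool(bought)                        # other patterns: some purchase


# The inconsistent case from the slide: "buy once" with increasing quantities
print(is_consistent("buy_once", [1, 2, 3, 4, 5]))    # False -> discard respondent
print(is_consistent("buy_once", [0, 1, 0, 0, 0]))    # True
```

A real implementation would need one rule per lifecycle alternative offered on the questionnaire.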

  10. Content Design: Paper-specific Design: Document Template
• QuestionnaireTemplate.doc is a document template for preparing paper-based questionnaires. Never start editing a questionnaire from a blank document, because it will take forever! Rely heavily on the pre-set styles and hot keys:
  • In the Header (Fejléc), there is the survey name and an auto-date field
  • The title of the questionnaire is in Title (Cím) style; its hot key is Alt+T
  • Explanation texts are in Normal (Normál szöveg) style; hot key: Alt+N
  • The questionnaire can be broken into sections/subsections with Heading 1-2 (Címsor 1-2) styles; hot keys: Alt+1, Alt+2
  • Questions are in Numbered List 1 (Sorszámozott lista 1) style; hot key: Alt+Q. They are numbered sequentially through the whole document. Don't forget to state very clearly whether each one is a single- or multi-choice question!
  • Alternatives are in Bulleted List 2 (Pontozott lista 2) style; hot key: Alt+Y. Their numbering should restart at 1 within each question, but Word cannot do this automatically. Therefore right-click the first alternative and choose the Restart Numbering (Számozás újrakezdése) menu item.
  • Boxes are placed automatically, and they can be switched between the O (single-response) and □ (multi-response) characters in the Format|List|Custom|Bullet (Formáz|Felsorolás|Egyedi|Listajel) menu
  • If there are more than 3 alternatives within a question, the list should be broken into 2 columns to save space. This can be done only manually, in the Format|Columns|Columns=2 (Formátum|Hasábok|Hasábok=2) menu
  • Messages are in Message Header (Üzenetfejléc) style; hot key: Alt+M
  • In the Footer (Lábléc), there is an auto page-number field and a TURN TO NEXT PAGE! warning

  11. Content Design: Paper-specific Design: Dead Node trick
• A question frequently has logical interdependencies with other questions that make them invalid (e.g. if it turns out that the respondent is male, there is no sense in asking about own pregnancies)
• On electronic forms, this is easily solved by question control rules, but on paper it is a more serious problem: you have to give the respondent a Jump order (Ugrás utasítás) depending on what he/she answered, using Message Header style (Alt+M), to avoid invalid questions
• If the logical interdependencies Branch (Elágaz) in a Decision Tree (Döntési fa), you have to use numerous jumps, which simple respondents cannot follow
• If the decision tree has a special format called Onion Skin Structure (Hagymahéj-Szerkezet), i.e. nested preconditions building on each other sequentially (e.g. at car purchase):
  • Has no car yet?
  • Has driving license?
  • Has the money?
  • Cannot borrow a car?
• Then we can use the Dead Node (Halott Ág) technique to simplify the logical structure of the paper-based questionnaire: every question has an alternative 0 that channels together all responses making the whole process invalid. This way we avoid a lot of jumps
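At processing time, the Dead Node technique means that a respondent's onion-skin chain is invalid as soon as any question in it carries alternative 0. A minimal sketch, assuming hypothetical field names for the four car-purchase preconditions and assuming that any non-zero code is a regular answer:

```python
# Hypothetical field names for the onion-skin chain, in questionnaire order
ONION_CHAIN = ["HasNoCar", "HasLicense", "HasMoney", "CantBorrow"]


def first_dead_node(answers):
    """Return the first chain question answered with alternative 0, else None."""
    for field in ONION_CHAIN:
        if answers.get(field) == 0:
            return field
    return None


def chain_is_valid(answers):
    """The whole chain counts only if no question hit the dead node."""
    return first_dead_node(answers) is None


# Respondent without a driving license: everything after that is alternative 0
respondent = {"HasNoCar": 1, "HasLicense": 0, "HasMoney": 0, "CantBorrow": 0}
print(first_dead_node(respondent))  # HasLicense
print(chain_is_valid(respondent))   # False
```

Because the invalid paths all end in the same code, the paper form needs no jump instructions and the filter is a single comparison per field.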

  12. Content of the Practice
• Checking Home Assignment 3: Focus group
• Primary quantitative research tool: Survey
• Content design of Questionnaires
  • Sections Structure: CarSculpturers Questionnaire
  • Required questions for estimating demand/revenue/profit
  • Paper-specific design: Document Template, Dead Node trick
  • Test query
  • Generating Final Field List
• Home Assignment 4: Paper-based questionnaire design
• References

  13. Content Design: Test query, Final Field List 1
• After finishing the prototype of the questionnaire, we organize a Test Query (Teszt Lekérdezés) with 5-10 respondents:
  • We measure the Average Fill Time (Átlagos Kitöltési Idő): this is important for calculating the interviewer's fee or for informing respondents about it in the cover letter
  • Test respondents should not have participated in the design, to avoid the "thinking inside the box" effect
  • Ambiguous questions are filtered out with Protocol Analysis (Protokoll-Analízis): the questionnaire designers monitor the test respondents during filling. If a respondent spends an unusually long time answering a question, or gives strange answers, the filling (and the time measurement) is suspended, and the respondent is asked to explain how he/she was thinking.
• After correcting the questionnaire, we prepare the Final Field List (Végleges Mezőlista) (see CarSculpturersFieldList), which is the basis for designing the survey database.

  14. Content Design: Final Field List 2
• Database Manager Software (Adatbáziskezelő) (e.g. Oracle, MSSQL, MySQL, Access) stores the following information about fields in the Database Table Structure (Adatbázistábla Struktúra):
• Field Names (Mezőnevek):
  • Should be 8±3 characters long; very short names (e.g. V1, V2, V3) are not informative
  • Too long names (e.g. ExtendedPayBackPeriod) make SQL coding difficult later
  • Use meaningful English abbreviations (e.g. ExtPayBPer) even if the questionnaire is in a different native language, because non-native speakers may also analyze it later!
  • Use a name Prefix (Előtag) to denote the questionnaire section (e.g. Importances: ImpEngin, ImpBody, ImpFuel). This way the fields of a section are kept together even if the field name list is re-ordered from physical order to alphabetic order (as in MS Query)
  • The parts of a multi-part field name can be separated with a starting capital (e.g. ExtPayBPer) or with underscores (e.g. ext_pay_b_per)
  • Don't use special characters (except underscore) or Accented (Ékezetes) characters, and don't start names with a number (this may work in MS Excel and Access, but Oracle or SPSS won't accept it)
• Field Type (Mezőtípus), Storage Size (Tárolási méret), Range (Értékhatár)
• Field Description/Label (Mező szöveges leírása/címkéje, Változó címke): it should match the question text on the questionnaire
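The field-naming rules above can be checked mechanically. The sketch below reads "8±3" as a length of 5 to 11 characters and additionally requires names to start with a letter; both readings are assumptions, and the function name is hypothetical.

```python
import re

# ASCII letters, digits, underscore; must start with a letter (assumed rule)
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_field_name(name, min_len=5, max_len=11):
    """Return a list of rule violations for a proposed field name."""
    problems = []
    if len(name) < min_len:
        problems.append("too short")
    elif len(name) > max_len:
        problems.append("too long")
    if NAME_PATTERN.fullmatch(name) is None:
        problems.append("invalid characters or leading digit")
    return problems


for candidate in ["ExtPayBPer", "V1", "ExtendedPayBackPeriod", "1stCar", "Életkor"]:
    print(candidate, check_field_name(candidate) or "OK")
```

Note that the underscore example from the slide, `ext_pay_b_per`, is 13 characters long and would be flagged as too long under this reading of the length rule.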

  15. Content Design: Final Field List 3
• Statistical software (e.g. STAT, SAS, SPSS) can store some extra information about fields:
  • Field Number (Mező Sorszám): question fields are numbered in steps of 10 (e.g. Question 46 = 460), and the multi-response binary fields are numbered within the question's range of 10 (e.g. Alt1 = 461, Alt2 = 462…)
  • Value Labels (Értékek Címkéi): for discrete variables (e.g. 1: Male, 2: Female; statisticians are NOT feminists!)
  • Missing Values (Hiányzó Értékek): in most database systems, missing values are denoted with Null (DO NOT CONFUSE it with 0: Zero, which is a numeric value!!!), but there can be extra missing codes denoting the reason (e.g. 99: Data corruption, 98: Recording error, 97: Refused answer)
  • Scale Type (Skálatípus): it determines which operations can be performed on the field (# counting, ↓↑ ordering, ± addition/subtraction, / division):
    • Nominal (Nominális): # only. E.g. names or ID numbers: not a quantity!
    • Ordinal (Ordinális): #, ↓↑. E.g. education levels: unequal stages
    • Interval (Intervallum): #, ↓↑, ±. E.g. time: equally paced, but no 0 point
    • Ratio (Arányskála): #, ↓↑, ±, /. E.g. milk in pints: it has an absolute 0 point
• Extra field information stored in Enterprise Resource Planning, ERP (Vállalatirányítási, VIR) systems (e.g. SAP, Navision, Baan):
  • Measure Unit (Mértékegység): (e.g. none, %, pieces, $, $/piece) this is useful when difficult multi-dimensional computations are performed with the field
  • Text Labels (Szövegcímkék): translations of field/value labels into multiple languages, for global usage
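Two of the conventions above lend themselves to a short sketch: the field numbering scheme (Question 46 = 460, Alt1 = 461) and the separation of Null and reason codes from real values such as 0. The helper names are hypothetical, and Python's `None` stands in for the database Null.

```python
# Reason codes for missing data, as listed on the slide
MISSING_CODES = {99: "Data corruption", 98: "Recording error", 97: "Refused answer"}


def field_number(question, alternative=0):
    """Question fields step by 10; alternatives 1..9 fall inside that range."""
    if not 0 <= alternative <= 9:
        raise ValueError("alternative must be between 0 and 9")
    return question * 10 + alternative


def valid_values(values):
    """Drop Null (None) and missing codes, but keep 0, which is a real value."""
    return [v for v in values if v is not None and v not in MISSING_CODES]


print(field_number(46))                   # 460
print(field_number(46, alternative=2))    # 462
print(valid_values([1, 0, None, 97, 5]))  # [1, 0, 5]
```

This scheme leaves room for at most 9 alternatives per question; the slide does not say how longer alternative lists are numbered.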

  16. Home Assignment 4: Paper-based questionnaire design
• Project teams should design a paper-based questionnaire at least 4 pages long (use the CarSculpturers Questionnaire as a template to avoid forgetting important things), seeking answers to the following questions (1 point):
  • A: General attitudes toward the given product category, from the debated points of the FG (1: disagree, 6: agree)?
  • I: How important are the product features mentioned in the FG (1: less, 6: very)?
  • P: How does your planned product perform in the features (1: bad, 5: good)?
  • C: How well do competitors perform in the features (1: bad, 5: good)?
  • B: Where and when do customers usually buy the product category, and how much do they usually spend?
  • N: How will customers buy your planned product:
    • NP: How much are they willing to pay for one unit (suitable unit and prices come from the FG) (Only for free, Low, Medium, High price)?
    • NL: What do they think about the shopping pattern of the product? (No buy / worth buying only once / buy as long as it is fashionable / buy it constantly from time to time / buy it in increasing quantities)
    • NQ: How many product units do they plan to buy in 5 consecutive consumption periods (suitable period comes from the FG)?
    • NF: From which sources would they finance buying the product, and in which proportion?
  • M: On which medium can customers be reached, at which topic, at which time?
  • S: What is the customers' socio-demographic background (Gender, Age, Education, Housing, Hometown type, Occupation, Income)?
• Then the questionnaire should be test-queried with 5-8 external persons who DID NOT PARTICIPATE in the questionnaire design (1 point)
• Then correct the errors of the questionnaire (submit the list of corrections in a separate *.DOC) and assemble the variable list in Excel, similar to CarSculpturers FieldList (1 point)

  17. References
• Theory of questionnaire design:
  • http://www.hik.hu/tankonyvtar/site/books/b156/ch03s05s01.html
  • http://www.statpac.com/surveys/
  • http://www.quickmba.com/marketing/research/qdesign/
  • http://www.cc.gatech.edu/classes/cs6751_97_winter/Topics/quest-design/
  • http://www.surveysystem.com/sdesign.htm
  • http://piackutatas.lap.hu/
• Questionnaire planner software:
  • Pocket survey: http://www.pocketsurvey.info/ (30-day shareware)

