A useful but no longer documented procedure*
Dr. Arthur Tabachneck, Director, Data Management
* Based on a SAS-L post by Paul Dorfman
Suppose you had the following file:

    filename temp temp;
    data _null_;
      file temp;
      informat sentence $100.;
      input sentence &;
      put sentence;
      cards;
    Let's see if sas spell procdure can
    be used to verify whether tha seperate
    words in this, uhm, flie are, uhm, valid
    against a stantard internal dictionary
    and let's see how versatile it is
    ;
and you needed to find:
    the frequency that each word was used
    which words were misspelled
    suggestions for correcting any misspelled words
    only misspellings and non-standard acronyms
How to find: the frequency that each word was used

If you submit:

    proc spell in=temp nomaster;
    run;

you get:

    word          Freq   Line(s)
    Let's         1      1
    see           2      1, 5
    if            1      1
    sas           1      1
    spell         1      1
    procdure      1      1
    can           1      1
    be            1      2
    used          1      2
    to            1      2
    verify        1      2
    whether       1      2
    tha           1      2
    seperate      1      2
    words         1      3
    in            1      3
    this          1      3
    uhm           2      3 (2)
    flie          1      3
    are           1      3
    valid         1      3
    against       1      4
    a             1      4
    stantard      1      4
    internal      1      4
    dictionary    1      4
    and           1      5
    let's         1      5
    how           1      5
    versatile     1      5
    it            1      5
    is            1      5

Note that the count is case-sensitive: "Let's" (line 1) and "let's" (line 5) are listed as separate words.
How to find: the frequency that each word was used, ignoring case

If you submit:

    options caps;
    filename temp temp;
    data _null_;
      file temp;
      informat sentence $100.;
      input sentence &;
      put sentence;
      cards;
    Let's see if sas spell procdure can
    be used to verify whether tha seperate
    words in this, uhm, flie are, uhm, valid
    against a stantard internal dictionary
    and let's see how versatile it is
    ;
    proc spell in=temp nomaster;
    run;

you get:

    word          Freq   Line(s)
    LET'S         2      1, 5
    SEE           2      1, 5
    IF            1      1
    SAS           1      1
    SPELL         1      1
    PROCDURE      1      1
    CAN           1      1
    BE            1      2
    USED          1      2
    TO            1      2
    VERIFY        1      2
    WHETHER       1      2
    THA           1      2
    SEPERATE      1      2
    WORDS         1      3
    IN            1      3
    THIS          1      3
    UHM           2      3 (2)
    FLIE          1      3
    ARE           1      3
    VALID         1      3
    AGAINST       1      4
    A             1      4
    STANTARD      1      4
    INTERNAL      1      4
    DICTIONARY    1      4
    AND           1      5
    HOW           1      5
    VERSATILE     1      5
    IT            1      5
    IS            1      5

With OPTIONS CAPS in effect the data lines are converted to uppercase, so "Let's" and "let's" are now counted as one word (LET'S, frequency 2).
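Because PROC SPELL is no longer documented, it may be worth cross-checking its word counts with documented steps. The following is only a minimal sketch, not part of the original presentation: it assumes the fileref temp still points at the file written above, and the data set name (words), the variable names, and the delimiter list are all hypothetical choices. Its tokenization may not match PROC SPELL's exactly.

```sas
/* Assumption: fileref TEMP still points at the text file written earlier. */
data words (keep=word line);
  infile temp truncover;
  length text $100 word $40;
  input text $char100.;        /* read each line of the file as-is        */
  line = _n_;                  /* remember which line the word came from  */
  /* The delimiter list is an assumption; apostrophes stay inside words.  */
  do i = 1 to countw(text, ' ,.;:!?');
    word = upcase(scan(text, i, ' ,.;:!?'));  /* uppercase, like OPTIONS CAPS */
    output;
  end;
run;

/* One row per distinct word, most frequent first. */
proc freq data=words order=freq;
  tables word / nocum nopercent;
run;
```

Uppercasing each word mirrors the OPTIONS CAPS run; removing the upcase() call would give a case-sensitive count instead.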
How to find: which words were misspelled

If you submit:

    proc spell in=temp verify;
    run;

you get:

    Unrecognized word   Freq   Line(s)
    SAS                 1      1
    PROCDURE            1      1
    THA                 1      2
    SEPERATE            1      2
    UHM                 2      3 (2)
    FLIE                1      3
    STANTARD            1      4
How to find: suggestions for correcting any misspelled words

If you submit:

    proc spell in=temp suggest;
    run;

you get:

    Unrecognized word   Freq   Line(s)   Suggestions:
    SAS                 1      1         ASS, AS, GAS, HAS, PAS, WAS, SAD, SAG, SAP, SAT,
                                         SAW, SAY, SEAS, SPAS, SAGS, SAPS, SAWS, SAYS, SASH
    PROCDURE            1      1         PROCURE, PROCEDURE
    THA                 1      2         TEA, THE, THY, THAI, THAN, THAT, THAW
    SEPERATE            1      2         SEPARATE
    UHM                 2      3 (2)     HUM, OHM
    FLIE                1      3         FILE, LIE, FIE, FLEE, FLIP, FLIT, FLIED, FLIER, FLIES
    STANTARD            1      4         STANDARD
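How to find: only misspellings and non-standard acronyms

First write the words you want accepted (here, "sas") to a file, build a custom dictionary from that file, and then verify the text against that dictionary. If you submit:

    filename exclude temp;
    data _null_;
      file exclude;
      informat word $100.;
      input word;
      put word;
      cards;
    sas
    ;
    proc spell words=exclude create dict=work.mycat.words;
    run;
    proc spell in=temp dict=work.mycat.words verify;
    run;

you get:

    word          Freq   Line(s)
    PROCDURE      1      1
    THA           1      2
    SEPERATE      1      2
    UHM           2      3 (2)
    FLIE          1      3
    STANTARD      1      4

SAS no longer appears in the list, because it is now in the custom dictionary.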
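Before relying on the custom dictionary, it can help to confirm that the create step actually wrote an entry to the catalog. A minimal sketch using the documented PROC CATALOG, assuming the catalog name work.mycat used above; the entry type that PROC SPELL assigns is not stated in the source, so the listing is simply inspected by eye:

```sas
/* List the entries in WORK.MYCAT; the WORDS entry should appear  */
/* if the PROC SPELL ... CREATE step succeeded.                   */
proc catalog catalog=work.mycat;
  contents;
run;
quit;
```

If the catalog does not exist, PROC CATALOG reports an error, which points to a failed create step.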
Questions?

Your comments and questions are valued and encouraged.

Contact the author:
Dr. Arthur Tabachneck
Director, Data Management
Insurance Bureau of Canada
Toronto, Ontario L3T 5K9
Email: atabachneck@ibc.ca