1 / 10

The New SAS Engine For CRSP

Benefits. Access native CRSPAccess Databases.Most SAS Dataset options work without modification (including keep, rename, and drop).Native SAS datasets and SASECRSP dataset can be used in the same statements.Procedures can directly access SASECRSP datasets.Supports random access on PERMNO.Suppor

mandel
Download Presentation

The New SAS Engine For CRSP

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. The New SAS Engine For CRSP Benefits of using SAS engine for CRSP Setting up the interface between SAS and CRSP Examples of use Performance Benchmarks

    2. Benefits Access native CRSPAccess Databases. Most SAS Dataset options work without modification (including keep, rename, and drop). Native SAS datasets and SASECRSP dataset can be used in the same statements. Procedures can directly access SASECRSP datasets. Supports random access on PERMNO. Supports global & PERMNO specific date restrictions. Supports functions and formats specific to CRSP trading calendars. Often requires no intermediate files. Saving coding time, CPU time, I/O, & Disk space.

    3. Setting up the Interface After CRSP Stock Data and SAS/ETSTM 8.2 are installed independently. Two notes on installation - CRSPDB_SASCAL environment variable must be set to you latest CRSP data. SASECRSP is currently only available for Windows and HP/UX. After that, the interface can be accessed using simple extensions to the SAS libname statement. For instance, To access CRSP daily or monthly sample database respectively libname dstk sasecrsp ‘c:\dasample\' setid=10; libname mstk sasecrsp ‘c:\masample\' setid=20; The data files being accessed are the same ones that are accessed by FORTRAN, ‘C’, ts_print, or stk_print. Therefore disk space is only needed for one installation regardless of how users access the data. This saves not only disk space, but labor each time new data is shipped.

    4. Printing Headers Example 1: Use of keep statement, and direct use of the proc print procedure proc print data = mstk.stkhead (keep = permno hcomnam begdt enddt); Obs PERMNO BEGDT ENDDT HCOMNAM 1 10107 19860331 20010731 MICROSOFT CORP 2 10823 19251231 19980529 KEYSPAN ENERGY CORP 3 12079 19251231 20010731 GENERAL MOTORS CORP 4 12490 19251231 20010731 INTERNATIONAL BUSINESS MACHS COR 5 14218 19251231 19890428 R J R NABISCO INC 6 14322 19251231 20010731 SEARS ROEBUCK & CO 7 15069 19251231 20010731 U S X MARATHON GROUP 8 18948 19300430 19961231 ROCKWELL INTERNATIONAL CORP 9 24360 19500731 20010731 KEYSPAN CORP 10 28353 19620131 20010731 RALSTON PURINA GROUP 11 52070 19710331 19841031 ELECTRONIC DATA SYS CORP 12 66931 19841031 19960628 GENERAL MOTORS CORP 13 68451 19851231 20010731 GENERAL MOTORS CORP 14 75228 19850830 20010731 IVAX CORP 15 75390 19881031 19990831 DEVON ENERGY CORP 16 76597 19910328 20001229 NABISCO GROUP HOLDING CORP 17 78946 19930226 19970530 DEAN WITTER DISCOVER & CO 18 79323 19930630 20010731 ALLSTATE CORP 19 79348 19930831 19950531 RALSTON CONTINENTAL BAKING GROUP 20 81191 19950131 20001229 NABISCO HOLDINGS CORP 21 83596 19960628 20010731 ELECTRONIC DATA SYSTEMS CORP 22 84381 19961231 20010731 ROCKWELL INTERNATIONAL CORP NEW 23 85349 19971031 20010731 ARVINMERITOR INC 24 86496 19990129 20010731 CONEXANT SYSTEMS INC 25 87137 19990831 20010731 DEVON ENERGY CORP NEW

    5. Average Returns Example 2: Demonstrates date restriction on a libname, direct use of proc means, and output of native SAS datasets It is easy to restrict data to a specified date range before passing it to SAS for calculation. To Calculate Average monthly returns for all the securities in the sample database for the period of January, 1996 – December, 2000, use the following commands. libname mth5yr sasecrsp 'c:\masample\' range='19960131-20001229' setid=20; proc means data=mth5yr.ret noprint; var ret; by permno; output mean=avgret out=avgret; run;

    6. Works with Proc SQL Example 3: Demonstrates SASECRSP and native SAS dataset in the same statement. /* Join with the company names and restrict to securities with exactly five years (60 months) of non-missing returns, and sort output by the avgret. */ proc sql; select a.permno, a.hcomnam, b.avgret from mth5yr.stkhead a, avgret b where a.permno=b.permno order and b._FREQ_ = 60 by avgret; run; PERMNO Company Name Header Returns __________________________________________________________________________ 14322 SEARS ROEBUCK & CO 0.004601 12079 GENERAL MOTORS CORP 0.010279 28353 RALSTON PURINA GROUP 0.013447 15069 U S X MARATHON GROUP 0.013855 79323 ALLSTATE CORP 0.018369 68451 GENERAL MOTORS CORP 0.022518 24360 KEYSPAN CORP 0.024243 75228 IVAX CORP 0.026936 12490 INTERNATIONAL BUSINESS MACHS COR 0.027325 10107 MICROSOFT CORP 0.031976

    7. Performance Metrics Example 4: The SASECRSP engines performs at near CRSPAccess ‘C’ native speed. This was run on the full December 2000 daily database on a local drive of Pentium III 733 MHz running Windows 2000. This example reads over 50 million return observations, and calculates the minimum, maximum, mean, and standard deviation in approximately 2 ˝ minutes. libname dlyall sasecrsp 'c:\da200012\' setid=10; proc means data=dlyall.ret; var ret; run; N Mean Std Dev Minimum Maximum ___________________________________________________________________ 51398971 0.000845599 0.0442346 -0.9716981 12.7777777 ___________________________________________________________________ (Reads 205 MB and over 51 million observations in 2 minutes and 34 seconds) WARNING: Defaulting to selecting all PERMNOs in CRSP database. NOTE: There were 52365201 observations read from the dataset DLYALL.RET. NOTE: PROCEDURE MEANS used: real time 2:34.78 cpu time 2:31.12

    8. Performance Metrics Example 5: This example demonstrates that random access by PERMNO is very efficient. Accessing the same underlying 51,000,000 observation dataset as the previous example, accessing just a single permno, takes just a faction of a second.

    9. Performance Metrics Example 6: This example demonstrates how a full pass of the header data can be made to restrict the data and then used to create an input list with date ranges. This input list is then used to restrict access to the 51,000,000 observation dataset from the first performance example, and it access just the securities and dates requested. Creating the list of securities, takes less than two seconds. The restricted list takes a little more than ˝ minute. /* This example has two parts. First it does a subset to just companies that delist due to mergers and creates a dataset of PERMNO, delist date, and 200 trading days before the delist date. The second part on the next screen, then then subsets based on that universe of perms */ data mrgdt (keep=permno date1 date2); set dlyall.stkhead; where dlstcd >= 200 and dlstcd <= 299; crspenddt = crspdicd(enddt,0); /* convert enddt to a CRSP calendar date */ date1 = crspdcid(crspenddt - 200); /* calculate 200 trading days prior to enddt and store in YYYYMMDD format */ if date1 ne . and date1 >= begdt; /* subsetting if - ignores companies that do not have 200 trading days of CRSP data available. */ date2 = enddt; run; NOTE: There were 7790 observations read from the dataset DLYALL.STKHEAD. WHERE ((dlstcd>=200 and dlstcd<=299)); NOTE: The dataset WORK.MRGDT has 7640 observations and 3 variables. NOTE: DATA statement used: real time 1.21 seconds cpu time 0.42 seconds

More Related