1. An IF command in the DATA step. For example, the following program (select1.txt) would print the pulse data for males only:2. A WHERE command in the PROC step. For example, the following program (select2.txt) would print the data for males only:DM 'CLEAR LOG'; DM 'CLEAR OUTPUT'; OPTIONS LINESIZE=72 NODATE NONUMBER; DATA file1; INFILE 'a:\pulsedat.txt'; INPUT pulse1 pulse2 ran smoker gender height weight activity; IF gender = 1; /* equivalently IF gender = 2 THEN DELETE;*/ PROC PRINT; TITLE 'The pulse data for males only'; RUN;DM 'CLEAR LOG'; DM 'CLEAR OUTPUT'; OPTIONS LINESIZE=72 NODATE NONUMBER; DATA file1; INFILE 'a:\pulsedat.txt'; INPUT pulse1 pulse2 ran smoker gender height weight activity; PROC PRINT; WHERE gender=1; /*equivalently WHERE gender NE 2; */ RUN;
DM 'CLEAR LOG'; DM 'CLEAR OUTPUT'; OPTIONS LINESIZE=72 NODATE NONUMBER; DATA file1; INFILE 'a:\pulsedat.txt'; INPUT pulse1 pulse2 ran smoker gender height weight activity; DATA subset; SET file1; IF ran = 1; PROC PRINT DATA=file1; /* prints the entire data set */ TITLE 'The entire data set'; PROC MEANS DATA=subset; /* For runners only prints sample size, */ VAR pulse1; /* the mean, standard deviation, min, */ TITLE 'Runners only'; /* and max for pulse1 */ RUN;
would output the mean, standard deviation, minimum, and maximum of the variable pulse1 for each gender.PROC MEANS; VAR pulse1; BY gender;
The use of a BY statement in SAS procedures, other than PROC SORT, require that the data are sorted. That is, the values of the variables are ordered in descending (largest to smallest) or ascending (smallest to largest) order. PROC SORT is used to sort data in SAS. This procedure can be used to sort numeric or character data (A to Z). The following example illustrates the use of PROC SORT on the raw data set pulsedat.txt described in Lesson 3:
Click sort1.txt, use EDIT, SELECT ALL, COPY, paste this program into the SAS program window and run it. You must have pulsedat.txt on the floppy disk in drive a for the program to run successfully.DM 'CLEAR LOG'; DM 'CLEAR OUTPUT'; OPTIONS LINESIZE=72 NODATE NONUMBER; DATA file1; INFILE 'a:\pulsedat.txt'; INPUT pulse1 pulse2 ran smoker gender height weight activity; PROC SORT DATA=file1; BY gender weight pulse1; PROC PRINT; VAR gender weight pulse1 pulse2 height activity ran smoker; RUN;
Examine the Log window and Output window carefully to understand the sorting that occurred. Note that SAS sorts first by gender (smallest 1= male to largest 2= female), then by weight (smallest to largest), then by pulse1 (smallest to largest).
If you have the SAS permanent file LIBRARY.pulse with variable labels and value labels you created in Lesson 3 on a floppy disk, you can run the same sort as above by Clicking sort2.txt, use EDIT, SELECT ALL, COPY, paste this program into the SAS program window and run it. Be sure to have your diskette with LIBRARY.pulse in the floppy drive.
The example program below illustrates the use of PROC SORT followed by PROC MEANS using a BY subcommand on the pulse data set:
If you have your copy of LIBRARY.pulse on diskette, click means1.txt, use EDIT, SELECT ALL, COPY, paste this program into the SAS Program window and run it. Examine the Log window to see how SAS signifies the sorts and examine the Output window to view the kind of results it gives.DM 'CLEAR LOG'; DM 'CLEAR OUTPUT'; OPTIONS LINESIZE=72 NODATE NONUMBER; LIBNAME LIBRARY 'a:\'; PROC SORT DATA=LIBRARY.pulse;/* Sorts the data by gender. */ BY gender; PROC MEANS; /* Finds the sample size, mean, standard */ VAR pulse1; /* deviation, minimum, and maximum of */ BY gender; /* pulse1 for each gender. */ PROC SORT; /* Sorts the data by gender and activity.*/ BY gender activity; PROC MEANS; /* Finds the sample size, mean, standard */ VAR pulse1; /* deviation, minimum, and maximum of */ BY gender activity; /* pulse1 for each gender and activity */ RUN; /* combination. */
1. Answer the following questions concerning PROC SORT:
a. Is the default to order values in ascending or descending order?2. Using the permanent SAS data set LIBRARY.pulse described in Lesson 3, do parts a and b below using a single SAS program:b. How do you change the default ordering?
c. If missing values occur in a data set for a variable and the data is sorted in ascending order on that variable, where do the missing values appear, at the beginning or at the end of the list? You should try this by editing pulsedat.txt to include some missing values (.) in pulse1 then try a sort on pulse1. A narrative answer is all that is required for the homework.
a. Use PROC MEANS (set linesize =110) to output the average height of males that ran in place (a data subset). Include a title in this procedure which describes what is being done; see Lesson 1, and3. Without sorting the data (see section 5.9; CLASS subcommand) write a single PROC MEANS command with options which outputs the sample size, mean, and standard error of the mean for weight, height, and bulk=weight/height for each gender. You should run your command to test it but you do not have to turn in the program in your homework; you need only include the command in your write up.b. Using the entire data set, use PROC TABULATE to produce two tables; one for smokers and one for non smokers. Each table should present the sample sizes and means for weight for each combination of gender and activity. Include a title in the procedure to describe what is being done. You must use a PROC SORT with BY smoker as a start. Then the following PROC TABULATE commands:
PROC TABULATE DATA = your temporary file name;
VAR weight;
CLASS gender activity;
TABLE = gender* activity * weight*(N, MEAN);
BY smoker;