Performing Iterative Processing

Performing Iterative Processing

Many programming tasks require that blocks of code be run more than once. SAS provides several ways to accomplish this. This article covers DO groups and DO loops.

DO Groups

To demonstrate a DO group, we start with the data file grade.txt containing some information on students: their age, gender, midterm grade, quiz grade, and final exam grade.

You want to read values from this file and compute two new variables—age group (Agegrp) and a value (Grade) computed from the midterm and final exam grades. If the age is less than or equal to 39, you want to set Agegrp equal to the Younger group and the grade to be computed as a weighted average of the midterm grade (40%) and the final exam grade (60%). If the age is greater than 39, you want to set Agegrp equal to the Older group and compute the grade as a simple average of the midterm and final exam grades.??We can achieve this with IF-ELSE statements, but let's use DO and END statements to make the code more efficient and easier to read:?

data grades;?
length Gender $ 1?
? ? ? ?Quiz $ 2?
? ? ? ?AgeGrp $ 13;?
infile '/home/u41108928/Ron Cody Data/grades.txt' missover;?
input Age Gender Midterm Quiz FinalExam;?
if missing(Age) then delete;?

if Age le 39 then do;?
?Agegrp = 'Younger group';?
?Grade = .4*Midterm + .6*FinalExam;?
end;?

else if Age gt 39 then do;?
?Agegrp = 'Older group';?
?Grade = (Midterm + FinalExam)/2;?
end;?
run;?


title "Listing of GRADES";?


proc print data=grades noobs;?

run;        
No alt text provided for this image

All the statements between DO and END form a DO group. When the IF condition is true, all the statements in the DO group execute. A good way to think of this structure is “If the condition is true, do the following statements until you reach the end.” It is standard practice to indent all the statements in the DO group as shown here.?The DO group coding is not only more efficient than multiple IF statements, but it is also easier to read.


The Sum Statement?

There are two primary uses for a sum statement: one is to accumulate totals such as a month-to-date total and the other is to create a counter—a variable that is incremented by a fixed amount on each iteration of the DATA step.

Suppose you have a data set with one observation for each day of the week, and you want a program that will read in these values and compute a cumulative sum. The following program creates a test data set (Revenue) and attempts to create the cumulative sum.

data revenue;?
input Day : $3.?
Revenue : dollar6.;?
Total + Revenue;?
format Revenue Total dollar8.;?
datalines;?
?Mon $1,000?
?Tue $1,500?
?Wed .?
?Thu $2,000?
?Fri $3,000?
;
run;


title 'Using a sum statement to create a cumulative total';


proc print data= revenue;

run;        
No alt text provided for this image


Here is the output:


A sum statement takes the following form: variable + increment;

Notice there is a plus sign and no equal sign in this statement. That’s what identifies this as a sum statement to SAS. This statement does the following:

  • Variable is retained
  • Variable is initialized at 0
  • Missing values (of increment) are ignored?

Another very common use of a sum statement is to create counters, for example:?

data counter;?
input x;?
if missing(x) then MissCounter + 1;?
datalines;?
?2?
?.?
?7?
?.?
;
run;


title 'Using a sum statement to create a counter';


proc print data= counter;

run;        
No alt text provided for this image

As you can see from the output, MissCounter is counting the number of missing values for x.?


The Iterative DO Loop?

Although it is useful to execute a group of codes when a condition is true, there are times when you would like to execute a group of SAS statements multiple times.

You want to compute the total amount of money you will have if you start with $100 and invest it at a 3.75% interest rate for 3 years. (Yes, there are formulas for compound interest as well as SAS functions, but this makes for a good example.)

The OUTPUT statement is an instruction for SAS to write out an observation to the output data set. An output usually occurs automatically at the bottom of the DATA step. But here, you want to output an observation each time you compute a new Total. When you include an OUTPUT statement anywhere in a DATA step, SAS does not execute an automatic output at the bottom of the DATA step.?

data compound;?
Interest = .0375;?
Total = 100;?
do Year = 1 to 3;?
?Total + Interest*Total;?
?output;?
end;?
format Total dollar10.2;?
run;

title 'Total amount every year';

proc print data= compound;        

run;        

When this program executes, Year is first set to 1, the lower limit in the iterative DO range. All the statements up to the END statement are executed and Year is incremented by 1 (the default increment value). SAS then tests if the new value of the Year is between the lower and the upper limit (the value after the keyword TO). If it is, the statements in the DO group execute again; if not, the program continues at the first line following the END statement.?

No alt text provided for this image

Here is the output:



One form of an iterative DO statement follows:

do index-variable = start to stop by increment;         

If you leave off the increment, it defaults to 1.?

Another example to understand the DO loop: Suppose you want to generate a table of the integers from 1 to 10, along with their squares and square roots. An iterative DO loop makes simple work of this, as follows:?

data table;?
do n = 1 to 10;?
?Square = n*n;?
?SquareRoot = sqrt(n);?
?output;?
end;?
run;?


title "Table of Squares and Square Roots";?


proc print data=table noobs;?        

run;        
No alt text provided for this image

Notice that this program does not have any input data. It generates the value of n in the DO loop, computes the squares and square roots (SQRT is a square root function—it returns the square root of its argument), and outputs an observation to the data set. This continues for all the values from 1 to 10. Here is the output:?

What if you want a table where n has values of 0, 10, 20, 30, and so forth up to 100. The following DO statement does the trick:

do n = 0 to 100 by 10;         

DO loops can also count backward. For example, the following statement produces values of 10, 8, 6, 4, and 2 for the index variable:

do index = 10 to 1 by -2;         

When you use a negative increment value, the index value is decremented by this value for each iteration of the DO loop. When the value of the index variable is less than the stop value, the loop stops.

Here is another example. You have an equation relating X and Y and want a graph of Y versus X for values of X from –10 to +10. Again, an iterative DO loop makes this very easy:?

data equation;?
do X = -10 to 10 by .01;?
? ?Y = (2*X**3)-(5*X**2)+(15*X)-8;?
?output;?
end;?
run;?

symbol value=none interpol=sm width=2;?

title "Plot of Y versus X";
?
proc gplot data=equation;?
?plot Y * X;?        

run;        
No alt text provided for this image

The DO loop starts X at –10 and increments the value by .01 until X reaches +10. The OUTPUT statement inside the loop writes an observation out to the data set for each iteration of the loop.?PROC GPLOT is used to plot the line.


Other Forms of an Iterative DO Loop

SAS provides several other methods of specifying how a DO loop operates. You can provide a list of numeric or character values following the index variable. Here are some examples:

do x = 1,2,5,10; (values of x are: 1, 2, 5, and 10)

do month = 'Jan','Feb','Mar'; (values of month are: 'Jan', 'Feb', and 'Mar')

do n = 1,3, 5 to 9 by 2, 100 to 200 by 50; (values of n are: 1, 3, 5, 7, 9, 100, 150, and 200)??

If you use character values in the DO statement, the length of the first character value determines the storage length of the index variable. Therefore, you may need to use a LENGTH statement to set the storage length for the index variable. Using character values for DO loop indices can be especially useful, as illustrated by this example.?

data easyway;?
do Group = 'Placebo','Active';?
do Subj = 1 to 5;?
?input Score @;?
?output;?
end;?
end;?
datalines;?
?250 222 230 210 199?
?166 183 123 129 234?
;
run;

title 'Nesting of Loops';

proc print data= easyway;        

run;
        

This program demonstrates two things: first, you can use character values in a DO loop, and second, you can nest one DO loop inside another. The outer loop first sets Group equal to Placebo.

No alt text provided for this image

Next, the inner loop iterates five times, reading score values and outputting an observation each time. When the inner loop is finished, control returns to the top of the outer loop where Group is set to Active. Five more?values of Score are read and five observations are written to the SAS data set. Here is a listing of data set Easyway:?


In the next article on Iterative Processing, I discuss DO WHILE statements, and DO UNTIL statements.?

Till then happy learning!



Abhinav Reddy

Senior SDET Engineer @ Delta Dental | Selenium | Cypress | Playwright | Java | JavaScript | TypeScript | BDD Cucumber | AWS | Azure | MySQL | RestAssured | Postman | Appium | JMeter

3 年

Please also share daily posts of Python ????

回复

要查看或添加评论,请登录

Sunayana Pati的更多文章

  • Working with Dates: Part 2

    Working with Dates: Part 2

    In continuation from the previous article on working with dates, today's article will cover various date functions in…

  • Working with Dates: Part 1

    Working with Dates: Part 1

    Most data sets contain date information, such as date of birth or transaction date. SAS can read dates in almost any…

  • Performing Iterative Processing: Part 2

    Performing Iterative Processing: Part 2

    In continuation from the previous article on iterative processing, today's article will cover DO UNTIL statements and…

  • Performing Conditional Processing

    Performing Conditional Processing

    This article describes the tools that allow programs to “make decisions” based on data values. For example, you may…

  • Reding data from Excel files

    Reding data from Excel files

    It is quite common to be given a Microsoft Office Excel spreadsheet as your data source. Luckily, SAS has several…

  • Storing user-defined formats permanently

    Storing user-defined formats permanently

    If you have user-defined formats with permanent SAS data sets, it is important to make your formats permanent also…

  • Formats and Labels in SAS

    Formats and Labels in SAS

    LABELS If you are using SAS to produce listings and reports for others, you will want to make the output more readable…

  • SAS Libraries

    SAS Libraries

    When you write a DATA statement such as data test; SAS creates a temporary SAS data set called test. When you close…

  • Reading data using formatted inputs

    Reading data using formatted inputs

    Till now we have been reading data from external files; we used DATALINES to place data directly into the SAS program…

  • Reading data from Fixed-columns

    Reading data from Fixed-columns

    You know, how to read data from external files. Today I discuss how to read data from fixed columns.

社区洞察

其他会员也浏览了