SAS Arrays: The Secret to Efficient Data Reporting

Understanding SAS Arrays

Overview

A SAS array is a way to group multiple variables together under a single name. This makes it easier to work with these variables, especially when you need to perform the same task on all of them. Think of it as putting several items into a single box so you can carry them all at once.

Section 1: Basics of SAS Arrays

Definition

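A SAS array is a temporary grouping of variables, all numeric or all character, that exists only for the duration of the DATA step. It is not a new data structure stored in the dataset; it is a set of aliases that lets you apply the same operation to every variable in the group without writing repetitive code.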

Syntax

The basic syntax for creating an array in SAS is:

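ARRAY array_name {dimension} <array-elements>;
or
ARRAY array_name (*) list-of-variables;

  • array_name is the name you give to your array.
  • {dimension} specifies the number of elements in the array. Writing * instead tells SAS to count the variables in the list. Braces, brackets, and parentheses are interchangeable around the dimension.
  • <array-elements> is a list of variables to be included in the array. The angle brackets mark it as optional and are not typed. If the list is omitted, SAS creates new variables named array_name1, array_name2, and so on.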
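To make these forms concrete, here is a minimal sketch of three common declarations. The dataset and variable names (scores, test1-test3, name1-name2, bonus) are hypothetical and chosen only for illustration.

```sas
/* Hypothetical input: three numeric scores and two character names */
data scores;
    input test1 test2 test3 name1 $ name2 $;
    datalines;
80 90 70 ann bob
;
run;

data array_forms;
    set scores;
    /* Explicit dimension over three existing numeric variables */
    array tests {3} test1 test2 test3;
    /* (*) lets SAS count the elements; $ marks a character array */
    array names (*) $ name1 name2;
    /* No variable list: SAS creates bonus1-bonus3 as new numeric variables */
    array bonus {3};
    n_tests = dim(tests);
    n_names = dim(names);
run;
```

The array names themselves are not written to the output dataset; only the underlying variables (including the newly created bonus1-bonus3) are.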

Applications of SAS Arrays

  • Creating variables and performing calculations
  • Assigning initial values to an array
  • Restructuring a dataset
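The first two applications are illustrated later in the article. For the third, the following is a minimal sketch of restructuring a wide dataset (one column per quarter) into a long one (one row per quarter). The dataset names wide and long and the variables region, quarter, and amount are assumptions made for this example.

```sas
/* Hypothetical wide dataset: one row per region, one column per quarter */
data wide;
    input region $ qtr1-qtr4;
    datalines;
East 10 20 30 40
West 15 25 35 45
;
run;

/* Write one output row per region and quarter */
data long;
    set wide;
    array sales {4} qtr1-qtr4;
    do quarter = 1 to dim(sales);
        amount = sales{quarter};
        output;                  /* explicit OUTPUT inside the loop */
    end;
    keep region quarter amount;
run;
```

Each input row produces four output rows, so the two regions above yield eight rows in long.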

Example

Here's a simple example:

DATA example;
    SET input_data;
    ARRAY my_array(3) var1 var2 var3;
    DO i = 1 TO 3;
        my_array(i) = my_array(i) * 2;   /* double each variable in place */
    END;
    DROP i;   /* keep the loop counter out of the output dataset */
RUN;

In this example, we create an array called my_array with three variables: var1, var2, and var3. The DO loop multiplies each element by 2.

Section 2: Working with Arrays

Accessing Array Elements

Accessing elements in an array is straightforward. You use the array name and the index of the element you want to access.
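As a small illustration, the sketch below reads and writes individual elements by index. The dataset name access_demo and the variables first_value, second_value, and k are hypothetical.

```sas
data access_demo;
    var1 = 10; var2 = 20; var3 = 40;
    array my_array(3) var1 var2 var3;

    first_value = my_array(1);                 /* reads var1: 10 */
    my_array(3) = my_array(1) + my_array(2);   /* writes var3: 30 */

    k = 2;
    second_value = my_array(k);                /* the index can be a variable: 20 */
run;
```

An index outside the declared range (here, anything other than 1 to 3) stops the step with an "Array subscript out of range" error.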

DO Loops

DO loops are commonly used with arrays to iterate over each element. This is incredibly useful for tasks like data cleaning, transformation, and analysis.
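Hard-coding the upper bound of the loop works, but the loop must then be edited whenever the array changes. A minimal sketch, assuming a hypothetical dataset loop_demo with variables a, b, and c, uses the DIM function and the _NUMERIC_ variable list so that the loop adapts to the array size:

```sas
data loop_demo;
    a = -1; b = 5; c = -3;
    /* _NUMERIC_ covers the numeric variables defined so far: a, b, c */
    array nums(*) _numeric_;
    do i = 1 to dim(nums);           /* DIM returns the number of elements */
        if nums(i) < 0 then nums(i) = 0;
    end;
    drop i;
run;
```

Because i is first used after the ARRAY statement, it is not part of nums.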

Example

Here's a practical example:

DATA example;
    SET input_data;
    ARRAY my_array(3) var1 var2 var3;
    DO i = 1 TO 3;
        IF my_array(i) < 0 THEN my_array(i) = .;   /* negative becomes missing */
    END;
    DROP i;   /* keep the loop counter out of the output dataset */
RUN;

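This code replaces negative values in the array with missing values. Because SAS treats a numeric missing value as smaller than any number, elements that are already missing also satisfy the condition and simply stay missing.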

Use Cases of SAS Arrays

Example 1: Calculating Percentages

data test1;
    set testdata;
    array sales {4} qtr1 qtr2 qtr3 qtr4;
    array pct{4};
    total = sum(of sales{*});
    do i=1 to dim(sales);
        /* skip the division when total is zero or missing */
        if total not in (0, .) then pct{i} = sales{i} / total;
    end;
    drop i;
run;

Explanation:

  • array sales {4} qtr1 qtr2 qtr3 qtr4; creates an array named sales with four elements: qtr1, qtr2, qtr3, and qtr4.
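  • array pct{4}; creates an array named pct with four elements. Because no variables are listed, SAS creates four new numeric variables, pct1 to pct4, which are missing until the loop assigns them.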
  • total = sum(of sales{*}); calculates the sum of all elements in the sales array using the OF operator.
  • do i=1 to dim(sales); iterates over each element of the sales array.
  • pct{i} = sales{i} / total; calculates each quarter's share of the total and stores it in the corresponding element of the pct array. The result is a proportion; multiply by 100 to express it as a percentage.

Understanding the OF Operator:

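The OF operator in sum(of sales{*}) passes every element of the sales array to the function as one argument list, so you do not have to name qtr1 through qtr4 individually. It works the same way with other functions that accept a list of arguments, such as MEAN, MIN, and MAX.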
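The sketch below shows OF with a few other summary functions. The dataset name of_demo and the sample values are made up for illustration.

```sas
data of_demo;
    qtr1 = 10; qtr2 = 20; qtr3 = .; qtr4 = 30;
    array sales {4} qtr1-qtr4;

    total   = sum(of sales{*});     /* 60: SUM ignores the missing value */
    average = mean(of sales{*});    /* 20: mean of the three nonmissing values */
    highest = max(of sales{*});     /* 30 */
    n_miss  = nmiss(of sales{*});   /* 1 */

    /* OF also accepts a plain variable list */
    same_total = sum(of qtr1-qtr4); /* 60 */
run;
```

Leaving out OF changes the meaning: sum(qtr1-qtr4) is read as the single expression qtr1 minus qtr4, not as a list of four variables.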

Example 2: Comparing Values with Targets

data test;
    set testdata;
    array sales{*} qtr1-qtr4;
    array diff{4};
    array target{4} _TEMPORARY_ (12,18,17,15);
    do i=1 to dim(sales);
        diff{i} = sum(sales{i}, -target{i});
    end;
    drop i;
run;

Explanation:

  • array sales{*} qtr1-qtr4; creates an array named sales with elements qtr1 to qtr4.
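  • array diff{4}; creates an array named diff with four elements. Because no variables are listed, SAS creates four new numeric variables, diff1 to diff4.
  • array target{4} _TEMPORARY_ (12,18,17,15); creates a temporary array named target with initial values (12, 18, 17, 15). Temporary array elements have no variable names and are not written to the output dataset.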
  • do i=1 to dim(sales); iterates over each element of the sales array.
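  • diff{i} = sum(sales{i}, -target{i}); calculates the difference between the corresponding elements of sales and target and stores it in the diff array. Because the SUM function ignores missing values, a missing sales value produces the negative of the target rather than a missing difference.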
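The choice between the SUM function and the minus operator matters when a sales value is missing. The following sketch compares the two; the dataset name missing_demo, the arrays diff_sum and diff_minus, and the sample values are assumptions for this illustration.

```sas
data missing_demo;
    qtr1 = 20; qtr2 = .;
    array sales {2} qtr1 qtr2;
    array target {2} _temporary_ (12, 18);
    array diff_sum {2};      /* creates diff_sum1, diff_sum2 */
    array diff_minus {2};    /* creates diff_minus1, diff_minus2 */

    do i = 1 to dim(sales);
        diff_sum{i}   = sum(sales{i}, -target{i});  /* missing sales is ignored */
        diff_minus{i} = sales{i} - target{i};       /* missing sales gives missing */
    end;
    drop i;
run;
```

For the first quarter both forms give 8. For the second, diff_sum2 is -18 while diff_minus2 is missing, so the minus operator is the safer choice if a missing sales figure should not be reported as a shortfall.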

Example 3: Using the IN Operator

data example;
    array num_array[5] _temporary_ (1, 2, 3, 4, 5);
    value_to_check = 3;
    do i = 1 to dim(num_array);
        if value_to_check in num_array then do;
            result = 'Yes';
            leave;
        end;
        else result = 'No';
    end;
    drop i;
run;

Explanation:

  • array num_array[5] _temporary_ (1, 2, 3, 4, 5); creates a temporary array named num_array with initial values (1, 2, 3, 4, 5). The _temporary_ keyword indicates that the array exists only for the duration of the DATA step and is not written to the output dataset.
  • do i = 1 to dim(num_array); iterates over each element of the num_array. The DIM function is used to determine the number of elements in the array.
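  • if value_to_check in num_array then do; checks whether value_to_check equals any element of num_array and sets result accordingly. The test does not depend on i: IN already searches the whole array, so the answer is the same on every iteration and the surrounding loop is not strictly required.

The IN operator checks if a value is in a list of values. When the right-hand side is an array name, it checks whether the value matches any element of that array.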
The LEAVE statement exits the DO loop immediately. It is useful when you have found the desired result and do not need to continue the loop.
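As a complement to Example 3, here is a minimal sketch of two alternatives: a loop-free test with IN, and an explicit element-by-element search where LEAVE does real work by stopping at the first match. The dataset name in_demo and the variable position are hypothetical, and the sketch assumes a SAS release in which IN accepts an array name, as Example 3 does.

```sas
data in_demo;
    array num_array[5] _temporary_ (1, 2, 3, 4, 5);
    value_to_check = 3;

    /* Loop-free form: IN compares the value with every element */
    length result $3;
    if value_to_check in num_array then result = 'Yes';
    else result = 'No';

    /* Explicit search: compare one element per iteration */
    position = .;
    do i = 1 to dim(num_array);
        if num_array[i] = value_to_check then do;
            position = i;   /* remember where the match was found */
            leave;          /* stop scanning after the first match */
        end;
    end;
    drop i;
run;
```

The explicit search is worth the extra lines only when the index of the match is needed; otherwise the loop-free IN test is shorter.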


