Developer Guide

This guide explains how APPy compiler works under the hood.

High-Level Programming Interface

APPy’s high-level programming interface is a wrapper around the low-level programming interface where the compiler performs three key automatic transformations:

  1. Data Movement: APPy generates code that moves data between the CPU and the GPU automatically around the prange region.

  2. Reduction Detection: APPy automatically detects reductions and add corresponding pragma annotations.

  3. Inner Loop Vectorization: APPy automatically adds the simd pragma to innermost loops that can be vectorized.

Reduction Detection

To detect reductions on scalar variable x, APPy uses the following rules:

  • Check all assignments to x, x is a reduction pattern if and only if all the assignments to it has the same reduction operator. For example, all assignments to x are in the form of x = x + y.

This rule works well for detecting reductions on scalar variables (it’s sound for commonly used reduction patterns). However, it may easily have false positives for array variables.

@jit
def add_one(a):
     for i in prange(a.shape[0]):
          a[i] += 1

In the add_one example above, all assignments to a[i] are in the form of a[i] = a[i] + 1. Therefore the previous rule will detect a[i] as a reduction pattern. However, a[i] = a[i] + 1 is not a reduction pattern here, as it performs an element-wise addition on array elements.

To reduce false positives, we can check the indices of the array variable in the assignment. The previous rule is extended to distinguish reduction with regard to which loop is involved:

  • For loop loop_i with index variable i, an array store a[i0, i1, ...] is considered a reduction pattern if all the assignments to it has the same reduction operator, and i does not appear in the array store indices i0, i1, ....

In the add_one example, the loop index variables i does appear in the array store a[i], thus a[i] = a[i] + 1 is not a reduction with regard to loop i. On the other hand, a[j] += b[i,j] in the following code is a reduction with regard to loop i, but not a reduction with regard to loop j:

@jit
def foo(a):
     for i in prange(a.shape[0]):
            for j in range(a.shape[1]):
                a[j] += b[i,j]

Similarly, for the code example below with a[i] += b[i,j], a[i] will be detected as a reduction with regard to loop j, but not to loop i:

@jit
def foo(a, b):
     for i in prange(a.shape[0]):
            for j in range(a.shape[1]):
                a[i] += b[i,j]
class DetectReduction(ast.NodeVisitor):
    def __init__(self):
        self.loop_stack = []
        self.candidates = {}

    def visit_For(self, node):
        index = node.target.id
        self.loop_stack.append(index)
        # Visit sub-loops
        self.generic_visit(node)

        # Need an assignment visitor here
        self.visit_Assign(node)

        self.loop_stack.pop()

    def get_reduction_op(self, node):
        pass

    def visit_Assign(self, node):
        target = node.targets[0]
        reduce_op = self.get_reduction_op(node)
        if reduce_op is not None:
            if is_ast_name(target):
                # If is scalar, use the previous rule
                self.candidates.set_default(target.id, set()).add(reduce_op)
            elif is_array_store(target):
                # If is array, use the extended rule
                indices = get_array_store_indices(node)
                if indices == self.loop_stack:
                    # Not a reduction candidate
                    pass
                else:
                    self.candidates.setdefault(unparse(target), set()).add(reduce_op)

Low-Level Programming Interface

APPy utilizes Triton for its backend for low-level programming interface.