Developer Guide
This guide explains how APPy compiler works under the hood.
High-Level Programming Interface
APPy’s high-level programming interface is a wrapper around the low-level programming interface where the compiler performs three key automatic transformations:
Data Movement: APPy generates code that moves data between the CPU and the GPU automatically around the
prangeregion.Reduction Detection: APPy automatically detects reductions and add corresponding pragma annotations.
Inner Loop Vectorization: APPy automatically adds the simd pragma to innermost loops that can be vectorized.
Reduction Detection
To detect reductions on scalar variable x, APPy uses the following rules:
Check all assignments to
x,xis a reduction pattern if and only if all the assignments to it has the same reduction operator. For example, all assignments toxare in the form ofx = x + y.
This rule works well for detecting reductions on scalar variables (it’s sound for commonly used reduction patterns). However, it may easily have false positives for array variables.
@jit
def add_one(a):
for i in prange(a.shape[0]):
a[i] += 1
In the add_one example above, all assignments to a[i] are in the form of a[i] = a[i] + 1.
Therefore the previous rule will detect a[i] as a reduction pattern. However, a[i] = a[i] + 1 is not a reduction pattern here, as it performs an element-wise addition on array elements.
To reduce false positives, we can check the indices of the array variable in the assignment. The previous rule is extended to distinguish reduction with regard to which loop is involved:
For loop
loop_iwith index variablei, an array storea[i0, i1, ...]is considered a reduction pattern if all the assignments to it has the same reduction operator, andidoes not appear in the array store indicesi0, i1, ....
In the add_one example, the loop index variables i does appear in the array store a[i], thus a[i] = a[i] + 1 is not a reduction with regard to loop i.
On the other hand, a[j] += b[i,j] in the following code is a reduction with regard to loop i, but not a reduction with regard to loop j:
@jit
def foo(a):
for i in prange(a.shape[0]):
for j in range(a.shape[1]):
a[j] += b[i,j]
Similarly, for the code example below with a[i] += b[i,j], a[i] will be detected as a reduction with regard to loop j, but not to loop i:
@jit
def foo(a, b):
for i in prange(a.shape[0]):
for j in range(a.shape[1]):
a[i] += b[i,j]
class DetectReduction(ast.NodeVisitor):
def __init__(self):
self.loop_stack = []
self.candidates = {}
def visit_For(self, node):
index = node.target.id
self.loop_stack.append(index)
# Visit sub-loops
self.generic_visit(node)
# Need an assignment visitor here
self.visit_Assign(node)
self.loop_stack.pop()
def get_reduction_op(self, node):
pass
def visit_Assign(self, node):
target = node.targets[0]
reduce_op = self.get_reduction_op(node)
if reduce_op is not None:
if is_ast_name(target):
# If is scalar, use the previous rule
self.candidates.set_default(target.id, set()).add(reduce_op)
elif is_array_store(target):
# If is array, use the extended rule
indices = get_array_store_indices(node)
if indices == self.loop_stack:
# Not a reduction candidate
pass
else:
self.candidates.setdefault(unparse(target), set()).add(reduce_op)
Low-Level Programming Interface
APPy utilizes Triton for its backend for low-level programming interface.