Subscripted subscript patterns

Analysis and Automatic Parallelization

Project maintained by akshay9594 Hosted on GitHub Pages — Theme by mattgraham

The Project

A number of scientific applications comprise of loops wherein an array is subscripted by another array - a[b[i]]. With write references to the host array (array ‘a’) within a loop, current compile-time techniques are incapable of detecting such loops as parallelizable. If left unparallelized, these loops can in-turn prevent the performance obtained through automatic parallelization, matching that of the hand parallelized version. Hence, Subscripted subscript analysis is the next big challenge in Automatic Parallelization.

Here is a short video describing the project and the current progress, published in proceedings of the International Conference on Supercomputing (ICS) ‘21:


The Project has been divided into the following stages:

Stage-1 : Analysis of Subscripted subscript Patterns

In this stage, we go through several Benchmark suites - NAS Parallel Benchmarks, SuiteSparse, SPEC CPU 2006 etc. and try to analyze the subscripted subscript patterns inorder to come up with properties that characterize the subscript array. These properties aid in parallelizing the loops. Injectivity and Monotonicity were the most important and the most pervasive properties that we came across in our analysis. In addition to the aforementioned properties, we have also discovered certain complex properties that require advanced symbolic analysis.

Following is an example of a subscripted subscript pattern:

  for(i = 0 ; i < n; i++){
      a[b[i]] = expr;

Above loop is parallelizable if array “b” is injective.

Stage-2 : Compile-time Algorithm for Subscripted-subscripts

After having completed the analysis of the subscript array patterns in the benchmark codes, we propose a compile time algorithm based on Symbolic Range Aggregation that will help us detect some of the properties discovered in stage 1 and help prove parallelism. We have applied our algorithm by hand to example loops from real scientific application. The algorithm and performance results have been described in detail in our ICS’21 paper below. Currently the algorithm can handle a class of subscripted subscripts.

Refer to the following publications for more details:


Stage-3 : Implementation of the techniques inside Cetus

Our eventual goal is to implement the symbolic analysis techniques proposed in stage 2 inside the Cetus source to source compiler infrastructure. The implemented technique would enable analysis and parallelization of the subscripted-subscript patterns within benchmark codes identified in stage 1. We will also be extending the Data Dependence Test in Cetus to disprove dependencies in loops with subscript arrays in the presence of a property, and parallelize the enclosing loops.

Benchmark Suites used for analysis:

Here are some popular benchmark suites wherein you can find loops with subscripted-subscript patterns: