Simon Moll

Introduction

LLVM-VE-RV 1.5.1 is a new development release of the LLVM-VE compiler for SX-Aurora. This release offers the following key features:

  • Vector code generation and the ability to leverage LLVM’s vectorizers for automatic vectorization.
  • Early support for VH-to-VE OpenMP target offloading for C programs, based on sotoc by RWTH Aachen.
  • Outer-loop vectorization with the Region Vectorizer.
  • LLVM-VE-RV also supports the VEL intrinsics of LLVM-VE (compile with -mvelintrin).

Loop Vectorization with LLVM

This is the first time that we have automatic loop vectorization in LLVM working for SX-Aurora. Since this is automatic, all you have to do is compile your code using:

clang(++) --target=ve-linux -O3 [..]

Region Vectorizer for LLVM

This release is also bundled with the Region Vectorizer (RV) for outer-loop vectorization. The Region Vectorizer is able to vectorize some loops that LLVM cannot handle. In order to apply RV to a loop, annotate the loop with #pragma omp simd and use rvclang(++) instead of clang(++) to load the Region Vectorizer plugin into the compiler.

rvclang -fopenmp-simd --target=ve-linux -O3 <prog.c> -o <prog.o>

Preview feature: OpenMP target offloading (sotoc)

This release comes with a preview feature for OpenMP target offloadig from the VH to VE. The underlying method (sotoc) has been developer by RTWH Aachen and leverages VEOffload to perform the actual offloading.

To offload a parallel loop to the VE, use OpenMP pragmas as shown here for a saxpy kernel:

#pragma omp target \
        map(to     : x[0:m]) \
        map(tofrom : y[0:m])
#pragma omp parallel for
for (int i = 0; i < m; ++i) {
  y[i] = a * x[i] + y[i];
}

Compile the code with the following flags:

clang -fopenmp -fopenmp-targets=aurora-nec-veort -O3 <your_code.c> -o 

This offloading prototype uses the ncc compiler to compile the VE kernels for offloading. To use clang with the VE backend instead, also pass the option -fopenmp-nec-compiler=clang|rvclang to compile the VE code with clang or rvclang.

VELIntrinsics

LLVM-VE-RV also supports VEL Intrinsics for low-level vector programming. Add the compiler option -mvelintrin to enable VEL Intrinsics. If you don’t, any VEL intrinsics that involve vector mask registers will lead to a compiler crash. Object files compiled with and without the -mvelintrin flag are compatible.

Installation

Pre-build RPMs are available from github.

Building from source

To build the llvm-ve-rv compiler from source, i recommend the automatic build scripts in llvm-dev. Clone llvm-dev with:

git clone -b hpce/release_1.5.1 gh:sx-aurora-dev/llvm-dev.git

Clone all source repositories

. llvm-dev/clone.sh https://github.com/sx-aurora-dev hpce/release_1.5.1

and build llvm-ve-rv 1.5.1 using

bash llvm-dev/build-and-install.sh