Research

We work on a number of research areas related to High Performance Computing and Big Data / Analytics:

OpenMP

OpenMP is a directive-based approach to parallel programming.
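
As a point of reference, the minimal sketch below is purely illustrative and is not taken from any of the projects listed here; it shows a C loop parallelized with a single OpenMP directive.

    /* Illustrative only: a serial loop made parallel by one OpenMP pragma. */
    #include <stdio.h>

    int main(void) {
        double a[1000];
        #pragma omp parallel for         /* distribute iterations across threads */
        for (int i = 0; i < 1000; i++)
            a[i] = 2.0 * i;
        printf("a[999] = %f\n", a[999]);
        return 0;
    }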

  • SOLLVE: Scaling OpenMP with LLVM for Exascale Performance and Portability. Read more …
  • Program transformation for automatic offloading with OpenMP: This project aims to transform application code to support GPU offloading. We propose a compiler analysis pass that detects suitable code regions and a transformation pass that facilitates automatic GPU offloading of those regions. We also propose a cost model for GPU offloading that accounts for the data transfer time between the CPU and the GPU in addition to the execution cycles on the GPU; a sketch of the kind of offloaded region we target is shown after this list. Read more …
  • Machine Learning Support for Deploying Programs Across Heterogeneous Computers: In this new NSF-funded project, “Performance Portable Parallel Programming On Extremely Heterogeneous Systems (P4EHS)”, we seek to make such extremely heterogeneous computers easier to use by applying machine learning techniques to identify and extract code regions (kernels) that are suitable for execution on a given device. The resulting output program, which may include target device and target data information, will be expressed using features of OpenMP. Read more …
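
The hand-written sketch below is illustrative only and is not output from our tools; it shows the kind of region an automatic offloading pass might produce, with map clauses that make explicit the CPU-GPU data transfers a cost model must weigh against the kernel's execution time on the GPU.

    /* Illustrative only: a hand-written example of an offloaded region.
       The map clauses spell out the host-to-device and device-to-host
       transfers that a cost model weighs against GPU execution time. */
    void saxpy(int n, float a, const float *x, float *y) {
        #pragma omp target teams distribute parallel for \
                map(to: x[0:n]) map(tofrom: y[0:n])
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }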

OpenACC

OpenACC is a directive-based approach to parallel programming intended for accelerators such as GPUs.
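
A minimal, purely illustrative OpenACC kernel in C (not tied to any specific project listed here) looks like this:

    /* Illustrative only: the directive asks the compiler to offload the
       loop to an accelerator and to manage the data movement for v. */
    void scale(int n, float alpha, float *restrict v) {
        #pragma acc parallel loop copy(v[0:n])
        for (int i = 0; i < n; i++)
            v[i] = alpha * v[i];
    }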

OpenSHMEM

OpenSHMEM is a Partitioned Global Address Space library for fast communication and computation overlap.
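
As a minimal illustration (assuming a standard OpenSHMEM installation; this is not code from the reference implementation), each processing element (PE) below writes one value directly into a symmetric variable on its right neighbor:

    /* Illustrative only: one-sided communication with OpenSHMEM. */
    #include <shmem.h>
    #include <stdio.h>

    int main(void) {
        shmem_init();
        int me = shmem_my_pe();
        int npes = shmem_n_pes();

        static int box;                  /* symmetric: exists on every PE */
        int right = (me + 1) % npes;

        shmem_int_p(&box, me, right);    /* one-sided put to the neighbor */
        shmem_barrier_all();             /* all puts complete and visible */

        printf("PE %d received %d\n", me, box);
        shmem_finalize();
        return 0;
    }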

  • OpenSHMEM: Developing the reference implementation of the OpenSHMEM standard, and proposing extensions and new ideas. Read more …

Ookami

Ookami is a computer technology testbed supported by the National Science Foundation (NSF).

Total Project

This TOTAL-funded project explores the tasking constructs provided by the intra-node OpenMP programming interface and evaluates their usefulness for converting scientific and technical applications, which are of strong interest to TOTAL, into task-based code versions. It also considers how this approach may be extended for use beyond a single node of an HPC system. Read more …
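
To illustrate the constructs under study, the sketch below is a textbook-style example rather than TOTAL application code: each recursive call becomes an OpenMP task, and taskwait joins the children before their results are combined.

    /* Textbook OpenMP tasking sketch (untuned, for illustration only). */
    #include <stdio.h>

    long fib(long n) {
        if (n < 2) return n;
        long x, y;
        #pragma omp task shared(x)
        x = fib(n - 1);
        #pragma omp task shared(y)
        y = fib(n - 2);
        #pragma omp taskwait             /* wait for both child tasks */
        return x + y;
    }

    int main(void) {
        long result;
        #pragma omp parallel
        #pragma omp single               /* one thread creates the tasks */
        result = fib(30);
        printf("fib(30) = %ld\n", result);
        return 0;
    }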

Tensor Compiler

Tensor, or Deep Learning (DL), compilers take DL model definitions described in frameworks such as TensorFlow and PyTorch as input (a computation graph) and generate efficient code implementations for various AI hardware. Read more …
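
As a loose illustration of the output side (a hand-written sketch, not generated by any particular compiler), a bias-add followed by a ReLU in the computation graph might be lowered into a single fused C loop nest such as:

    /* Hand-written sketch of a fused kernel a DL compiler might emit for a
       bias-add + ReLU graph fragment: two graph nodes become one loop nest,
       avoiding a temporary tensor in memory. */
    void biasadd_relu(int rows, int cols, const float *x,
                      const float *bias, float *out) {
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) {
                float v = x[r * cols + c] + bias[c];
                out[r * cols + c] = v > 0.0f ? v : 0.0f;
            }
    }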

Recent Past Projects

OpenMP

OpenMP is a directive-based approach to parallel programming.

  • OpenMP on the Emu Architecture: This work focuses on providing OpenMP on Emu’s unique architecture. To achieve this, we first aim to translate OpenMP to the Tapir IR-based parallelism constructs developed at MIT. Thereafter, we attempt to translate the Tapir-based IR to OpenMP runtime calls, which are then further ported to emit the Emu ISA; a brief sketch follows below. Read more …
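
For orientation, the sketch below is illustrative only and does not show the project's actual lowering; the comments indicate how a simple OpenMP loop relates to Tapir's fork-join instructions (detach, reattach, sync) in LLVM IR.

    /* Illustrative only: conceptual mapping of an OpenMP loop to Tapir. */
    void add_one(int n, int *a) {
        /* In a Tapir-based representation, each iteration's body is roughly
           spawned with `detach`, returns control via `reattach`, and a
           `sync` after the loop waits for all iterations to finish. */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            a[i] += 1;
    }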

OpenSHMEM

  • CAARES: Cross-layer Application-Aware Resilience at Extreme Scale

Big Data

  • CREDIT: CREDIT is an interdisciplinary program to accelerate research and education in predictive analytics for science and engineering, transforming our ability to effectively address and solve the many complex problems posed by big data. The goal is to aid military analysts in efficiently handling and processing large volumes of complex data from multiple sources. The program supports mission-critical applications of the US Department of Defense.

    CREDIT was established at Prairie View A&M University and is directed by Dr. Lijun Qian. It includes collaborators at Stony Brook University as well as several government research labs. It is funded by the Department of Defense. Read more …