CARL@CUT

Research Summary

The Computer Architecture Research Laboratory (CARL@CUT) aims to pioneer innovative solutions at the intersection of hardware design and systems optimization. Our multifaceted research agenda spans high-performance computing architectures that maximize computational throughput while maintaining strict power-performance constraints, alongside novel techniques for energy efficiency that enable sustainable computing at scale. We develop advanced systems for parallel processing and programming paradigms that harness multi-core and many-core capabilities, with a particular focus on open-source RISC-V processor designs and custom extensions. Our lab is at the forefront of specialized AI/ML accelerator design, creating hardware solutions that dramatically improve inference and training performance for deep learning workloads. Additionally, we're exploring the frontier of quantum computing architectures, in particular the compiler and programming toolchains that bridge theoretical quantum algorithms with practical implementation. Through our interdisciplinary approach, we strive to address the fundamental challenges facing next-generation computing systems.

Research Areas

Computer Architecture
High-Performance Computing
Power-Performance Energy Efficiency
Systems for Parallel Processing & Programming
RISC-V Processors
AI/ML Accelerators
Quantum Computing

Research Projects

SWITCHES

SWITCHES is a parallel runtime system for task-based dataflow execution. It targets multi-core and many-core processors that support a global address space (shared memory) and uses OpenMP v4.5 compiler directives for writing parallel applications. It includes a preprocessing tool, called the Translator, that translates C/C++ code annotated with OpenMP directives into pthread-based C/C++ code.
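
As an illustration of the kind of input such a translator works from, the following is a minimal sketch of task-based C code using standard OpenMP 4.5 `depend` clauses. The function name `pipeline_sum` and the arrays are hypothetical, and which subset of OpenMP directives SWITCHES accepts is not stated here.

```c
#include <stdio.h>

#define N 8

/* Hypothetical example: three tasks chained through data dependences.
 * The depend clauses tell the runtime that task 2 must wait for task 1,
 * and task 3 must wait for task 2. */
int pipeline_sum(void) {
    int a[N], b[N];
    int sum = 0;

    #pragma omp parallel
    #pragma omp single
    {
        /* Task 1: produces a. */
        #pragma omp task depend(out: a)
        for (int i = 0; i < N; i++) a[i] = i;

        /* Task 2: consumes a, produces b. */
        #pragma omp task depend(in: a) depend(out: b)
        for (int i = 0; i < N; i++) b[i] = 2 * a[i];

        /* Task 3: consumes b, accumulates into sum. */
        #pragma omp task depend(in: b) depend(inout: sum)
        for (int i = 0; i < N; i++) sum += b[i];

        /* Wait for all three tasks before leaving the region. */
        #pragma omp taskwait
    }
    return sum;
}
```

Compiled with an OpenMP-capable compiler (for example `-fopenmp` with GCC), the tasks are scheduled according to their dependences. Without OpenMP support the pragmas are ignored and the loops run in program order, which gives the same result.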

SWITCHES implements a lightweight distributed triggering system for runtime dependence resolution and uses static scheduling and compile-time assignment policies to reduce runtime overheads. Unlike other systems, it allows the granularity of loop-tasks to be increased to favor data locality, even when there are dependences across different loops. SWITCHES introduces explicit task resource allocation mechanisms for efficient allocation of resources and adopts the latest OpenMP Application Programming Interface (API) to maintain high levels of programming productivity. It provides a source-to-source tool that automatically produces thread-based code. Performance on an Intel Xeon Phi shows good scalability and surpasses OpenMP by an average of 32%.
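
The idea of distributed triggering with static task assignment can be sketched in pthread-based C as follows. This is an illustrative sketch under stated assumptions, not the SWITCHES implementation: the `task_t` layout, the per-task `pending` counter, the spin-wait, and all function names are hypothetical. Each task is bound to a thread ahead of time, and a finished producer decrements its consumer's counter directly, so no central scheduler is involved.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define N 8
#define NUM_TASKS 3
#define NUM_THREADS 2

static int a[N], b[N], dot;

static void fill_a(void) { for (int i = 0; i < N; i++) a[i] = i; }
static void fill_b(void) { for (int i = 0; i < N; i++) b[i] = 3; }
static void dot_ab(void) { dot = 0; for (int i = 0; i < N; i++) dot += a[i] * b[i]; }

/* Hypothetical task descriptor (assumption, not the SWITCHES layout). */
typedef struct {
    void (*body)(void);
    atomic_int pending;  /* number of producers that have not finished yet */
    int consumer;        /* index of the task to trigger, or -1 for none */
    int thread_id;       /* thread chosen ahead of time (static assignment) */
} task_t;

static task_t tasks[NUM_TASKS];

static void set_task(int t, void (*body)(void), int pending,
                     int consumer, int thread_id) {
    tasks[t].body = body;
    atomic_store(&tasks[t].pending, pending);
    tasks[t].consumer = consumer;
    tasks[t].thread_id = thread_id;
}

/* Each thread runs its own tasks in index order, which must be a
 * topological order of the dependence graph to avoid deadlock. */
static void *worker(void *arg) {
    int tid = *(int *)arg;
    for (int t = 0; t < NUM_TASKS; t++) {
        if (tasks[t].thread_id != tid) continue;
        /* Wait until every producer has triggered this task. */
        while (atomic_load(&tasks[t].pending) > 0) sched_yield();
        tasks[t].body();
        /* Trigger the consumer directly. */
        if (tasks[t].consumer >= 0)
            atomic_fetch_sub(&tasks[tasks[t].consumer].pending, 1);
    }
    return NULL;
}

/* Runs a three-task graph: fill_a and fill_b feed dot_ab. */
int run_graph(void) {
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];

    set_task(0, fill_a, 0, 2, 0);   /* thread 0, triggers task 2 */
    set_task(1, fill_b, 0, 2, 1);   /* thread 1, triggers task 2 */
    set_task(2, dot_ab, 2, -1, 0);  /* thread 0, waits for two producers */

    for (int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < NUM_THREADS; i++) pthread_join(threads[i], NULL);
    return dot;
}
```

Here `fill_a` and `fill_b` can run concurrently on different threads, while `dot_ab` starts only after both have decremented its counter. A production runtime would likely replace the spin-wait with something cheaper and check the return values of the pthread calls.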
