Automated methodology for production and execution of data-centric multi-level approximate equivalent applications for heterogeneous computing platforms  
Funding: Greece and the European Union (European Social Fund, ESF)
Project Summary:
The project "Automated methodology for production and execution of data-centric multi-level approximate equivalent applications for heterogeneous computing platforms" (MIS-5005377) studies the design of hardware accelerators that trade accuracy for improvements in performance metrics (e.g., maximum operating frequency, application throughput, and energy consumption). Since the failure of Dennard scaling, energy efficiency has become a first-class design concern in computer systems. Its potential benefits go beyond reduced power demands in servers and longer battery life in mobile devices: improving energy efficiency has become a requirement due to the limits of device scaling and the well-known "dark silicon" or "power wall" problem. This project proposes a framework that exploits the intrinsic error resilience of a large number of application domains to produce approximate solutions as a design alternative for energy-efficient system design, trading accuracy for significant energy gains. Emphasis in this task is given to the hardware/software co-design of these hardware accelerators. Unlike well-established solutions, which operate mainly at the hardware level, the proposed framework applies a multi-level approximation technique in order to maximize the potential energy savings of the approximate computing application while keeping the error controllable and as small as possible.
The underlying infrastructure for the execution of approximate kernels is a state-of-the-art many-accelerator hardware platform provided by Maxeler. This platform is employed in various application domains with computationally intensive kernels (e.g., market analysis, weather forecasting, and seismology). In the programming model for this platform, the application kernels selected for acceleration are developed as Intellectual Property (IP) cores, also known as DataFlow Engines (DFEs), whose inputs are fed following a dataflow approach in order to maximize application throughput.
For the purposes of the proposed framework, we employ the Maxeler MPC-X. More precisely, the platform contains 8 DFE accelerators, each of which has 48 GB of DRAM as LMEM. The DFEs are interconnected using the MaxRing technology. The platform is programmed with MaxCompiler, which makes efficient use of the available hardware resources. Through this compiler it is possible to configure both the processing cores and the memory hierarchy of each DFE according to the inherent requirements of the application.
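As a rough illustration of the accuracy trade-off that error-resilient kernels allow, the following Python sketch compares an exact dot product with versions whose operands are rounded to fewer fixed-point fractional bits. It is only a software analogy under stated assumptions: it is not MaxCompiler code, it does not represent the project's multi-level approximation technique, and the names `quantize`, `approx_dot`, and `relative_error` are hypothetical helpers introduced here. Narrower operands generally map to smaller arithmetic units in hardware, which is where energy savings could come from; the sketch measures only the error side of that trade.

```python
import random


def quantize(x, frac_bits):
    """Round x to the nearest fixed-point value with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(x * scale) / scale


def dot(a, b):
    """Reference dot product in full double precision."""
    return sum(x * y for x, y in zip(a, b))


def approx_dot(a, b, frac_bits):
    """Dot product computed on operands quantized to frac_bits fractional bits."""
    return sum(quantize(x, frac_bits) * quantize(y, frac_bits)
               for x, y in zip(a, b))


def relative_error(exact, approx):
    """Relative deviation of the approximate result from the exact one."""
    return abs(exact - approx) / abs(exact)


# Synthetic inputs in [0.5, 1.0]; a fixed seed keeps the run reproducible.
rng = random.Random(0)
a = [rng.uniform(0.5, 1.0) for _ in range(1000)]
b = [rng.uniform(0.5, 1.0) for _ in range(1000)]

exact = dot(a, b)
# Relative error of the approximate kernel for several operand widths.
errors = {bits: relative_error(exact, approx_dot(a, b, bits))
          for bits in (4, 8, 16)}

for bits, err in errors.items():
    print(f"{bits:2d} fractional bits: relative error = {err:.2e}")
```

In a setting like the one described above, a table of this kind would be paired with measured resource or energy figures for each configuration, so that the least costly variant meeting a given error bound can be selected.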

Aristotle University of Thessaloniki