Heterogeneous System Architecture (HSA) is a hardware architecture for heterogeneous computing proposed by the HSA Foundation. Its Unified Memory Architecture (hUMA) enables data sharing between heterogeneous devices, and its user-level Queuing Model (hQ) enables low-overhead kernel launching. With these features, applications can perform heterogeneous computing more efficiently and effectively. However, most of today's heterogeneous-computing applications do not leverage hUMA and hQ. Moreover, the majority of applications on the market are implemented in traditional sequential models.
This paper explores building a fully automatic framework to migrate sequential applications to HSA. The framework includes polyhedral-guided memory aliasing analysis, a staged dispatching predictor, and memory coalescing optimization. It also takes advantage of hUMA and hQ to achieve low-overhead job dispatching on HSA-compliant systems. On an HSA-compliant AMD Carrizo machine, a sequential application run through our framework can be up to 8.66x faster than before. In several cases where the workload is considered too small to benefit from conventional, non-HSA heterogeneous computing, our framework still delivers significant speedups. In addition, the performance obtained through our framework can sometimes exceed that of manually tuned versions for both HSA and non-HSA platforms running on the same Carrizo machine. With this framework, many existing applications coded in traditional sequential models could get a performance boost from HSA-based heterogeneous computing.