Enhancing MPI for HPC Clouds: Case Study with Amazon AWS and Microsoft Azure
Cloud virtual machines (VMs) and instances have become a new trend in HPC and will significantly change its future. This poster introduces our recent efforts to provide MPI support and new designs for high performance computing (HPC) cloud platforms, including Amazon AWS and Microsoft Azure.

Amazon has recently announced a new network interface named Elastic Fabric Adapter (EFA), aimed at tightly coupled HPC workloads. In this poster, we first take a high-level view of the features, capabilities, and performance of the adapter. Next, we explore how EFA's transport models, such as UD (Unreliable Datagram) and SRD (Scalable Reliable Datagram), impact the design of high performance MPI libraries. A new zero-copy design will be introduced as a transfer mechanism over unreliable and unordered channels. Performance results with the new design will be presented.
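
To make the challenge of unordered, unreliable delivery concrete, the following is a minimal sketch of the bookkeeping a receiver needs when segments of one message may arrive in any order, arrive twice, or not at all. It is not the design presented in the poster: the class `OffsetReassembler` and all of its methods are hypothetical names chosen for illustration, and plain Python buffers stand in for registered memory. The only idea it shows is that each segment carries a sequence number, so its payload can be placed at its final offset in the destination buffer without an intermediate staging buffer, while a set of received sequence numbers identifies duplicates and gaps.

```python
class OffsetReassembler:
    """Hypothetical receiver-side bookkeeping for one segmented message."""

    def __init__(self, total_len: int, seg_size: int):
        if total_len <= 0 or seg_size <= 0:
            raise ValueError("total_len and seg_size must be positive")
        self.total_len = total_len
        self.seg_size = seg_size
        # Destination buffer, allocated once; segments land at their final offset.
        self.buf = bytearray(total_len)
        self.received = set()
        # Ceiling division: the last segment may be shorter than seg_size.
        self.num_segments = (total_len + seg_size - 1) // seg_size

    def on_segment(self, seq: int, payload: bytes) -> bool:
        """Place one segment; return False for duplicates or malformed segments."""
        if not 0 <= seq < self.num_segments or seq in self.received:
            return False
        offset = seq * self.seg_size
        expected = min(self.seg_size, self.total_len - offset)
        if len(payload) != expected:
            return False
        # Arrival order is irrelevant: the sequence number fixes the offset.
        self.buf[offset:offset + expected] = payload
        self.received.add(seq)
        return True

    def missing(self) -> list:
        """Sequence numbers not yet seen (candidates for retransmission)."""
        return [s for s in range(self.num_segments) if s not in self.received]

    def complete(self) -> bool:
        return len(self.received) == self.num_segments


if __name__ == "__main__":
    message = bytes(range(10))
    rx = OffsetReassembler(total_len=len(message), seg_size=4)
    rx.on_segment(2, message[8:10])   # last segment arrives first
    rx.on_segment(0, message[0:4])
    print(rx.missing())               # segment 1 has not arrived yet
    rx.on_segment(1, message[4:8])
    print(rx.complete(), bytes(rx.buf) == message)
```

On a reliable but unordered transport, the `missing()` list would only ever be needed for completion detection; on an unreliable one, the MPI library itself would also have to drive retransmission, which is where the two transport models are likely to lead to different designs.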

Microsoft has also recently announced the Azure HB and HC series VMs, which are equipped with InfiniBand and targeted at high performance computing cloud workloads. The second part of this poster will present optimizations done for both Azure HB and HC instances. The availability of a quick and easy one-click deployment scheme, developed with help from the Microsoft Azure team, will also be presented.

In the remaining part of this poster, we present an in-depth performance evaluation and analysis of the new design using multiple benchmarks on both AWS and Azure.
