GMH (GPU Message Handler): A GPU Message Passing Toolkit
April 23, 2010, Version 0.90

The GMH toolkit provides high-performance MPI-style communication primitives as well as an event-flow programming framework that eases the development of parallel iterative numerical solvers. At present, it is the only known GPU message passing toolkit that enables applications to transfer data among GPUs both on a single host and across a cluster. It is also the only known toolkit that allows applications to mix GMH calls with any thread-safe MPI implementation.

GMH uses a thread group on a single host to map GPU resources to an MPI rank, eliminates host memory copies when transferring data among GPUs on a single host, and introduces the notion of a "virtual GPU" to bind a thread to a GPU automatically when a host has multiple GPUs. In addition, GMH delivers point-to-point data transfer bandwidth limited only by the underlying GPU-to-CPU memory transfer bandwidth, and offers point-to-point and collective communication latency restrained only by the internal GPU-CPU memory transfer latency. More importantly, GMH offers flexible programming interfaces: developers can either stick with the conventional MPI programming style or focus on the event flow of their applications, which GMH manages and executes at run time.

See the INSTALL file for installation instructions.

Within the distribution, the source code is in the main directory. The test directory contains most of the micro-benchmark programs, such as latency and bandwidth tests. The dslash directory contains a parallel dslash numerical benchmark, a physics example.

Send bug reports and comments to chen@jlab.org