We present a user-level message interface that provides high performance and very low processor overhead. In this system, messages are launched from within the user's general register file and received in a hardware queue mapped to a general register. A message handler is started within the latency of a jump instruction upon arrival of the first message word, up to 18x faster than conventional interrupt-driven interfaces. These tightly integrated mechanisms achieve end-to-end latencies as low as nearly one quarter of those of memory-mapped interfaces. The mechanisms eliminate copying, reducing processor occupancy by up to one-third compared with other integrated register-mapped systems. The interface also features a low-overhead, robust protection model using virtually addressed message destinations and certified trusted handlers. Both are specified with unforgeable pointers, enabling fine-grain control over a user thread's accessible domain as well as its permitted remote operations. We discuss the application of this system in the framework of the multithreaded MIT M-Machine and show that, unlike other approaches, it provides protection and avoids starvation while maintaining high efficiency.