gRPC Troubleshooting

Here are a few quick & dirty troubleshooting tips from a couple of days in the trenches with a gRPC service (C# client, C++ server):

Ensure any code that creates a new alarm first cancels the last alarm it added to the queue, if that alarm hasn't been processed yet. In my case, I had a server-side observer receiving updates from elsewhere in the server module & writing them to the gRPC stream. When those updates started arriving too quickly, the server crashed while adding the alarm to the queue.
Look for double disconnects. I had another bug where the client got a cancellation exception from the server on the MoveNext() call, caught it, and then tried to disconnect by calling Cancel() on the cancellation token source. The client then threw an exception: "Shutdown has already been called".
Check whether your client is setting a deadline. Once the deadline passes, the call fails with a DEADLINE_EXCEEDED status, which can look like the connection being closed.
Ensure your client code handles a cancellation exception coming from MoveNext() (in C# this typically surfaces as an RpcException with StatusCode.Cancelled).
Test by adding delays in various places. One of my crashes was a race condition that would disappear when I set breakpoints at different points in the process. Specifically, I think a handler was trying to write to a stream after it was closed: I would get a cancellation exception on the client at the same moment a new handler was added to the completion queue.
Examine keepalives. Services can be set up with keepalives sent periodically from the server. That said, if your service only goes down under high load, keepalives may not be your issue.
The common advice... Enable logging:

// Assumption: set these early, before any gRPC channel or call is created.
Environment.SetEnvironmentVariable("GRPC_TRACE", "all");
Environment.SetEnvironmentVariable("GRPC_VERBOSITY", "DEBUG");
I got this going on the client, but still haven't gotten it working on the server. The client-side logs haven't been terribly helpful.
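
The double-disconnect and write-after-close tips above both come down to remembering whether a stream has already been shut down. Here is a minimal sketch of one way to guard against both, using only the C++ standard library. None of this is gRPC API: `GuardedStream`, `Write`, `Close`, and the `sink` callback are hypothetical names, and the sink stands in for whatever actually writes to your stream.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical wrapper (not part of gRPC): drops writes that arrive after the
// stream is closed and turns a second Close() into a harmless no-op.
class GuardedStream {
 public:
  // `sink` is a stand-in for the real stream write.
  explicit GuardedStream(std::function<void(const std::string&)> sink)
      : sink_(std::move(sink)) {
    assert(sink_ && "sink must be callable");
  }

  // Returns true if the message was handed to the sink,
  // false if the stream was already closed and the message was dropped.
  bool Write(const std::string& msg) {
    // The lock is held during the write, so writes are serialized and
    // cannot interleave with Close().
    std::lock_guard<std::mutex> lock(mu_);
    if (closed_) return false;
    sink_(msg);
    return true;
  }

  // Returns true only for the one call that actually closed the stream,
  // so the caller knows whether it is the one that should run shutdown logic.
  bool Close() {
    std::lock_guard<std::mutex> lock(mu_);
    if (closed_) return false;
    closed_ = true;
    return true;
  }

 private:
  std::mutex mu_;
  bool closed_ = false;
  std::function<void(const std::string&)> sink_;
};
```

In a setup like the one described above, the observer callback would call `Write()` and stop forwarding updates once it returns false, and each disconnect path would run its shutdown logic only when `Close()` returns true. Holding the mutex during the write keeps the sketch simple; a slow sink would block `Close()`, so a real server might need something finer-grained.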
