[largescale-sig][nova][neutron][oslo] RPC ping
On 7/29/20 12:26 AM, Dan Smith wrote:
>> Correct, but heartbeats did not prove to be a reliable solution. There
>> were WSGI & eventlet related issues with running heartbeats. I
>> can't recall what the final outcome of that discussion was or what
>> the fix was. So relying on explicit pings sent by clients could
>> perhaps work better.
>>  https://bugs.launchpad.net/tripleo/+bug/1829062
> There are two types of heartbeats in and around oslo.messaging, which is
> why call_monitor was used for the long-running RPC thing. The bug you're
> referencing is, I believe, talking about heartbeating the api->rabbit
> connection, and has nothing to do with service-to-service pinging, which
> this thread is about.
> The call_monitor stuff Ken mentioned requires the *server* side to do
> the heartbeating, so something like nova-compute or
> nova-conductor. Those things aren't running under uwsgi and don't have
> any problems with threading to accomplish those goals.
> So, if we're talking about generic ping() to provide a robust
> long-running RPC call, oslo.messaging already does this (if you ask for
> it). Otherwise, a generic service-to-service ping() doesn't, as was
> mentioned, really mean anything at all about the ability to do
> meaningful work (other than further saturate the message bus).
Thank you for the great information, Dan and Ken.
Then please disregard the aspect I mistakenly highlighted. I didn't want
to derail the thread with that apparently unrelated side case. I believe
the original intention for RPC ping was to have something initiated by
clients, not by the server side? That may be useful when running in a
Kubernetes pod with liveness/readiness probes set up. While a readiness
probe may not be the best fit for RPC ping, a liveness probe built on it
seems like a much better way to check liveness than just checking the
TCP connection to the rabbit port?
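
To illustrate why a bare TCP check tells a probe so little, here is a
minimal sketch. It does not use oslo.messaging or RabbitMQ at all: the
"ping"/"pong" line protocol, the toy server and all function names are
hypothetical stand-ins for a real client-initiated RPC ping. It only
shows that a hung service still accepts TCP connections, so a port check
passes while an application-level ping fails.

```python
import socket
import threading


def tcp_port_open(host, port, timeout=1.0):
    """Probe 1: succeeds as soon as the TCP handshake completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def app_ping(host, port, timeout=1.0):
    """Probe 2: succeeds only if the peer answers "ping" with "pong"."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(b"ping\n")
            return sock.recv(16) == b"pong\n"
    except OSError:  # includes timeouts and refused/reset connections
        return False


def start_server(responsive):
    """Start a toy server on an ephemeral port; return (port, stop)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    held = []  # keep accepted sockets open so a hung server never closes them

    def serve():
        while True:
            try:
                conn, _ = listener.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)
            if responsive:
                try:
                    if conn.recv(16) == b"ping\n":
                        conn.sendall(b"pong\n")
                except OSError:
                    pass
            # A non-responsive server accepts the connection and then ignores it.

    threading.Thread(target=serve, daemon=True).start()

    def stop():
        listener.close()
        for conn in held:
            conn.close()

    return listener.getsockname()[1], stop


if __name__ == "__main__":
    port, stop = start_server(responsive=False)  # simulate a hung service
    print("tcp check:", tcp_port_open("127.0.0.1", port))
    print("app ping: ", app_ping("127.0.0.1", port, timeout=0.5))
    stop()
```

In a pod, something like app_ping() would be what an exec-style liveness
probe runs, with the real RPC ping in place of the toy protocol.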