[largescale-sig][nova][neutron][oslo] RPC ping


On 7/27/20 7:08 PM, Dan Smith wrote:
> The primary concern was about something other than nova sitting on our
> bus making calls to our internal services. I imagine that the proposal
> to bake it into oslo.messaging is for the same purpose, and I'd probably
> have the same concern. At the time I think we agreed that if we were
> going to support direct-to-service health checks, they should be teensy
> HTTP servers with oslo healthchecks middleware. Further loading down
> rabbit with those pings doesn't seem like the best plan to
> me. Especially since Nova (compute) services already check in over RPC
> periodically and the success of that is discoverable en masse through
> the API.
> --Dan

While I get this concern, we have seen the problem described by the
original poster in production multiple times: nova-compute reports as
healthy and is seen as up through the API, but doesn't process any
messages. A health check going through RabbitMQ would really help spot
those situations, while an additional HTTP server wouldn't.

Have a nice day,

Johannes Kulik
IT Architecture Senior Specialist