[largescale-sig][nova][neutron][oslo] RPC ping
On 7/28/20 4:11 PM, Ken Giusti wrote:
> On Tue, Jul 28, 2020 at 4:48 AM Bogdan Dobrelya <bdobreli at redhat.com> wrote:
> On 7/27/20 7:08 PM, Dan Smith wrote:
> >> Tagging with Nova and Neutron as they are mentioned and I thought some
> >> people from those teams had opinions on this.
> > Nova already implements ping() on the compute RPC interface, which we
> > use to make sure compute waits to start up until conductor is ready
> > to do its bidding. So if a new obligatory RPC server method is
> > added called ping(), it will break us.
> >> Can you refresh my memory on why we dropped this before? I recall
> >> talking about it in Denver, but I can't for the life of me remember
> >> what the conclusion was. Did we intend to use something else for
> >> this that has since fallen through?
> > The prior conversation I recall was about helm sitting on our bus to
> > (ab)use our ping method for health checks:
> > I believe that has since been reverted.
> > The primary concern was about something other than nova sitting on our
> > bus making calls to our internal services. I imagine that the desire
> > to bake it into oslo.messaging is for the same purpose, and I'd
> > have the same concern. At the time I think we agreed that if we were
> > going to support direct-to-service health checks, they should be
> > HTTP servers with oslo healthchecks middleware. Further loading down
> > rabbit with those pings doesn't seem like the best plan to
> > me. Especially since Nova (compute) services already check in over RPC
> > periodically and the success of that is discoverable en masse through
> > the API.
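For reference, the HTTP-based health check mentioned above could look
roughly like the sketch below. It is a stdlib-only stand-in written for
illustration, not the oslo.middleware healthcheck itself; the
/healthcheck path, the JSON body and the serve() helper are assumptions
of mine.

```python
import json
from wsgiref.simple_server import make_server


def healthcheck_app(environ, start_response):
    """Tiny WSGI app: 200 on /healthcheck (assumed path), 404 elsewhere."""
    if environ.get("PATH_INFO") == "/healthcheck":
        body = json.dumps({"status": "OK"}).encode("utf-8")
        status = "200 OK"
        content_type = "application/json"
    else:
        body = b"not found"
        status = "404 Not Found"
        content_type = "text/plain"
    start_response(status, [
        ("Content-Type", content_type),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(host="127.0.0.1", port=8080):
    """Hypothetical helper: serve the check until interrupted."""
    with make_server(host, port, healthcheck_app) as httpd:
        httpd.serve_forever()
```

A monitoring tool would then poll that URL over HTTP and never touch the
message bus.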
> Having RPC ping in the common messaging library could improve liveness
> handling of long-running APIs, like listing multiple Neutron ports or
> Heat objects with full details, or maybe running some longish Mistral
> workflows. Of course, it should be done without breaking what already
> exists in Nova.
> Not sure this is related to your concern about long-running APIs, but
> O.M. has an optional RPC call heartbeat monitor that verifies the
> connectivity to the server while the call is in progress. See the
> description of call_monitor_timeout in the RPC client docs [0].
Correct, but heartbeats didn't prove to be a reliable solution. There
were WSGI & eventlet related issues with running heartbeats. I can't
recall what the final outcome of that discussion was or what the fix
was. So relying on explicit pings sent by clients could perhaps work better.
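To make the idea concrete, here is a rough sketch of what I mean by an
explicit client-side ping. It does not use oslo.messaging at all: a
thread and an in-process queue stand in for the server and the bus, and
ToyServer, ping() and the "pong" reply are hypothetical names made up
for this example.

```python
import queue
import threading


class ToyServer:
    """Stand-in for an RPC server (NOT oslo.messaging)."""

    def __init__(self):
        self.requests = queue.Queue()
        # When set, the server keeps consuming requests but never replies,
        # which simulates a hung service.
        self.stalled = threading.Event()
        self._thread = threading.Thread(target=self._serve, daemon=True)

    def start(self):
        self._thread.start()

    def _serve(self):
        while True:
            method, reply_q = self.requests.get()
            if self.stalled.is_set():
                continue  # hung: swallow the request without answering
            if method == "ping":
                reply_q.put("pong")


def ping(server, timeout):
    """Send an explicit ping; True only if a reply arrives within timeout."""
    reply_q = queue.Queue(maxsize=1)
    server.requests.put(("ping", reply_q))
    try:
        return reply_q.get(timeout=timeout) == "pong"
    except queue.Empty:
        return False


if __name__ == "__main__":
    server = ToyServer()
    server.start()
    print(ping(server, timeout=1.0))   # healthy server
    server.stalled.set()
    print(ping(server, timeout=0.2))   # hung server
```

The point is that the client decides when to probe and how long to wait,
so the check does not depend on a background heartbeat thread on either
side.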
> [0] https://docs.openstack.org/oslo.messaging/latest/reference/rpcclient.html
> > --Dan
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
> Ken Giusti (kgiusti at gmail.com)