[largescale-sig][nova][neutron][oslo] RPC ping


On 7/28/20 4:11 PM, Ken Giusti wrote:
> 
> 
> On Tue, Jul 28, 2020 at 4:48 AM Bogdan Dobrelya <bdobreli at redhat.com 
> <mailto:bdobreli at redhat.com>> wrote:
> 
>     On 7/27/20 7:08 PM, Dan Smith wrote:
>      >> Tagging with Nova and Neutron as they are mentioned and I thought some
>      >> people from those teams had opinions on this.
>      >
>      > Nova already implements ping() on the compute RPC interface, which we
>      > use to make sure compute waits to start up until conductor is available
>      > to do its bidding. So if a new obligatory RPC server method is actually
>      > added called ping(), it will break us.
>      >
>      >> Can you refresh my memory on why we dropped this before? I recall
>      >> talking about it in Denver, but I can't for the life of me remember
>      >> what the conclusion was. Did we intend to use something else for this
>      >> that has since fallen through?
>      >
>      > The prior conversation I recall was about helm sitting on our bus to
>      > (ab)use our ping method for health checks:
>      >
>      > https://opendev.org/openstack/openstack-helm/commit/baf5356a4fb61590a95f64a63c0dcabfebb3baaa
>      >
>      > I believe that has since been reverted.
>      >
>      > The primary concern was about something other than nova sitting on our
>      > bus making calls to our internal services. I imagine that the proposal
>      > to bake it into oslo.messaging is for the same purpose, and I'd probably
>      > have the same concern. At the time I think we agreed that if we were
>      > going to support direct-to-service health checks, they should be teensy
>      > HTTP servers with oslo healthchecks middleware. Further loading down
>      > rabbit with those pings doesn't seem like the best plan to
>      > me. Especially since Nova (compute) services already check in over RPC
>      > periodically and the success of that is discoverable en masse through
>      > the API.
> 
>     Having RPC ping in the common messaging library could improve liveness
>     handling of long-running APIs, like listing multiple Neutron ports or
>     Heat objects with full details, or maybe running some longish Mistral
>     workflow. Of course, it should be done without breaking what already
>     exists in Nova.
> 
> 
> Not sure this is related to your concern about long-running APIs, but 
> oslo.messaging has an optional RPC call heartbeat monitor that verifies 
> connectivity to the server while the call is in progress.  See the 
> description of call_monitor_timeout in the RPC client docs [0].

Correct, but heartbeats did not prove to be a reliable solution. There 
were WSGI and eventlet related issues [1] with running heartbeats. I can't 
recall what the final outcome of that discussion was, or what the fix 
was. So relying on explicit pings sent by clients might work better.

[1] https://bugs.launchpad.net/tripleo/+bug/1829062
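
To illustrate the explicit client-side ping idea, here is a minimal sketch in 
plain Python. It does not use oslo.messaging and is not a proposed 
implementation: ping(), healthy_server() and hung_server() are hypothetical 
names, and the "call" argument only stands in for whatever blocking RPC call 
a real client would make.

```python
import concurrent.futures
import time


def ping(call, timeout):
    """Return True if call("ping") answers "pong" within `timeout` seconds.

    `call` is a hypothetical stand-in for a blocking RPC call.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(call, "ping")
    try:
        return future.result(timeout=timeout) == "pong"
    except concurrent.futures.TimeoutError:
        # No reply before the deadline: treat the service as not alive.
        return False
    except Exception:
        # The call itself failed (for example, a transport error).
        return False
    finally:
        # Do not block on a hung call; the worker thread finishes on its own.
        pool.shutdown(wait=False)


def healthy_server(method):
    """Hypothetical service that replies immediately."""
    return "pong"


def hung_server(method):
    """Hypothetical service that replies too late to be useful."""
    time.sleep(0.3)
    return "pong"


if __name__ == "__main__":
    print("healthy:", ping(healthy_server, timeout=0.2))
    print("hung:", ping(hung_server, timeout=0.05))
```

The point of the sketch is only that the client decides liveness from its own 
deadline, rather than depending on a heartbeat thread running inside the 
server process.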

> 
> 0: https://docs.openstack.org/oslo.messaging/latest/reference/rpcclient.html
> 
> 
> 
>      >
>      > --Dan
>      >
> 
> 
>     -- 
>     Best regards,
>     Bogdan Dobrelya,
>     Irc #bogdando
> 
> 
> 
> 
> -- 
> Ken Giusti  (kgiusti at gmail.com <mailto:kgiusti at gmail.com>)


-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando