[nova] MessageUndeliverable from nova-conductor and MessagingTimeout from nova-compute


The problem is not the connection but message delivery, so I have
changed the subject.
I am running RabbitMQ 3.8.4 and Nova 21.0.0.

When I check the RabbitMQ queues for the compute nodes, I see 3 messages in
compute.compute-1 and 0 messages in the queues of the other compute nodes.
In the nova-conductor logs of all 3 instances, I see
"oslo_messaging.exceptions.MessageUndeliverable".
In the nova-compute log on compute-1, I see
"Timed out waiting for nova-conductor."
In fact, 4 out of 5 compute nodes log this warning.
One compute node logs the exception
"oslo_messaging.exceptions.MessagingTimeout".

"rabbitmqctl list_bindings | grep compute-1" shows the following:
=================================
        exchange        compute.compute-1       queue   compute.compute-1       []
nova    exchange        compute.compute-1       queue   compute.compute-1       []
=================================
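
The binding check above could be scripted roughly as follows. This is only a
sketch: it assumes that "rabbitmqctl list_bindings" prints tab-separated
columns in its default order (source_name, source_kind, destination_name,
destination_kind, routing_key, arguments), and the names parse_bindings,
missing_bindings and SAMPLE are made up for illustration. It only tells
whether a binding is listed, not whether the broker actually routes through it.

```python
from collections import defaultdict

# The two rows from the listing above, rebuilt with tabs (assumed separator).
# An empty first column is the default (nameless) exchange.
SAMPLE = (
    "\texchange\tcompute.compute-1\tqueue\tcompute.compute-1\t[]\n"
    "nova\texchange\tcompute.compute-1\tqueue\tcompute.compute-1\t[]\n"
)


def parse_bindings(text):
    """Map each destination queue to the set of source exchanges bound to it."""
    sources = defaultdict(set)
    for line in text.splitlines():
        fields = line.split("\t")
        # Skip banner/header lines, blank lines and exchange-to-exchange bindings.
        if len(fields) < 5 or fields[3] != "queue":
            continue
        sources[fields[2]].add(fields[0] or "(default)")
    return dict(sources)


def missing_bindings(text, queue, expected=("(default)", "nova")):
    """Return the expected source exchanges that have no listed binding to queue."""
    found = parse_bindings(text).get(queue, set())
    return [name for name in expected if name not in found]


if __name__ == "__main__":
    # Empty list: both bindings for compute.compute-1 are listed.
    print(missing_bindings(SAMPLE, "compute.compute-1"))
```

In practice the text would come from the rabbitmqctl output instead of SAMPLE.
An empty result for compute.compute-1 matches what I see here: both bindings
are listed, yet messages are still reported as undeliverable.
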
Is this a known issue? How did it happen?
What causes it, and is there any way to prevent it?


Thanks!
Tony
> -----Original Message-----
> From: Arnaud Morin <arnaud.morin at gmail.com>
> Sent: Saturday, November 14, 2020 2:10 AM
> To: Tony Liu <tonyliu0592 at hotmail.com>
> Cc: OpenStack Discuss <openstack-discuss at lists.openstack.org>
> Subject: Re: [nova-compute] not reconnect to rabbitmq?
> 
> Hello,
> 
> What we noticed in our case is that nova compute is actually
> reconnecting, but cannot communicate with the conductor because the
> queue binding is either absent or not working anymore.
> 
> So, first, which version of nova are you running?
> Which version of rabbitmq? (some bugs related to shadow bindings were
> fixed after 3.7.x; I don't remember x)
> 
> Can you check if you have any queue related to your compute?
> Something like this:
> rabbitmqctl list_queues | grep mycompute
> 
> Also check the bindings, better using the management interface or
> rabbitmqadmin:
> rabbitmqadmin list bindings | grep mycompute
> 
> What usually fixed our issue in the past was to delete and recreate the
> binding (easy to do from the management interface).
> 
> Cheers,
> 
> --
> Arnaud Morin
> 
> On 14.11.20 - 00:34, Tony Liu wrote:
> > Hi,
> >
> > I have a Ussuri deployment on CentOS 8.
> > I noticed that when the connection from nova-compute to rabbitmq
> > is broken, nova-compute doesn't reconnect.
> > I checked nova-conductor, which seems to keep trying to reconnect to
> > rabbitmq when the connection is broken, but nova-compute doesn't seem
> > to do the same. I've seen it a few times: after I fix rabbitmq and
> > bring it back, nova-conductor reconnects, but nova-compute
> > doesn't, and I have to restart it manually. Has anyone else had a
> > similar experience?
> > Am I missing anything?
> >
> >
> > Thanks!
> > Tony
> >
> >