[cinder] Ceph, active-active and no coordination
On Wed, 18 Nov 2020 at 09:33, Gorka Eguileor <geguileo at redhat.com> wrote:
> On 17/11, Giulio Fidente wrote:
> > I am leaving some comments inline and adding on CC some cinder folks who
> > know better
> > On 11/17/20 9:27 PM, Radosław Piliszek wrote:
> > > Dear Cinder Masters,
> > >
> > > I have a question for you. (or two, or several; well, actually the
> > > whole Kolla team has :-) )
> > >
> > > The background is that Kolla has been happily deploying multinode
> > > cinder-volume with the Ceph RBD backend, with no coordination
> > > configured, the cluster parameter unset, host properly set per host,
> > > and backend_host (as well as any other relevant config) normalised
> > > across the cinder-volume hosts.
> > >
> > > The first question is: do we correctly understand that this was an
> > > active-active deployment? Or really something else?
> That is an Active-Active deployment with an Active-Passive
> configuration, so it's a PROBLEM waiting to happen.
> Besides races that could happen in the code because there is no
> coordinator configured (this is less of an issue for the RBD driver than
> for other drivers, but it's still an issue), there's also a problem
> whenever a cinder-volume service starts.
> Any time a cinder-volume service starts, it will mess up many of the
> resources that are being worked on (those with a status ending in 'ing',
> such as 'creating') because it thinks those are resources that it left
> in that state and that need to be cleaned.
> My recommendation is to do this right: configure the cluster option,
> remove the backend_host, and configure the coordinator. Upgrading a
> deployment from that configuration to clustered is relatively easy: we
> just need to leave one of the cinder-volume services with the
> backend_host as it was before; that way when it starts it will
> automatically migrate all the resources from being non-clustered to
> being clustered (an alternative would be to add this command to
> cinder-manage, because I don't think the current "cinder-manage cluster
> rename" will work).
> If deploying the coordinator is an issue, we should at least do the
> other 2 steps. That way we'll get rid of the cleanup issue even if we
> still have the race conditions.
Thank you for the recommendation, Gorka. I think we are now agreed
that this is the right way forward: set cluster, and strongly
recommend a coordinator (we already configure one if the user has
enabled etcd or redis).
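For reference, here is a minimal cinder.conf sketch of what we are aiming
for (the cluster name, the etcd/redis addresses and the backend section
are illustrative placeholders, not necessarily what Kolla Ansible will
template out):

    [DEFAULT]
    enabled_backends = rbd-1
    # Same value on every cinder-volume host that should join the
    # cluster; host stays per-host and backend_host is not set.
    cluster = ceph_cluster

    [coordination]
    # tooz coordination backend; etcd shown here, redis would also work.
    backend_url = etcd3+http://192.0.2.10:2379
    # backend_url = redis://192.0.2.10:6379

    [rbd-1]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool = volumes
    rbd_ceph_conf = /etc/ceph/ceph.conf

Once clustered, the services should appear grouped under the cluster name
in the cluster API (cinder --os-volume-api-version 3.7 cluster-list),
which gives us a way to verify the change took effect.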
We identified 3 cases to consider.
1. Train deployments with backend_host configured as per documentation
2. Ussuri deployments with backend_host configured, due to upgrade from Train.
3. Ussuri deployments without backend_host configured, either due to a
fresh Ussuri deployment or a Train upgrade followed by removal of
documented backend_host configuration in favour of new Kolla Ansible
We need to consider how both minor and major upgrades will apply
whatever changes we come up with.
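To make Gorka's migration suggestion concrete for cases 1 and 2, my
reading of it is a transitional configuration along these lines (the
backend section name and the old backend_host value are just examples of
whatever is deployed today):

    # On every cinder-volume host: add the cluster option.
    [DEFAULT]
    cluster = ceph_cluster

    # On exactly ONE host, temporarily keep the previously configured
    # backend_host; when that service starts it will migrate the existing
    # non-clustered resources into the cluster. Remove backend_host from
    # all other hosts, and from this one once the migration has happened.
    [rbd-1]
    backend_host = rbd:volumes

Presumably the upgrade logic then only needs to pick which host keeps
backend_host during the transition.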
> > this configuration is similar to that deployed by tripleo, except
> > tripleo would use pacemaker to always have a single cinder-volume running
> > the reason being that, as far as I understand, without a coordinator the
> > first cinder-volume within a given 'backend_host' group to consume the
> > message from the amqp queue will start executing the task ... so if
> > another task is queued (or is in progress) for the same volume, there
> > is a risk of data corruption
> > > Now, there have been no reports that it misbehaved for anyone. It
> > > certainly has not for any Kolla core. The fact is it was brought to
> > > our attention because, due to the drop of Kolla-deployed Ceph, the
> > > recommendation to set backend_host was not present and users tripped
> > > over non-uniform backend_host. And this is expected, of course.
> > >
> > > The second and final question is, building up on the first one, were
> > > we doing it wrong all the time?
> > > (plus extras: Why did it work? Were there any quirks? What should we do?)
> > I think the correct setup for active/active should be:
> > - do not use the same host or backend_host
> > - do set cluster to the same value across cluster members
> > - use a coordinator
> > > PS: Please let me know if this thought process is actually
> > > Ceph-independent as well.
> > I don't think it's Ceph-dependent; my understanding is that
> > active/active is only possible with some drivers because not every
> > driver is safe to use in an active/active configuration; some can, for
> > example, have issues handling the database
> > Ceph is just one of those drivers which behaves correctly in
> > active/active configuration
> > --
> > Giulio Fidente
> > GPG KEY: 08D733BA