CentOS unrecoverable after Ceph issues


Hello,
I do not use CentOS or XFS, but I had a similar issue after an
outage: Ceph didn't release the lock on the RADOS block device. You can
check whether you are facing the same issue I did. Shut down your
instance, then run this command:
rbd -p your-pool-name lock list instance-volume-id
The command should not return any output if your instance is shut down.
If the output reports 1 exclusive lock, just remove it, passing the lock
ID and locker shown in that output:
rbd -p your-pool-name lock remove instance-volume-id lock-id locker
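The check can be wrapped in a small shell function. This is only a minimal sketch: the function name check_rbd_lock and the example pool and volume names are hypothetical, and it relies on the behaviour described here, namely that "rbd lock list" prints nothing when the image has no lock.

```shell
#!/bin/sh
# Sketch: report whether the volume of a stopped instance still carries an
# RBD lock. check_rbd_lock is a hypothetical helper, not part of Ceph.
check_rbd_lock() {
    pool="$1"     # Ceph pool holding the volume
    image="$2"    # RBD image backing the instance volume

    # Assumption: "rbd lock list" prints nothing when no lock is held.
    locks="$(rbd -p "$pool" lock list "$image")" || return 2

    if [ -z "$locks" ]; then
        echo "no lock on $pool/$image"
        return 0
    fi

    # A lock is still held: show it so its ID and locker can be copied.
    echo "lock still held on $pool/$image:"
    echo "$locks"
    return 1
}
```

Calling it as check_rbd_lock your-pool-name instance-volume-id returns 0 when the image is free and 1 when a lock is still listed. Removing the lock is left as a manual step on purpose, so that a lock held by a running instance is not dropped by accident.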
Best Regards,
Romain
On Tue, 2020-07-21 at 14:04 +0100, Grant Morley wrote:
> Hi all,
> We recently had an issue with our ceph cluster which ended up going
> into "Error" status after some drive failures. The system stopped
> allowing writes for a while whilst it recovered. The ceph cluster is
> healthy again but we seem to have a few instances that have corrupt
> filesystems on them. They are all CentOS 7 instances.
> We have got them into rescue mode to try and repair the FS with
> "xfs_repair -L". However when we do that we get this:
> [ 973.026283] XFS (vdb1): Mounting V5 Filesystem
> [ 973.203261] blk_update_request: I/O error, dev vdb, sector 8389693
> [ 973.204746] blk_update_request: I/O error, dev vdb, sector 8390717
> [ 973.206136] blk_update_request: I/O error, dev vdb, sector 8391741
> [ 973.207608] blk_update_request: I/O error, dev vdb, sector 8392765
> [ 973.209544] XFS (vdb1): xfs_do_force_shutdown(0x1) called from line 1236 of file fs/xfs/xfs_buf.c. Return address = 0xffffffffc017a50c
> [ 973.212137] XFS (vdb1): I/O Error Detected. Shutting down filesystem
> [ 973.213429] XFS (vdb1): Please umount the filesystem and rectify the problem(s)
> [ 973.215036] XFS (vdb1): metadata I/O error: block 0x7ffc3d ("xlog_bwrite") error 5 numblks 8192
> [ 973.217201] XFS (vdb1): failed to locate log tail
> [ 973.218239] XFS (vdb1): log mount/recovery failed: error -5
> [ 973.219865] XFS (vdb1): log mount failed
> [ 973.233792] blk_update_request: I/O error, dev vdb, sector 0
> Interestingly any debian based instances we could recover. It just
> seems to be CentOS and having XFS on CentOS and ceph the instances
> don't seem happy. This seems more low level to me in ceph rather than
> a corrupt FS on a guest.
> Does anyone know of any "ceph tricks" that we can use to try and at
> least get an "xfs_repair" running?
> Many thanks,
>
> --
> Grant Morley
> Cloud Lead, Civo Ltd
> www.civo.com