Clear stuck queue?/Remove all tombstones and re-fsck

AndyNJ
Posts: 38
Joined: Tue Feb 15, 2011 8:43 am

Postby AndyNJ » Sun Jul 24, 2011 10:43 am

I've been stuck with the below state for days. I've restarted the system and Greyhole and canceled and restarted the fsck task multiple times, but there are still files that don't get out of the queue. I'm also missing a lot of tombstones that aren't being regenerated.

Code: Select all

Greyhole Work Queue Statistics
==============================

This table gives you the number of pending operations queued for the Greyhole daemon, per share.

                 Write  Delete  Rename  Repair (fsck)
Archive              0       0       0              0
Backup               0       0       0              0
Books                0       0       0              0
Docs                 0       0       0              0
Music               10       0       0              0
Photos               5       1       2              0
Software             0       0       0              0
TimeMachine        882       0       0              0
TimeMachine_DH    7448    3261      51              0
Videos             228       0       0              0
VirtualMachines    426      13      29              0
===============
Total             8999    3275      82              0

The following is the number of pending operations that the Greyhole daemon still needs to parse.
Until it does, the nature of those operations is unknown.
Spooled operations that have been parsed will be listed above and disappear from the count below.

Spooled  65949
Is there an easy way to either completely clear this queue and have it start over or just remove ALL tombstones in one shot and have Greyhole rebuild them all from scratch? I guess I actually need to do both.

lrevxl
Posts: 82
Joined: Fri Mar 04, 2011 7:23 pm
Location: Chicago, IL, USA

Postby lrevxl » Mon Jul 25, 2011 6:20 pm

It's possible to remove all tombstones and regenerate them, but I don't precisely see how that's going to fix your issues. What's going on in your Greyhole logs? Do you see file operations going through? What do you mean tombstones are 'missing'? Are you seeing errors in the logs? Did you delete the tombstones?

Postby AndyNJ » Mon Jul 25, 2011 6:31 pm

I haven't done anything yet.

I've got files (about 100 of them) that appear in the graveyard on my pool disks, but I can't actually access the files from the shares (they don't show up there).

My queue currently looks pretty similar to the one I posted above (over 24 hours ago) and the current status for greyhole is that it's optimizing MySQL tables.

My logs have all kinds of errors in them. There were some hardware issues that should be fixed, but I need to get my greyhole database and pointers in order now. I literally have no idea what to do and I've lost close to a week to this and I really need to get my machine working properly again.

Postby lrevxl » Mon Jul 25, 2011 7:11 pm

Are the corresponding files also on the pool drives?

Are you certain Greyhole is actually running? I've never seen optimizing the tables take more than a few seconds. What do you see if you check the status of the service? As root, run `service greyhole status`.

Postby AndyNJ » Mon Jul 25, 2011 7:42 pm

It says greyhole is running, but now the queue is empty (wasn't when I wrote the last reply) and the status is still optimizing the MySQL tables.

A quick spot check seems to show the files in the gh folder on pool drives.

Postby lrevxl » Tue Jul 26, 2011 4:57 am

Ah, you're running greyhole --status, right? That simply tells you what the last logged action was. If you were to look at /var/log/greyhole.log you'd see a lot of 'Nothing to do... Sleeping.' But your files are there and your queue is empty, so you're good now, right?

Postby AndyNJ » Tue Jul 26, 2011 5:31 am

The queue is mostly empty and my files are in the pool drives, but there are no pointers to many of the files in the landing zone. I can't access them via the shares.

Postby AndyNJ » Tue Jul 26, 2011 5:34 am

Also, I'm noticing a lot of input/output errors in the greyhole log.

Code: Select all

PHP Warning (2): mkdir(): Input/output error in /usr/bin/greyhole on line 2600
PHP Warning (2): mkdir(): Input/output error in /usr/bin/greyhole on line 2604

Postby lrevxl » Tue Jul 26, 2011 6:30 am

AndyNJ wrote:
Also, I'm noticing a lot of input/output errors in the greyhole log.

Code: Select all

PHP Warning (2): mkdir(): Input/output error in /usr/bin/greyhole on line 2600
PHP Warning (2): mkdir(): Input/output error in /usr/bin/greyhole on line 2604
That looks like you're having hardware issues. Take a look at your /var/log/messages, I'm guessing you'll have an unpleasant surprise waiting for you there.

Where is your landing zone located? Putting two and two together (you're getting PHP I/O errors and no symlinks are being created in the LZ), I'm guessing whatever drive the LZ is on is having issues.
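The check of /var/log/messages suggested here can be sketched in a few lines of Python. This is only an illustration, not part of Greyhole: the function names are made up, and the regular expression assumes the kernel reports failures in the `end_request: I/O error, dev <name>, sector <n>` form, which may differ between kernel versions.

```python
import re
from collections import defaultdict

# Assumed kernel message form, e.g.:
#   end_request: I/O error, dev sdb, sector 1334241864
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def summarize_io_errors(lines):
    """Return {device: sorted list of distinct failing sectors}."""
    sectors = defaultdict(set)
    for line in lines:
        match = IO_ERROR.search(line)
        if match:
            sectors[match.group(1)].add(int(match.group(2)))
    return {dev: sorted(found) for dev, found in sectors.items()}


def summarize_log_file(path="/var/log/messages"):
    """Scan a syslog file (usually readable only by root) for I/O errors."""
    with open(path, errors="replace") as log:
        return summarize_io_errors(log)
```

Running `summarize_log_file()` as root prints nothing by itself; print its return value to see which devices are reporting errors and on which sectors. A device that keeps failing on the same sector, as opposed to scattered ones, points at a bad spot on the disk surface rather than a cabling problem.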

Postby AndyNJ » Tue Jul 26, 2011 7:31 am

You may be right, here's a section of the log:

Code: Select all

Jul 26 09:52:41 donbot kernel: [164377.821100] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:52:41 donbot kernel: [164377.821111] ata4.00: irq_stat 0x40000001
Jul 26 09:52:41 donbot kernel: [164377.821120] ata4.00: failed command: READ DMA EXT
Jul 26 09:52:41 donbot kernel: [164377.821136] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:52:41 donbot kernel: [164377.821140] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:52:41 donbot kernel: [164377.821209] ata4.00: status: { DRDY ERR }
Jul 26 09:52:41 donbot kernel: [164377.821217] ata4.00: error: { UNC }
Jul 26 09:52:42 donbot kernel: [164378.805382] ata4.00: configured for UDMA/33
Jul 26 09:52:42 donbot kernel: [164378.805412] ata4: EH complete
Jul 26 09:52:46 donbot kernel: [164382.020191] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:52:46 donbot kernel: [164382.020202] ata4.00: irq_stat 0x40000001
Jul 26 09:52:46 donbot kernel: [164382.020211] ata4.00: failed command: READ DMA EXT
Jul 26 09:52:46 donbot kernel: [164382.020227] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:52:46 donbot kernel: [164382.020231] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:52:46 donbot kernel: [164382.020240] ata4.00: status: { DRDY ERR }
Jul 26 09:52:46 donbot kernel: [164382.020246] ata4.00: error: { UNC }
Jul 26 09:52:47 donbot kernel: [164383.788504] ata4.00: configured for UDMA/33
Jul 26 09:52:47 donbot kernel: [164383.788533] ata4: EH complete
Jul 26 09:52:51 donbot kernel: [164387.012060] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:52:51 donbot kernel: [164387.012071] ata4.00: irq_stat 0x40000001
Jul 26 09:52:51 donbot kernel: [164387.012079] ata4.00: failed command: READ DMA EXT
Jul 26 09:52:51 donbot kernel: [164387.012096] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:52:51 donbot kernel: [164387.012099] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:52:51 donbot kernel: [164387.012108] ata4.00: status: { DRDY ERR }
Jul 26 09:52:51 donbot kernel: [164387.012113] ata4.00: error: { UNC }
Jul 26 09:52:52 donbot kernel: [164388.501109] ata4.00: configured for UDMA/33
Jul 26 09:52:52 donbot kernel: [164388.501179] ata4: EH complete
Jul 26 09:52:55 donbot kernel: [164391.715647] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:52:55 donbot kernel: [164391.715658] ata4.00: irq_stat 0x40000001
Jul 26 09:52:55 donbot kernel: [164391.715667] ata4.00: failed command: READ DMA EXT
Jul 26 09:52:55 donbot kernel: [164391.715683] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:52:55 donbot kernel: [164391.715687] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:52:55 donbot kernel: [164391.715696] ata4.00: status: { DRDY ERR }
Jul 26 09:52:55 donbot kernel: [164391.715701] ata4.00: error: { UNC }
Jul 26 09:52:57 donbot kernel: [164393.483906] ata4.00: configured for UDMA/33
Jul 26 09:52:57 donbot kernel: [164393.483934] ata4: EH complete
Jul 26 09:53:00 donbot kernel: [164396.707521] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:00 donbot kernel: [164396.707532] ata4.00: irq_stat 0x40000001
Jul 26 09:53:00 donbot kernel: [164396.707540] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:00 donbot kernel: [164396.707557] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:00 donbot kernel: [164396.707561] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:00 donbot kernel: [164396.707569] ata4.00: status: { DRDY ERR }
Jul 26 09:53:00 donbot kernel: [164396.707575] ata4.00: error: { UNC }
Jul 26 09:53:02 donbot kernel: [164398.196929] ata4.00: configured for UDMA/33
Jul 26 09:53:02 donbot kernel: [164398.196958] ata4: EH complete
Jul 26 09:53:07 donbot kernel: [164403.175830] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:07 donbot kernel: [164403.175841] ata4.00: irq_stat 0x40000001
Jul 26 09:53:07 donbot kernel: [164403.175849] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:07 donbot kernel: [164403.175865] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:07 donbot kernel: [164403.175869] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:07 donbot kernel: [164403.175878] ata4.00: status: { DRDY ERR }
Jul 26 09:53:07 donbot kernel: [164403.175884] ata4.00: error: { UNC }
Jul 26 09:53:08 donbot kernel: [164404.664905] ata4.00: configured for UDMA/33
Jul 26 09:53:08 donbot kernel: [164404.664936] sd 3:0:0:0: [sdb] Unhandled sense code
Jul 26 09:53:08 donbot kernel: [164404.664942] sd 3:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 26 09:53:08 donbot kernel: [164404.664952] sd 3:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Jul 26 09:53:08 donbot kernel: [164404.664963] Descriptor sense data with sense descriptors (in hex):
Jul 26 09:53:08 donbot kernel: [164404.664969] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jul 26 09:53:08 donbot kernel: [164404.664989] 4f 86 ea 48
Jul 26 09:53:08 donbot kernel: [164404.664998] sd 3:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Jul 26 09:53:08 donbot kernel: [164404.665010] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 4f 86 ea 48 00 00 08 00
Jul 26 09:53:08 donbot kernel: [164404.665029] end_request: I/O error, dev sdb, sector 1334241864
Jul 26 09:53:08 donbot kernel: [164404.665077] ata4: EH complete
Jul 26 09:53:15 donbot kernel: [164411.635082] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:15 donbot kernel: [164411.635093] ata4.00: irq_stat 0x40000001
Jul 26 09:53:15 donbot kernel: [164411.635101] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:15 donbot kernel: [164411.635118] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:15 donbot kernel: [164411.635122] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:15 donbot kernel: [164411.635180] ata4.00: status: { DRDY ERR }
Jul 26 09:53:15 donbot kernel: [164411.635189] ata4.00: error: { UNC }
Jul 26 09:53:17 donbot kernel: [164413.403374] ata4.00: configured for UDMA/33
Jul 26 09:53:17 donbot kernel: [164413.403403] ata4: EH complete
Jul 26 09:53:20 donbot kernel: [164416.626957] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:20 donbot kernel: [164416.626968] ata4.00: irq_stat 0x40000001
Jul 26 09:53:20 donbot kernel: [164416.626976] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:20 donbot kernel: [164416.626993] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:20 donbot kernel: [164416.626997] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:20 donbot kernel: [164416.627005] ata4.00: status: { DRDY ERR }
Jul 26 09:53:20 donbot kernel: [164416.627011] ata4.00: error: { UNC }
Jul 26 09:53:22 donbot kernel: [164418.116573] ata4.00: configured for UDMA/33
Jul 26 09:53:22 donbot kernel: [164418.116602] ata4: EH complete
Jul 26 09:53:25 donbot kernel: [164421.331545] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:25 donbot kernel: [164421.331556] ata4.00: irq_stat 0x40000001
Jul 26 09:53:25 donbot kernel: [164421.331564] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:25 donbot kernel: [164421.331581] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:25 donbot kernel: [164421.331585] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:25 donbot kernel: [164421.331593] ata4.00: status: { DRDY ERR }
Jul 26 09:53:25 donbot kernel: [164421.331599] ata4.00: error: { UNC }
Jul 26 09:53:27 donbot kernel: [164423.099826] ata4.00: configured for UDMA/33
Jul 26 09:53:27 donbot kernel: [164423.099855] ata4: EH complete
Jul 26 09:53:30 donbot kernel: [164426.323397] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:30 donbot kernel: [164426.323408] ata4.00: irq_stat 0x40000001
Jul 26 09:53:30 donbot kernel: [164426.323416] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:30 donbot kernel: [164426.323433] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:30 donbot kernel: [164426.323437] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:30 donbot kernel: [164426.323445] ata4.00: status: { DRDY ERR }
Jul 26 09:53:30 donbot kernel: [164426.323451] ata4.00: error: { UNC }
Jul 26 09:53:33 donbot kernel: [164429.572948] ata4.00: configured for UDMA/33
Jul 26 09:53:33 donbot kernel: [164429.572977] ata4: EH complete
Jul 26 09:53:36 donbot kernel: [164432.790722] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:36 donbot kernel: [164432.790733] ata4.00: irq_stat 0x40000001
Jul 26 09:53:36 donbot kernel: [164432.790741] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:36 donbot kernel: [164432.790758] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:36 donbot kernel: [164432.790762] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:36 donbot kernel: [164432.790771] ata4.00: status: { DRDY ERR }
Jul 26 09:53:36 donbot kernel: [164432.790777] ata4.00: error: { UNC }
Jul 26 09:53:40 donbot kernel: [164436.041375] ata4.00: configured for UDMA/33
Jul 26 09:53:40 donbot kernel: [164436.041404] ata4: EH complete
Jul 26 09:53:43 donbot kernel: [164439.259033] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Jul 26 09:53:43 donbot kernel: [164439.259044] ata4.00: irq_stat 0x40000001
Jul 26 09:53:43 donbot kernel: [164439.259053] ata4.00: failed command: READ DMA EXT
Jul 26 09:53:43 donbot kernel: [164439.259070] ata4.00: cmd 25/00:08:48:ea:86/00:00:4f:00:00/e0 tag 0 dma 4096 in
Jul 26 09:53:43 donbot kernel: [164439.259074] res 51/40:08:48:ea:86/00:00:4f:00:00/e0 Emask 0x9 (media error)
Jul 26 09:53:43 donbot kernel: [164439.259082] ata4.00: status: { DRDY ERR }
Jul 26 09:53:43 donbot kernel: [164439.259088] ata4.00: error: { UNC }
Jul 26 09:53:44 donbot kernel: [164440.244180] ata4.00: configured for UDMA/33
Jul 26 09:53:44 donbot kernel: [164440.244210] sd 3:0:0:0: [sdb] Unhandled sense code
Jul 26 09:53:44 donbot kernel: [164440.244216] sd 3:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jul 26 09:53:44 donbot kernel: [164440.244225] sd 3:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Jul 26 09:53:44 donbot kernel: [164440.244236] Descriptor sense data with sense descriptors (in hex):
Jul 26 09:53:44 donbot kernel: [164440.244242] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Jul 26 09:53:44 donbot kernel: [164440.244263] 4f 86 ea 48
Jul 26 09:53:44 donbot kernel: [164440.244272] sd 3:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Jul 26 09:53:44 donbot kernel: [164440.244284] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 4f 86 ea 48 00 00 08 00
Jul 26 09:53:44 donbot kernel: [164440.244303] end_request: I/O error, dev sdb, sector 1334241864
Jul 26 09:53:44 donbot kernel: [164440.244356] ata4: EH complete
It keeps mentioning 'sdb', but I'm not sure off the top of my head which drive that is. How can I check that via the terminal (I only have SSH access at the moment)? The LZ is a separate partition on my system drive, which hasn't given any other indication of problems.
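One possible way to answer that over SSH, sketched in Python under a few assumptions: the box is a Linux system that populates `/dev/disk/by-id` (names there normally embed the drive model and serial number), and partitions are mounted by their plain `/dev/sdXN` names, so they show up that way in `/proc/mounts`. The function names and the mount path used in the comment are hypothetical and not anything Greyhole provides.

```python
import os


def device_ids(device, by_id_dir="/dev/disk/by-id"):
    """Return the by-id names (model and serial) that resolve to /dev/<device>."""
    target = os.path.join("/dev", device)
    names = []
    for name in sorted(os.listdir(by_id_dir)):
        # Each entry is a symlink such as ../../sdb; resolve it to compare.
        if os.path.realpath(os.path.join(by_id_dir, name)) == target:
            names.append(name)
    return names


def mounts_on(device, mounts_text):
    """Return (partition, mount point) pairs whose source starts with /dev/<device>.

    mounts_text is the content of /proc/mounts. Filesystems mounted through
    LVM, device-mapper, or a UUID alias will not match this simple prefix test.
    """
    prefix = "/dev/" + device
    pairs = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith(prefix):
            pairs.append((fields[0], fields[1]))
    return pairs


# Example with a made-up mount table (the pool path is hypothetical):
#   mounts_on("sdb", "/dev/sdb1 /mnt/pool1 ext4 rw 0 0\n")
#   gives [("/dev/sdb1", "/mnt/pool1")]
```

On the live machine, `device_ids("sdb")` and `mounts_on("sdb", open("/proc/mounts").read())` together should show which physical drive sdb is and whether the landing zone or a pool drive lives on it. If the system drive turns out to be sda, the errors above are coming from a different disk than the one holding the LZ.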
