Hey there - if this is the wrong place to post this, I'll happily post elsewhere. I looked for a troubleshooting forum, but couldn't find one.
I have been running my HDA for months now, and it's been awesome. I added a new drive last night (I had added a few before, so this was nothing new), and since then my HDA has been having issues. Every couple of hours it goes down, causing my whole network to lose internet connectivity (as it handles my DHCP and DNS), and all of the fileshares are gone. The server itself appears to be running (all fans and lights are still on), and I can ping it, but I can't reach it via SSH. I have to hard power off the server and turn it back on to get functionality back.
The only change since yesterday, when it was working, was adding the drive, during which I updated gparted. I removed the drive in case it was causing the issue, but the problem is still occurring.
I'm still not a Linux pro, so I have no idea where to even begin with troubleshooting. Any recommendations or troubleshooting steps would be awesome.
Thanks. Any help is appreciated.
Having trouble with Amahi after 5 months of operation
-
- Posts: 36
- Joined: Sun Feb 23, 2014 8:40 pm
Re: Having trouble with Amahi after 5 months of operation
Check /var/log for log files that might offer some insight. How did you remove the drive, physically or just from /etc/fstab?
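As a rough starting point for the log check suggested above, the sketch below searches a log file for kernel messages that often accompany a failing disk. The function name scan_disk_errors, the pattern list, and the default path /var/log/messages are assumptions for illustration; the log location and message wording vary by distribution and kernel version.

```shell
# Hypothetical helper: print lines in a log file that look like disk trouble.
# The patterns are common kernel messages, not an exhaustive list.
scan_disk_errors() {
    log_file="${1:-/var/log/messages}"   # assumed default; adjust for your system
    grep -iE 'I/O error|medium error|bad sector|ata[0-9]+.*(error|failed)' "$log_file"
}
```

Something like scan_disk_errors /var/log/messages | tail -n 20 would show the most recent matches. An empty result does not prove the disks are healthy; it only means none of these patterns were logged to that file.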
ßîgƒσστ65
Applications Manager
My HDA: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz on MSI board, 16GB RAM, 1TBx1+2TBx2+4TBx2
Re: Having trouble with Amahi after 5 months of operation
When I took it back out, I removed it from fstab, the greyhole pool, and physically.
However, I did find the error in question - it appears one of the other drives has a bad sector that is causing everything to grind to a halt when it's hit. I had to reconnect a monitor and everything - the error only showed up on the local terminal, not in SSH.
I restarted the server, commented out the bad drive in fstab and greyhole, brought back the new drive in fstab and greyhole, physically reconnected the new drive and disconnected the old one, and then I forced fsck on greyhole to get all of the files restored from their backups. Luckily, I had all shares set to create at least one backup copy in greyhole.conf. It's a lot of data, so it's taking a while. Hopefully everything restores successfully.
Is that the right way to handle removal of a failing drive?
Thanks.
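For anyone repeating the fstab step described above, here is a minimal sketch of commenting out one mount entry without editing the file in place. The function name comment_out_mount and the example mount point are hypothetical; the function only prints the modified contents so they can be reviewed before anything is overwritten.

```shell
# Hypothetical helper: print an fstab-style file with the entry for one
# mount point commented out. It does not modify the file itself.
comment_out_mount() {
    fstab_file="$1"     # e.g. /etc/fstab
    mount_point="$2"    # e.g. /var/hda/files/drives/drive3 (hypothetical drive number)
    # Field 2 of an fstab entry is the mount point; skip lines already commented.
    awk -v mp="$mount_point" \
        '$2 == mp && $1 !~ /^#/ { print "#" $0; next } { print }' "$fstab_file"
}
```

A cautious way to use it is to redirect the output to a new file, compare it with the original using diff, and only then copy it over /etc/fstab as root, keeping a backup of the original.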
Re: Having trouble with Amahi after 5 months of operation
For the most part, that sounds about right.
You also have to tell Greyhole a drive is being removed so it can update the database and ensure a copy of any files on the drive are available elsewhere. Did you do that as well?
ßîgƒσστ65
Re: Having trouble with Amahi after 5 months of operation
After removing the drive from the config files and restarting, I ran sudo greyhole --fsck. According to the logs, it looks for N copies of a given file, finds N-1, copies the file to another location, and updates the symlinks. I'm seeing files reappear, so this seems to be doing what you said. Is there a better/safer way to do this?
Re: Having trouble with Amahi after 5 months of operation
When removing a drive from the pool, it's best to run:
Code:
greyhole --going=/var/hda/files/drives/drive#/gh
Then add the new drive as normal.
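The removal command above can be wrapped in a small function that fills in the drive path and defaults to a dry run, which helps avoid pointing it at the wrong drive. This is only a sketch: the function name remove_pool_drive and the DRY_RUN variable are hypothetical, and the gh subdirectory layout is taken from the path shown in this thread.

```shell
# Hypothetical wrapper around the greyhole --going command from this thread.
# By default it only prints the command; set DRY_RUN=0 to actually run it.
remove_pool_drive() {
    drive_dir="$1"    # e.g. /var/hda/files/drives/drive3 (hypothetical drive number)
    pool_path="${drive_dir}/gh"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "greyhole --going=${pool_path}"
    else
        greyhole --going="${pool_path}"
    fi
}
```

Running remove_pool_drive /var/hda/files/drives/drive3 prints the command for review; DRY_RUN=0 remove_pool_drive /var/hda/files/drives/drive3 runs it, typically as root.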
ßîgƒσστ65
Re: Having trouble with Amahi after 5 months of operation
Good to know. Should I run that command after this job is done? Should I stop the current process to run this?
Thanks for the help!
Re: Having trouble with Amahi after 5 months of operation
No, it should be ok.
ßîgƒσστ65