Random server freeze

stueyboy
Posts: 101
Joined: Tue Mar 09, 2010 2:54 am

Random server freeze

Postby stueyboy » Sat Jul 24, 2010 1:29 pm

Looking for some help.

My system has been crashing randomly, and I thought this was due to a bad disk sector. I deleted the partitions, reformatted, and did a complete system rebuild, but I am still getting random crashes.

Does anyone have any clues on where to start diagnosing what is going on? I suspect a hard disk hardware error, but if some new update might be causing this, it would be good to rule that out first.

Ta

gboudreau
Posts: 606
Joined: Sat Jan 23, 2010 1:15 pm
Location: Montréal, Canada

Re: Random server freeze

Postby gboudreau » Sat Jul 24, 2010 3:27 pm

Next time it freezes, note down the exact time, reboot, then immediately look in /var/log/messages.
Check for any messages logged just before your reboot.
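
If scrolling through the log by hand gets tedious, something like the following might help. It is only a rough sketch: it assumes the classic syslog line format ("Aug 11 13:23:54 host ..."), an English locale for the month names, and a 30-minute window, which is an arbitrary choice. The function name lines_before is made up for this example.

```python
from datetime import datetime, timedelta


def lines_before(log_lines, freeze_time, minutes=30, year=None):
    """Return syslog lines stamped within `minutes` before `freeze_time`.

    Assumes each line starts with a classic syslog timestamp such as
    "Aug 11 13:23:54". That format carries no year, so the year of
    `freeze_time` is used unless `year` is given.
    """
    year = year or freeze_time.year
    start = freeze_time - timedelta(minutes=minutes)
    selected = []
    for line in log_lines:
        try:
            # The first 15 characters hold "Mon DD HH:MM:SS".
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # no leading timestamp, skip the line
        if start <= stamp <= freeze_time:
            selected.append(line.rstrip("\n"))
    return selected


# Example use (hypothetical freeze time; reading the log usually needs root):
#
#   with open("/var/log/messages", errors="replace") as f:
#       for line in lines_before(f, datetime(2010, 8, 11, 13, 40)):
#           print(line)
```

Lines without a timestamp, such as continuation lines, are skipped, so the output may leave out part of a multi-line message.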
- Guillaume Boudreau

stueyboy
Posts: 101
Joined: Tue Mar 09, 2010 2:54 am

Re: Random server freeze

Postby stueyboy » Sun Jul 25, 2010 1:25 am

Thanks for that. I'll have a look next time.

moredruid
Expert
Posts: 791
Joined: Tue Jan 20, 2009 1:33 am
Location: Netherlands

Re: Random server freeze

Postby moredruid » Sun Jul 25, 2010 4:48 am

dmesg is a command that prints the kernel log, which includes hardware messages. You might see some interesting stuff there (memory errors, disk sector errors) even now.

smartctl -a /dev/<suspect harddisk> might give you a clue too if the disk has errors; you can also run self-tests with that command.

usually it's corrupt memory.

most distro disks incorporate a memtest, you might want to run that if your HDD is OK
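
To avoid eyeballing the whole dmesg output, a small filter along these lines could pull out the lines worth a closer look. This is just a sketch: the pattern list is an illustrative assumption and not a complete catalogue of kernel error messages, and the names SUSPECT_PATTERNS, suspicious_lines and read_dmesg are made up for this example.

```python
import re
import subprocess

# Illustrative patterns only (an assumption); extend them for your own
# hardware and kernel version.
SUSPECT_PATTERNS = [
    r"i/o error",
    r"medium error",
    r"machine check",
    r"\bata\d+(?:\.\d+)?: .*(?:error|failed)",
    r"out of memory",
]

_SUSPECT_RE = re.compile("|".join(SUSPECT_PATTERNS), re.IGNORECASE)


def suspicious_lines(dmesg_text):
    """Return the lines of `dmesg_text` that match any suspect pattern."""
    return [line for line in dmesg_text.splitlines() if _SUSPECT_RE.search(line)]


def read_dmesg():
    """Run `dmesg` and return its output as text (may require root)."""
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout


# Example use:
#
#   for line in suspicious_lines(read_dmesg()):
#       print(line)
```

An empty result does not prove the hardware is healthy; it only means none of these particular patterns showed up.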
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D2173656C7572206968616D41snlbxq' | dc
Galileo - HP Proliant ML110 G6 quad core Xeon 2.4GHz, 4GB RAM, 2x750GB RAID1 + 2x1TB RAID1 HDD

stueyboy
Posts: 101
Joined: Tue Mar 09, 2010 2:54 am

Re: Random server freeze

Postby stueyboy » Sun Jul 25, 2010 11:48 am

Thanks guys for the help. At the moment, all is well with the new install. I have stopped all updates on the Fedora system, but I think I might have had a corrupt partition. My first attempt at a clean reinstall over the top of the old one failed, so I booted from an Ubuntu live disk, wiped the partition tables, created new ones, and installed again. All is going OK so far.

The old install was giving me all sorts of disk errors on the F2 setup screens. We did get the electricity switched off while I was at work a few weeks ago and I forgot to switch the system off during that time so I suspect that might have contributed to the unstable system.

stueyboy
Posts: 101
Joined: Tue Mar 09, 2010 2:54 am

Re: Random server freeze

Postby stueyboy » Mon Aug 09, 2010 1:01 am

Bit of an update.....


...Instabilities in the system continued after my rebuild to such an extent that yesterday upon a freeze and reboot, the BIOS didn't recognise the disks at all. Swapped the HD cable out and it all seems to be much better now. Gave me a chance to set up greyhole as well which seems to be working nicely.

oldcyberdude
Posts: 25
Joined: Thu Jan 22, 2009 6:15 am

Re: Random server freeze

Postby oldcyberdude » Wed Aug 11, 2010 11:23 am

I have also been getting random freezes. The first evidence is usually DNS failures on my networked systems.

I recently changed all of my hardware (except for the RAID disks containing hda), thinking the problem was in the old system's hardware, but my new system is showing the same symptoms. Below are the last entries from /var/log/messages prior to the freeze. All previous messages (all morning) are quite similar, i.e. DHCP. At 13:43 I start seeing my reboot messages.

Aug 11 13:23:54 tigger nmbd[2146]: [2010/08/11 13:23:54, 0] nmbd/nmbd_browsesync.c:350(find_domain_master_name_query_fail)
Aug 11 13:23:54 tigger nmbd[2146]: find_domain_master_name_query_fail:
Aug 11 13:23:54 tigger nmbd[2146]: Unable to find the Domain Master Browser name HOME<1b> for the workgroup HOME.
Aug 11 13:23:54 tigger nmbd[2146]: Unable to sync browse lists in this workgroup.
Aug 11 13:29:50 tigger dhcpd: Wrote 0 deleted host decls to leases file.
Aug 11 13:29:50 tigger dhcpd: Wrote 0 new dynamic host decls to leases file.
Aug 11 13:29:50 tigger dhcpd: Wrote 8 leases to leases file.
Aug 11 13:29:50 tigger dhcpd: DHCPREQUEST for 192.168.1.103 from 00:1f:bc:03:62:05 (kanga) via eth0
Aug 11 13:29:50 tigger dhcpd: DHCPACK on 192.168.1.103 to 00:1f:bc:03:62:05 (kanga) via eth0

dmesg doesn't seem to report anything unusual

nothing from smartctl

I really don't think memtest would show anything since memory was changed with the system.

Tough one to troubleshoot, since the system runs for hours and sometimes days.

gjc1000
Pro User
Posts: 133
Joined: Sat Jan 03, 2009 8:30 am

Re: Random server freeze

Postby gjc1000 » Thu Aug 12, 2010 9:36 pm

I had a problem like that once. I had just built a system with all new parts and could not figure out why it was freezing randomly. I figured it had to be a software issue, because in the back of my mind I kept saying to myself: it's new hardware, that can't be what's wrong. Alas, when I got around to troubleshooting the hardware side, I ran a memtest, and sure enough, I had a bad memory stick. I swapped the bad stick for a new one, and voilà, everything was as it should be and continues to be.
Try memtest, just for giggles : )
gjc1000
Chi pecora si fa, il lupo se la mangia.
