Server becomes unresponsive and requires hard reboot

dcshoes23
Posts: 5
Joined: Tue Mar 04, 2014 1:20 pm


Post by dcshoes23 » Mon Jun 02, 2014 9:48 am

Hi,

My server has been acting up for the last few months. Sometimes it will stop working 2-3 times in a day, other times it is fine for about a week. Once it is unresponsive, I have to hold the power button down to turn it off and restart. I cannot access it via ssh or any web apps.

When I look at /var/log/messages I see dumps similar to this a lot:


Jun 2 05:27:15 localhost kernel: [59653.018265] general protection fault: 0000 [#4791] SMP
Jun 2 05:27:15 localhost kernel: [59653.018273] Modules linked in: arc4 md4 nls_utf8 tun cifs dns_resolver fscache nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm coretemp kvm_intel kvm r8169 iTCO_wdt iTCO_vendor_support mii microcode ppdev i2c_i801 i2c_core gpio_ich serio_raw snd_page_alloc snd_timer snd soundcore parport_pc parport shpchp lpc_ich mfd_core acpi_cpufreq ata_generic pata_acpi pata_jmicron
Jun 2 05:27:15 localhost kernel: [59653.018307] CPU: 1 PID: 12185 Comm: smbd Tainted: G B D 3.13.11-100.fc19.x86_64 #1
Jun 2 05:27:15 localhost kernel: [59653.018311] Hardware name: Gigabyte Technology Co., Ltd. EP45T-UD3LR/EP45T-UD3LR, BIOS F12e 10/14/2011
Jun 2 05:27:15 localhost kernel: [59653.018315] task: ffff88011f34c500 ti: ffff88000fbb6000 task.ti: ffff88000fbb6000
Jun 2 05:27:15 localhost kernel: [59653.018318] RIP: 0010:[<ffffffff8114a542>] [<ffffffff8114a542>] find_get_page+0x42/0xc0
Jun 2 05:27:15 localhost kernel: [59653.018326] RSP: 0018:ffff88000fbb7b78 EFLAGS: 00010246
Jun 2 05:27:15 localhost kernel: [59653.018328] RAX: 0000000080000000 RBX: ffff88000faa2bf8 RCX: 00000000fffffffa
Jun 2 05:27:15 localhost kernel: [59653.018331] RDX: 0400000000000000 RSI: ffff8800a71eb518 RDI: 0000000000000000
Jun 2 05:27:15 localhost kernel: [59653.018334] RBP: ffff88000fbb7b88 R08: 0400000000000000 R09: ffff8800a71eb308
Jun 2 05:27:15 localhost kernel: [59653.018336] R10: 0000000000000041 R11: ffffea0007eef3c0 R12: 000000000006e67f
Jun 2 05:27:15 localhost kernel: [59653.018339] R13: ffff88000faa2bf0 R14: 000000000006e67f R15: 00000000000000d0
Jun 2 05:27:15 localhost kernel: [59653.018342] FS: 00007f99d51fd840(0000) GS:ffff880207c80000(0000) knlGS:0000000000000000
Jun 2 05:27:15 localhost kernel: [59653.018345] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 2 05:27:15 localhost kernel: [59653.018348] CR2: 00007f0506d5eb20 CR3: 00000000abd71000 CR4: 00000000000007e0
Jun 2 05:27:15 localhost kernel: [59653.018351] Stack:
Jun 2 05:27:15 localhost kernel: [59653.018352] ffff88011e3b7900 000000000006e67f ffff88000fbb7bb0 ffffffff8114a77f
Jun 2 05:27:15 localhost kernel: [59653.018357] ffff88011e3b7900 ffff88000faa2bf0 00000000010200da ffff88000fbb7bf0
Jun 2 05:27:15 localhost kernel: [59653.018362] ffffffff8114b12f 000000006e67f000 ffff88011e3b7900 ffff88000faa2aa0
Jun 2 05:27:15 localhost kernel: [59653.018366] Call Trace:
Jun 2 05:27:15 localhost kernel: [59653.018370] [<ffffffff8114a77f>] find_lock_page+0x1f/0x70
Jun 2 05:27:15 localhost kernel: [59653.018374] [<ffffffff8114b12f>] grab_cache_page_write_begin+0x5f/0xd0
Jun 2 05:27:15 localhost kernel: [59653.018378] [<ffffffff812433e4>] ext4_da_write_begin+0x94/0x2e0
Jun 2 05:27:15 localhost kernel: [59653.018382] [<ffffffff81243eca>] ? ext4_da_write_end+0xba/0x250
Jun 2 05:27:15 localhost kernel: [59653.018385] [<ffffffff8114a3a8>] generic_file_buffered_write+0xf8/0x250
Jun 2 05:27:15 localhost kernel: [59653.018389] [<ffffffff8114bb51>] __generic_file_aio_write+0x1c1/0x3d0
Jun 2 05:27:15 localhost kernel: [59653.018392] [<ffffffff8114bdb8>] generic_file_aio_write+0x58/0xa0
Jun 2 05:27:15 localhost kernel: [59653.018397] [<ffffffff81239519>] ext4_file_write+0x99/0x400
Jun 2 05:27:15 localhost kernel: [59653.018401] [<ffffffff812061a4>] ? posix_test_lock+0x24/0xf0
Jun 2 05:27:15 localhost kernel: [59653.018404] [<ffffffff8120629d>] ? vfs_test_lock+0x2d/0x40
Jun 2 05:27:15 localhost kernel: [59653.018408] [<ffffffff81207d81>] ? fcntl_getlk+0xf1/0x110
Jun 2 05:27:15 localhost kernel: [59653.018412] [<ffffffff811b7c9a>] do_sync_write+0x5a/0x90
Jun 2 05:27:15 localhost kernel: [59653.018416] [<ffffffff811b83f4>] vfs_write+0xb4/0x1f0
Jun 2 05:27:15 localhost kernel: [59653.018419] [<ffffffff811b8fa2>] SyS_pwrite64+0x72/0xb0
Jun 2 05:27:15 localhost kernel: [59653.018423] [<ffffffff81690729>] system_call_fastpath+0x16/0x1b
Jun 2 05:27:15 localhost kernel: [59653.018426] Code: 89 df e8 52 dd 1c 00 48 85 c0 48 89 c6 74 52 48 8b 10 48 85 d2 74 3d f6 c2 03 75 6b 65 8b 04 25 a0 c7 00 00 a9 00 ff 1f 00 75 57 <8b> 4a 1c 85 c9 74 ca 8d 79 01 4c 8d 4a 1c 89 c8 f0 0f b1 7a 1c
Jun 2 05:27:15 localhost kernel: [59653.018455] RIP [<ffffffff8114a542>] find_get_page+0x42/0xc0
Jun 2 05:27:15 localhost kernel: [59653.018459] RSP <ffff88000fbb7b78>
Jun 2 05:27:15 localhost kernel: [59653.018463] ---[ end trace 9bd77a6a70cdff38 ]---

uname -a:
Linux localhost.localdomain 3.14.4-100.fc19.x86_64 #1 SMP Tue May 13 15:00:26 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I have the latest updates via yum update and am using greyhole for my storage pool.

It seems to be an issue with smbd (or maybe file system related?). I am no Linux expert. Where should I go from here?

Thanks,
Matthew

bigfoot65
Project Manager
Posts: 11924
Joined: Mon May 25, 2009 4:31 pm

Re: Server becomes unresponsive and requires hard reboot

Post by bigfoot65 » Mon Jun 02, 2014 10:08 am

Looks like it could be some type of kernel issue. The hard shutdowns don't help either. It could also be a hardware issue, such as drive or motherboard problems.

You might try booting the machine from a Live CD and checking the drives, hardware, etc.
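
Before rebooting into a Live CD, it may also be worth tallying the faults already recorded in /var/log/messages. The sketch below is only a rough illustration in Python: the function name summarize_faults and the regular expressions are my own assumptions, written against the log format shown in the first post, and it only looks for the "general protection fault" string seen there. If the faults turn up under many different processes and kernel functions rather than only smbd, that would lean toward a hardware cause, though it proves nothing on its own.

```python
import re
import sys
from collections import Counter

# Assumed patterns, based only on the log excerpt in the first post.
FAULT_RE = re.compile(r"general protection fault")
COMM_RE = re.compile(r"Comm: (\S+)")
# Matches the "RIP: 0010:[<addr>] [<addr>] function+0x..." line,
# not the later "RIP [<addr>] ..." summary line (no colon there).
RIP_RE = re.compile(r"RIP: \S+\s+\[<[0-9a-f]+>\] ([\w.]+)\+0x")


def summarize_faults(lines):
    """Count faults, and tally the process and faulting function of each dump."""
    faults = 0
    comms = Counter()
    rips = Counter()
    for line in lines:
        if "kernel:" not in line:
            continue  # skip non-kernel syslog entries
        if FAULT_RE.search(line):
            faults += 1
        match = COMM_RE.search(line)
        if match:
            comms[match.group(1)] += 1
        match = RIP_RE.search(line)
        if match:
            rips[match.group(1)] += 1
    return faults, comms, rips


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python fault_summary.py /var/log/messages")
    else:
        with open(sys.argv[1], errors="replace") as log:
            faults, comms, rips = summarize_faults(log)
        print("general protection faults:", faults)
        print("by process:", comms.most_common(10))
        print("by function:", rips.most_common(10))
```

Run it as root (or with sudo) so it can read the log, passing the path as the only argument. The file name fault_summary.py is just a placeholder.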
ßîgƒσστ65
Applications Manager

My HDA: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz on MSI board, 16GB RAM, 1TBx1+2TBx2+4TBx2
