Frustrating networking problem

doogie
Posts: 9
Joined: Thu Mar 03, 2011 9:13 am

Frustrating networking problem

Postby doogie » Fri Nov 04, 2011 12:54 am

Recently I've been getting intermittent serious networking problems with my HDA.
I'm normally alerted to it because my DNS stops working, so browsing from my phone in bed doesn't work - I'm pretty sure it's always been first thing in the morning. These happen once every week or two weeks. I don't remember them happening before I installed my IBM M1015 Raid card a couple of months back.
Normally when this happens my HDA is unpingable and the only way to restore net access is to do one of the following:-
  • Set my laptop to use the router for DNS (gives me internet but can't access HDA)
  • Reset the router (this worked once, but it's a Billion BiPAC 7800N that I've not had many problems with and it's known for its stability)
  • Power cycle the HDA (always works but I'm not keen to do this)
My HDA is normally headless but I have connected a monitor to it and run the network troubleshooting which is posted at pastebin - it fails at step 4.
I have restarted named, but that didn't make any difference.
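
For anyone wanting to pin down when the outage starts, here is a rough sketch of a probe that could be run from another machine on the LAN (for example from cron). It records whether the HDA answers ping and whether it answers DNS queries, the two symptoms described above. The address 192.168.0.10 and the name amahi.org are placeholders, not values from this thread, and it assumes ping and dig are installed on the probing machine.

```shell
# Log whether a server answers ping and DNS queries.
# PING_CMD and DNS_CMD can be overridden (e.g. with "true"/"false") for a dry run.
probe_hda() {
    host="$1"
    # One ICMP echo with a 2 second timeout.
    if ${PING_CMD:-ping -c 1 -W 2} "$host" >/dev/null 2>&1; then
        ping_ok=yes
    else
        ping_ok=no
    fi
    # One DNS query sent directly to the server; dig exits non-zero if it gets no reply.
    if ${DNS_CMD:-dig +time=2 +tries=1 amahi.org} "@$host" >/dev/null 2>&1; then
        dns_ok=yes
    else
        dns_ok=no
    fi
    echo "$(date '+%F %T') host=$host ping=$ping_ok dns=$dns_ok"
}

# Usage (hypothetical address, append to a log so the failure time is visible later):
#   probe_hda 192.168.0.10 >> "$HOME/hda-probe.log"
```

A line with ping=yes dns=no would point at named only, while ping=no dns=no matches the "unpingable" case.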

I'm still running the F12 version of Amahi - I'm not planning on upgrading to F14 until this is either pretty stable or someone says it's an obvious F12 problem.

Any help much appreciated!

bgrablin
Posts: 9
Joined: Thu Oct 13, 2011 9:29 pm

Re: Frustrating networking problem

Postby bgrablin » Fri Nov 04, 2011 1:15 am

What is the output of the following:

Code: Select all

cat /var/log/secure

Code: Select all

cat /var/log/messages
(Only need the last few lines prior to the point of failure)
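
Since only the tail of each log is needed, something like the sketch below avoids dumping the whole file. The helper name last_lines and the default of 50 lines are just illustrative choices.

```shell
# Print the last N lines (default 50) of a log file.
last_lines() {
    file="$1"
    count="${2:-50}"
    tail -n "$count" "$file"
}

# Usage (as root, since both logs are normally readable by root only):
#   last_lines /var/log/secure
#   last_lines /var/log/messages 100
```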

moredruid
Expert
Posts: 791
Joined: Tue Jan 20, 2009 1:33 am
Location: Netherlands

Re: Frustrating networking problem

Postby moredruid » Fri Nov 04, 2011 1:18 am

what happens if you run "service network restart" as root?

BTW: can you post the output of "dmesg" as well?
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D2173656C7572206968616D41snlbxq' | dc
Galileo - HP Proliant ML110 G6 quad core Xeon 2.4GHz, 4GB RAM, 2x750GB RAID1 + 2x1TB RAID1 HDD

doogie
Posts: 9
Joined: Thu Mar 03, 2011 9:13 am

Re: Frustrating networking problem

Postby doogie » Fri Nov 04, 2011 6:52 am

Strangely enough, the ssh I had open from the office is still open, although I can't open a new one :?

/var/log/secure - nothing interesting, I think (I've replaced my real username)

Code: Select all

Nov 3 20:37:18 localhost sshd[12150]: Accepted publickey for myusername from 192.168.0.101 port 50185 ssh2
Nov 3 20:37:18 localhost sshd[12148]: Accepted publickey for myusername from 192.168.0.101 port 54499 ssh2
Nov 3 20:37:18 localhost sshd[12148]: pam_unix(sshd:session): session opened for user myusername by (uid=0)
Nov 3 20:37:18 localhost sshd[12150]: pam_unix(sshd:session): session opened for user myusername by (uid=0)
Nov 3 20:37:42 localhost sshd[12176]: Received disconnect from 192.168.0.101: 11: Closed due to user request.
Nov 3 20:37:42 localhost sshd[12150]: pam_unix(sshd:session): session closed for user myusername
Nov 3 22:36:57 localhost sshd[12148]: pam_unix(sshd:session): session closed for user myusername
Nov 4 07:09:56 localhost pam: gdm-password[4376]: pam_unix(gdm-password:auth): authentication failure; logname= uid=0 euid=0 tty=:0 ruser= rhost= user=myusername
Nov 4 07:10:12 localhost pam: gdm-password[6620]: pam_unix(gdm-password:session): session opened for user myusername by (uid=0)
Nov 4 07:21:19 localhost su: pam_unix(su-l:session): session opened for user root by myusername(uid=500)
Nov 4 09:51:43 localhost sudo: myusername : TTY=pts/0 ; PWD=/home/myusername ; USER=root ; COMMAND=/bin/su -

end of /var/log/messages (I've used the fact that transmission stopped reporting some things around 3:30 to assume it happened then) - some transmission things fudged for my protection and mac addresses and pc names too, :)
http://pastebin.com/HM8PJ4Rq

dmesg output
http://pastebin.com/PS2adk7Q

service network restart

Code: Select all

[root@server ~]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Disabling IPv4 packet forwarding: net.ipv4.ip_forward = 0 [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]

- done from the existing ssh connection, changed nothing - I expected it to drop the connection just after I hit enter.

Thanks guys

moredruid
Expert
Posts: 791
Joined: Tue Jan 20, 2009 1:33 am
Location: Netherlands

Re: Frustrating networking problem

Postby moredruid » Tue Nov 08, 2011 3:49 am

hmm it seems there are hung kernel tasks. These may have to do with flush buffer actions to disk.
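
As a rough sketch, the hung-task warnings can be pulled out of the kernel log with a filter like the one below. It assumes the usual kernel wording ("blocked for more than N seconds"); the dmesg output in this thread is only on pastebin, so the exact lines there may differ. The function name is illustrative.

```shell
# Read kernel log text on stdin and keep only the hung-task warnings.
hung_task_lines() {
    grep -E 'blocked for more than [0-9]+ seconds'
}

# Usage:
#   dmesg | hung_task_lines
```

The task name in each matching line (for example a flush or jbd2 thread) hints at whether disk writeback is what is stalling.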

you might want to run (as root): smartctl -t short /dev/sd<?>
where <?> is each disk you have (ls /dev/sd* will give you the complete list).

This will run a short SMART test (replacing "short" with "long" does what you would expect :geek:)
smartctl -a /dev/sda will display the current statistics for the first disk.
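
The per-disk test above can be looped over all disks. This is only a sketch: the function name and the SMARTCTL override are made up for illustration, and it assumes the disks show up as single-letter /dev/sd? devices.

```shell
# Start a short SMART self-test on every device given as an argument.
# SMARTCTL can be overridden (e.g. SMARTCTL=echo) for a dry run.
start_short_tests() {
    for disk in "$@"; do
        "${SMARTCTL:-smartctl}" -t short "$disk"
    done
}

# Usage (as root); /dev/sd? matches whole disks such as /dev/sda, not partitions:
#   start_short_tests /dev/sd?
# Once the tests have finished, the results for one disk can be read with:
#   smartctl -l selftest /dev/sda
```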

doogie
Posts: 9
Joined: Thu Mar 03, 2011 9:13 am

Re: Frustrating networking problem

Postby doogie » Tue Nov 08, 2011 5:57 am

Thanks.

Nothing obvious showing up (to my eye at least) - the first 6 drives all report passes, the 7th & 8th are on the IBM RAID card that doesn't pass SMART statuses unfortunately - might have to look and see if there's a fix for that or way of enabling it.
