Periodic loss of DHCP

andyser
Posts: 10
Joined: Thu Feb 05, 2009 6:53 am

Periodic loss of DHCP

Postby andyser » Thu Feb 05, 2009 8:12 am

I've been running Amahi for about 3 months and overall have been very happy with the application. The one issue I've had is that I periodically cannot connect to the Internet from my networked machines. On my XP machines the network icon says "acquiring IP address" (but never does) and trying to repair the connection doesn't work either.

If I simply reboot the Amahi server, I'm back in business. I haven't been able to tie the DHCP loss to any particular cause (sometimes it happens in the morning, sometimes in the evening), and since it has only happened 4 times in the last 3 months, diagnosis has been difficult.

So, my first question is what log file can I look at (or enable) to try and determine what's happening?

Second, and maybe this is related: when I go to the Control Panel tab on Amahi.org, my system status shows "Not updating".

Thanks!

cpg
Administrator
Posts: 2618
Joined: Wed Dec 03, 2008 7:40 am

Re: Periodic loss of DHCP

Postby cpg » Thu Feb 05, 2009 8:45 am

Hmmm.

Amahi is designed for stability, so this is not supposed to happen.

At the time this happens, you could check /var/log/messages and search for any unusual events before or around the time you first started noticing.
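As a rough sketch of that search, the shell function below prints every entry logged during one hour of one day, which makes it easier to read what happened just before the clients stopped getting addresses. The name `log_window` is made up for illustration, and the function assumes the classic syslog layout (month, day, HH:MM:SS, host) for /var/log/messages.

```shell
# log_window is a hypothetical helper, not part of Amahi or syslog.
# Usage: log_window MONTH DAY HOUR FILE   (HOUR as two digits, e.g. 01)
log_window() {
    # Keep lines whose month and day match and whose time starts with HOUR.
    awk -v m="$1" -v d="$2" -v h="$3" \
        '$1 == m && $2 == d && substr($3, 1, 2) == h' "$4"
}

# Example (assumed date and hour): entries from 01:00 to 01:59 on Feb 5
# log_window Feb 5 01 /var/log/messages
```

Piping the result through `less` keeps a busy hour readable.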

The control panel at amahi.org reflects whether the dynamic DNS client is updating or stopped. If your HDA is on, it should show "running"; if there are any connectivity issues, it may show "stopped."

This is unrelated to the internal behavior in the network as it relates to DHCP.

Is it possible that the client cannot connect occasionally?

Say, for instance, because of a weak wifi signal, or the machine somehow hopping onto another, more distant wifi access point?

Other thoughts ...

- Have you tried rebooting the client only, not your HDA?
- Are other machines in the network still working and renewing their leases at the time this happens?
- Is your NIC very old or very new? (i.e. if it's related to your HDA, do you think the driver could be too old or too new?)

Glad to hear it has otherwise been working to your satisfaction! :)
My HDA: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz on MSI board, 8GB RAM, 1TBx2+3TBx1

andyser
Posts: 10
Joined: Thu Feb 05, 2009 6:53 am

Re: Periodic loss of DHCP

Postby andyser » Thu Feb 05, 2009 10:40 am

cpg,

Thanks for the quick feedback.

The client on which I usually first notice connection problems (just because of usage) is our main home PC, which is hardwired. I've tried rebooting it after trying the "IP repair", with no luck. I also have the kids' PC in the same room (also hardwired), and it has the same connectivity issues. And yes, I have fired up the wireless laptop just to try another "route" into the router, but again I get the same connectivity issues. Hence my feeling that the DHCP in the server may be the issue, since rebooting the server gets everybody back up and running.

I'll take a look at /var/log/messages this evening and let you know if anything stands out. I think all the NIC drivers are up to date, but I'll double check.

Finally, even with this little glitch, my hat's off to the Amahi development team. This is a well-thought-out, easy-to-install and easy-to-manage home networking solution. I ran ClarkConnect for several months on another machine and then used Ubuntu as a server for a while, since I also wanted to use the machine for basic Internet access. However, when I changed hardware I was looking for a more straightforward, easy-to-manage option, and Amahi fit the bill - nice work.

moredruid
Expert
Posts: 791
Joined: Tue Jan 20, 2009 1:33 am
Location: Netherlands

Re: Periodic loss of DHCP

Postby moredruid » Thu Feb 05, 2009 11:29 am

hmm, what kind of OS are the clients running?

I've had major headaches with a Windows Vista client and DHCP. Once the machine went into sleep/standby, the lease wouldn't come back (automatic repair, manual ipconfig /renew, tweaking the drivers, changing power management for the NIC etc. all didn't work). I got so fed up with it that I recently installed Windows 7, and this finally works like it's supposed to. I'm pretty sure it was a Windows Vista issue, since my girlfriend's iMac and my dual-boot WinXP/Debian box worked without any issues right from the start.
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D2173656C7572206968616D41snlbxq' | dc
Galileo - HP Proliant ML110 G6 quad core Xeon 2.4GHz, 4GB RAM, 2x750GB RAID1 + 2x1TB RAID1 HDD

cpg
Administrator
Posts: 2618
Joined: Wed Dec 03, 2008 7:40 am

Re: Periodic loss of DHCP

Postby cpg » Thu Feb 05, 2009 1:04 pm

andyser wrote:
... main home PC which is hardwired. I've tried rebooting after trying the "IP repair" with no luck. I also have the kids' PC in the same room (also hardwired) and it will have the same connectivity issues. And yes, I have fired up the wireless laptop just to try another "route" into the router, but again the same connectivity issues, hence my feeling that the DHCP in the server may be the issue, since rebooting the server gets everybody back up and running.

that's pretty compelling evidence that the amahi server is the culprit.

hope the /var/log/messages sheds some light on the issue!

andyser wrote:
Finally, even with this little glitch my hat's off to the Amahi development team. This is a well thought out, easy to install and easy to manage home networking solution. I ran ClarkConnect for several months on another machine and then used Ubuntu as a server for a while since I wanted to also use the machine for basic Internet access. However, when I changed hardware I was looking for a more straightforward, easy to manage option and Amahi fit the bill - nice work.

thanks for my part! it's great to see amahi get so many contributions and ideas from the team and the user community!

andyser
Posts: 10
Joined: Thu Feb 05, 2009 6:53 am

Re: Periodic loss of DHCP

Postby andyser » Sat Feb 07, 2009 2:04 pm

OK, the /var/log/messages file has some interesting entries. Let me start with two of them:

1) The entry below seems to be what started the DHCP problem. At 1:58 AM there was the GLib-GObject... warning, then all entries stop at 2:01. The other entries between 1:58 and 2:01 look like the server may have tried to reboot, so at first I thought of a power failure, but there was no other indication of a power glitch (flashing clocks, etc.).

Feb 5 01:58:52 localhost console-kit-daemon[2010]: GLib-GObject-WARNING: IA__g_object_get_valist: value location for `gchararray' passed as NULL
Feb 5 01:58:53 localhost init: tty4 main process (2705) killed by TERM signal
Feb 5 01:58:53 localhost init: tty5 main process (2706) killed by TERM signal
Feb 5 01:58:53 localhost init: tty2 main process (2707) killed by TERM signal
Feb 5 01:58:53 localhost init: tty3 main process (2708) killed by TERM signal
Feb 5 01:58:53 localhost init: tty1 main process (2709) killed by TERM signal
Feb 5 01:58:53 localhost init: tty6 main process (2710) killed by TERM signal
Feb 5 01:58:53 localhost gconfd (gdm-2826): Exiting
Feb 5 01:58:54 localhost avahi-daemon[2624]: Got SIGTERM, quitting.

.
.
2) The other entry I wanted to ask about is the recurring entry below. Every 30 seconds 'mt-daapd' tries to start and fails. It looks like it is related to the iTunes server, but I never installed that app.

Feb 5 01:50:14 localhost monit[2663]: 'mt-daapd' start: /etc/init.d/mt-daapd
Feb 5 01:50:44 localhost monit[2663]: 'mt-daapd' failed to start
Feb 5 01:50:44 localhost monit[2663]: 'mt-daapd' process is not running
Feb 5 01:50:44 localhost monit[2663]: 'mt-daapd' trying to restart

I received your other message, so I'll get the /var/log/messages file to you.

Thanks

cpg
Administrator
Posts: 2618
Joined: Wed Dec 03, 2008 7:40 am

Re: Periodic loss of DHCP

Postby cpg » Sat Feb 07, 2009 7:45 pm

andyser wrote:
OK, the /var/log/messages file has some interesting entries. Let me start with two of them:

1) The entry below seems to be what started the DHCP problem. At 1:58 AM there was the GLib-GObject... warning, then all entries stop at 2:01. The other entries between 1:58 and 2:01 look like the server may have tried to reboot, so at first I thought of a power failure, but there was no other indication of a power glitch (flashing clocks, etc.).

Feb 5 01:58:52 localhost console-kit-daemon[2010]: GLib-GObject-WARNING: IA__g_object_get_valist: value location for `gchararray' passed as NULL
Feb 5 01:58:53 localhost init: tty4 main process (2705) killed by TERM signal
Feb 5 01:58:53 localhost init: tty5 main process (2706) killed by TERM signal
Feb 5 01:58:53 localhost init: tty2 main process (2707) killed by TERM signal
Feb 5 01:58:53 localhost init: tty3 main process (2708) killed by TERM signal
Feb 5 01:58:53 localhost init: tty1 main process (2709) killed by TERM signal
Feb 5 01:58:53 localhost init: tty6 main process (2710) killed by TERM signal
Feb 5 01:58:53 localhost gconfd (gdm-2826): Exiting
Feb 5 01:58:54 localhost avahi-daemon[2624]: Got SIGTERM, quitting.

.

yes! definitely something initiated some sort of shutdown or something at 1.58.

the line

Feb 5 02:00:18 localhost kernel: imklog 3.20.2, log source = /proc/kmsg started.
indicates that it was actually a reboot. the system restarted around that time. hard to tell why.
looking before 1.58 may help.

is there a UPS, or is the system running on battery, or is there something else that would cause it to shut down and then reboot?

at 02:00:42, the dhcp daemon came up fine.

in the last two lines at 2am ( 02:01:23), notice how it appears that something is shutting down again!!

are there any entries with DHCPREQUESTs or DHCPACKs after 2:01?
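One way to answer that from the log itself, as a sketch only: it assumes the DHCP daemon writes its DHCPREQUEST and DHCPACK lines to /var/log/messages in the usual syslog layout, and `dhcp_after` is a hypothetical helper name.

```shell
# dhcp_after is a hypothetical helper, not an Amahi or dhcpd tool.
# Usage: dhcp_after MONTH DAY HH:MM:SS FILE
dhcp_after() {
    # Print DHCPREQUEST/DHCPACK lines on the given day at or after the given time.
    # Times are compared as strings, which works for zero-padded HH:MM:SS.
    awk -v m="$1" -v d="$2" -v t="$3" \
        '$1 == m && $2 == d && $3 >= t && /DHCP(REQUEST|ACK)/' "$4"
}

# Example: dhcp_after Feb 5 02:01:00 /var/log/messages
```

No output would suggest that no leases were requested or acknowledged after that point, either because the daemon stopped serving or because the machine itself was down.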

if you see lines like this:

kernel: Linux version 2.6.27.9-73.fc9.i686
that means the machine was booting.

is there a chance that some object is leaning on the power button or something??
it would seem like it :)

andyser wrote:
2) The other entry I wanted to ask about is the recurring entry below. Every 30 seconds 'mt-daapd' tries to start and fails. It looks like it is related to the iTunes server, but I never installed that app.

Feb 5 01:50:14 localhost monit[2663]: 'mt-daapd' start: /etc/init.d/mt-daapd
Feb 5 01:50:44 localhost monit[2663]: 'mt-daapd' failed to start
Feb 5 01:50:44 localhost monit[2663]: 'mt-daapd' process is not running
Feb 5 01:50:44 localhost monit[2663]: 'mt-daapd' trying to restart

I received your other message, so I'll get the /var/log/messages file to you.

Thanks

this is a (harmless) race condition.

http://amahi.lighthouseapp.com/projects ... g-pid-file

moredruid
Expert
Posts: 791
Joined: Tue Jan 20, 2009 1:33 am
Location: Netherlands

Re: Periodic loss of DHCP

Postby moredruid » Sun Feb 08, 2009 1:53 am

the problem with this kind of issue is that it's sometimes hard to pinpoint the exact cause.

that said, you may want to check your hardware logs:
root@localhost# dmesg

Besides that, I always have the sysstat package installed. It is a logging facility that gives you load information, and you can read the files it generates with "sar". You may have experienced a very high load, in which case the kernel may semi-randomly kill processes (the processes taking the most CPU will usually be shot down first). From what I've seen in the logs you posted, the getty processes (login facility) and your gdm were shut down.
What may also happen under very high load is that syslogd is killed. Usually it will be restarted by xinetd as soon as enough CPU resources are available, but this can also cause a gap in your logs.

the last thing you should check is if there are a lot of processes running, this can also indicate something is out of the ordinary.
root@localhost# ps -ef | grep -i defunct

will show you all the defunct (zombie) processes, if there are a lot of these you should check them out to see what's causing it.
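
A slightly more targeted variant, as a sketch: zombie processes carry a state code starting with Z in the STAT column of ps, so filtering on that column avoids matching the grep command itself. `only_zombies` is a hypothetical name.

```shell
# only_zombies is a hypothetical helper: it reads `ps` output on stdin and
# keeps the header line plus any process whose state starts with Z (zombie).
only_zombies() {
    awk 'NR == 1 || $1 ~ /^Z/'
}

# Example: state, PID, parent PID and command name for every process
# ps -eo stat,pid,ppid,comm | only_zombies
```

The PPID column shows the parent that has not yet collected each zombie, which is usually the place to start looking.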

andyser
Posts: 10
Joined: Thu Feb 05, 2009 6:53 am

Re: Periodic loss of DHCP

Postby andyser » Mon Feb 09, 2009 4:55 pm

Thanks for all the feedback. My system has been running fine the past few days, so with this feedback I'll be better able to capture info and document if something happens again.

cpg - to address your couple of questions: 1) No, this system isn't on a UPS, and it's actually under a work bench, so it's protected (nothing near the power switch). 2) At 2:01 the /var/log/messages file just stopped, and since the "racing mt-daapd" entries also stopped, I think the machine just froze up. The next entries started at 9:04, when the machine was rebooted.

Lastly, you mentioned that the 'mt-daapd' process failure entries were harmless, but they are taking up a lot of space in the messages file. The messages files get to about 7 MB and then a new one is started, so I have several of these files piling up. Is there a simple way to stop the process from "racing"? Not a huge deal, but it would make it easier to scan the messages file if it were a tad smaller ;)

Thanks

cpg
Administrator
Posts: 2618
Joined: Wed Dec 03, 2008 7:40 am

Re: Periodic loss of DHCP

Postby cpg » Mon Feb 09, 2009 5:05 pm

andyser wrote:
2) At 2:01 the /var/log/messages file just stopped, and since the "racing mt-daapd" entries also stopped, I think the machine just froze up. The next entries started at 9:04, when the machine was rebooted.

i think something actually stopped the machine. these processes don't get sent the TERM signal for no reason.

andyser wrote:
Lastly, you mentioned that the 'mt-daapd' process failure entries were harmless, but they are taking up a lot of space in the messages file. The messages files get to about 7 MB and then a new one is started, so I have several of these files piling up. Is there a simple way to stop the process from "racing"? Not a huge deal, but it would make it easier to scan the messages file if it were a tad smaller ;)

if you do this:

killall mt-daapd
it should let monit restart it on its own and it will hopefully take care of it by properly setting the pid file.
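
In the meantime, if the goal is just to scan the log more easily, the monit lines about mt-daapd can be filtered out when reading it. This is a small sketch with `without_daapd_noise` as a made-up name. It only changes what is displayed and does not touch the log file or stop the restarts.

```shell
# without_daapd_noise is a hypothetical helper.
# Usage: without_daapd_noise FILE
without_daapd_noise() {
    # Drop lines where monit reports on 'mt-daapd'; print everything else.
    grep -v "monit.*'mt-daapd'" "$1"
}

# Example: without_daapd_noise /var/log/messages | less
```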