
20010416: LDM failure (Connection reset by peer)




Tom,

One thing to check is how much disk space is available on the file
system where you created your product queue.

Prior to LDM 5.1.2, creating a queue did not physically zero out the
entire memory-mapped file (which is a slow process). As a result, if you
create a 300MB queue with only 200MB available, you will not get an
error. "ls -l" will appear to show the full file size, but the file
system has not allocated the space yet. If the LDM is running when the
queue needs space beyond what is available, your LDM will die. Since you
are running LDM 5.0.8, this may be the problem.
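
One way to check for this condition is to compare the size "ls -l"
reports with the space the file system has actually allocated, and then
see whether the difference still fits in the free space. The following
is only a rough sketch, not part of the LDM: the function names are
made up for illustration, the queue path is whatever your installation
uses, and it assumes a Unix system where st_blocks counts 512-byte
units.

```python
import os

def at_risk(apparent, allocated, free):
    """True if the not-yet-allocated part of the file exceeds free space."""
    unallocated = max(apparent - allocated, 0)
    return unallocated > free

def allocation_report(path):
    """Compare a file's apparent size with what is allocated on disk.

    Hypothetical helper for illustration; 'path' would be the product
    queue file on your system.
    """
    st = os.stat(path)
    vfs = os.statvfs(path)
    apparent = st.st_size              # the size "ls -l" shows
    allocated = st.st_blocks * 512     # assumes 512-byte block units
    free = vfs.f_bavail * vfs.f_frsize # bytes available to non-root users
    return {
        "apparent": apparent,
        "allocated": allocated,
        "unallocated": max(apparent - allocated, 0),
        "free": free,
        "at_risk": at_risk(apparent, allocated, free),
    }

if __name__ == "__main__":
    import sys
    for name, value in allocation_report(sys.argv[1]).items():
        print(name, value)
```

Run it with the path to the queue file as the only argument. If
"at_risk" comes back true, the queue can fail once it grows into the
space it has not been given yet, which matches the 300MB queue on
200MB of disk case described above.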

Steve Chiswell
Unidata User Support



>From: Tom Heinrichs <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200104161906.f3GJ62L10350

>Hello all,
>
>I had the following problem over the weekend and lost my LDM feed:
>
>tail of ldmd.log:
>
>Apr 13 22:36:33 inisas02 pqexpire[26056]: > Recycled  31840.309 kb/hr (
>6573.200 prods per hour)
>Apr 13 22:41:34 inisas02 pqexpire[26056]: > Recycled  31876.624 kb/hr (
>6578.975 prods per hour)
>Apr 13 22:43:58 inisas02 typhoon[26059]: Connection reset by peer
>Apr 13 22:44:28 inisas02 typhoon[26059]: run_requester: 20010413224343.063
>TS_ENDT {{UNIDATA,  ".*"}}
>Apr 13 22:44:34 inisas02 rpc.ldmd[26055]: child 26059 terminated by signal
>11
>Apr 13 22:44:34 inisas02 pqact[26058]: Interrupt
>Apr 13 22:44:34 inisas02 pqbinstats[26057]: Interrupt
>Apr 13 22:44:34 inisas02 pqexpire[26056]: Interrupt
>Apr 13 22:44:34 inisas02 rpc.ldmd[26055]: Interrupt
>Apr 13 22:45:45 inisas02 rpc.ldmd[26055]: Terminating process group
>
>I'm running 5.0.8 at the moment (although I'll be moving back to
>5.1.2 later this week once another issue with endusers is worked
>out).
>
>There is also a core dump in ~ldm dated April 13 22:44Z
>
>As part of my upgrade, I changed ldmd.conf to
>
>request UNIDATA
>        ".*"
>                typhoon.atmos.ucla.edu
>from:
>
>request HDS
>        ".*"
>                typhoon.atmos.ucla.edu
>
>LDM has run virtually flawlessly for the past year since I installed
>it. I've had this failure twice in a week now.
>
>Any ideas?
>
>Thanks,
>Tom
>
>