
[LDM #XDI-772585]: Problems hanging our server possibly related to LDM?



Greg,

Can we have an account on the machine?

mike

> Mike,
> ingest is running Red Hat Linux. cron was an early suspect but
> nothing is running in any cron jobs at :45. We just keep going back to
> the fact that if we run "ldmadmin stop" the problem stops. The fact that
> no error messages appear in either the system logs or the ldm logs makes
> it very hard to trace the source of the problem. Unfortunately we can't
> systematically shut down ldm activities to try and isolate the problem,
> which is what I'd like to try next. We are supporting 2 projects for the
> next month or so and we are feeding some data downstream to others
> through ldm.
> 
> Greg
> 
> Unidata LDM Support wrote:
> > Greg,
> >
> > What type of OS is running on ingest?  My first suspect for a problem
> > occurring at :45 after is cron.  Assuming linux, look in /etc/*cron*,
> > probably mrtg or the like...
> >
> > mike
> >
> >> Greg,
> >>
> >>> Our problems continue with LDM and our server hanging. Today the
> >>> load reached 6.0 and ssh logins weren't working. ldm services had the
> >>> top 3 in the "TOP" command list.
> >> What was the CPU usage of the LDM processes?
> >>
> >>> We rebooted the machine. For odd
> >>> reasons, ldm is also not auto restarting when the machine comes up - I
> >>> had to delete the ldmd.pid file and the queue before I could restart.
> >> That makes sense.  If the file "ldmd.pid" exists, then another LDM
> >> is likely running.  An "ldmadmin clean" removes this file.
> >>
> >>> The last item in the ldmd.log file before the problem was:
> >>>
> >>> Oct 22 22:17:17 ingest usgodae3.fnmoc.navy.mil[20200] WARN: Future
> >>> product from "usgodae3.fnmoc.navy.mil".  Fix local or ingest clock.
> >>> 73413 20071022221801.297   FNMOC 000
> >>> US058GMET-GR1mdl.0058_0240_00000F0OF2007102218_0105_000100-000000wnd_ucmp
> >> Either your clock is slow or the clock on usgodae3 is fast, or both.
> >> In any case, they should be fixed.
> >>
> >>> Any ideas how ldm could get so busy? The problem occurred at ~1645 this
> >>> afternoon. When this problem occurs it always seems to be at :45. . .
> >> The only thing I can think of is that a bunch of data arrives at that
> >> time.  Is a bunch of data being created on usgodae3 at that time?
> >>
> >>> Greg
> 
> >> Regards,
> >> Steve Emmerson
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: XDI-772585
> > Department: Support LDM
> > Priority: Normal
> > Status: Closed
> >
> 
> 
> --
> 
> ~~N~A~T~I~O~N~A~L~~C~E~N~T~E~R~~F~O~R~~A~T~M~O~S~.~~R~E~S~E~A~R~C~H
> Greg Stossmeister                      e-mail: address@hidden
> NCAR/EOL                               phone: (303)497-8692
> P.O. Box 3000                          web: http://www.eol.ucar.edu
> Boulder, CO 80307-3000
> ~~~~~~~~E~A~R~T~H~~~O~B~S~E~R~V~I~N~G~~~L~A~B~O~R~A~T~O~R~Y~~~~~~~~
> 
> 
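
For reference, the size of the clock disagreement behind the "Future
product" warning quoted above can be estimated from the log line itself.
The following is only a minimal sketch. It assumes that the field
"20071022221801.297" is the product's creation time in YYYYMMDDhhmmss.sss
form, that it and the syslog-style stamp "Oct 22 22:17:17" are in the same
time zone, and that month abbreviations are parsed in an English locale.
The function name future_offset_seconds is hypothetical and is not part of
the LDM.

```python
from datetime import datetime


def future_offset_seconds(log_stamp: str, product_stamp: str, year: int) -> float:
    """Return how many seconds the product time lies ahead of the log time.

    Assumption: both stamps are expressed in the same time zone.
    """
    # Syslog-style stamps carry no year, so it is supplied separately.
    logged = datetime.strptime(f"{year} {log_stamp}", "%Y %b %d %H:%M:%S")
    # Assumed layout of the product time field: YYYYMMDDhhmmss.sss
    created = datetime.strptime(product_stamp, "%Y%m%d%H%M%S.%f")
    return (created - logged).total_seconds()


# Values copied from the quoted ldmd.log entry.
offset = future_offset_seconds("Oct 22 22:17:17", "20071022221801.297", 2007)
print(f"product is {offset:.3f} s ahead of the local log time")
```

Under those assumptions the product is stamped roughly 44 seconds ahead of
the time at which ingest logged it, which is consistent with one or both
clocks needing correction.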

