
[THREDDS #MRY-727013]: My LDM seems to have missed a model hour?



Hi Brian,

re:
> Not sure who support goes to, or who to ask.

I moved your email into our inquiry tracking system so that a number
of Unidata folks can see it and respond when appropriate.

re:
> Why is weather missing whole GFS model hours sometimes?  Is there a
> checklist somewhere of what an LDM admin should do now and then, or
> anytime there is a problem?

The afternoon before last (Tuesday), I happened to notice that the
Unidata-Wisconsin (IDD feedtype UNIWISC aka MCIDAS) imagery was
not being updated on weather.rsmas.miami.edu (I routinely load
images in McIDAS-X/IDV from the ADDE server that runs on weather).
After verifying that there was no problem with the image generation
(currently being done on a virtual machine in the Amazon EC2 West
cloud, but soon to be switching to a virtual machine instance in
the Microsoft Azure cloud), I logged onto weather as 'ldm' and started
poking around.  I found that virtually no data was being received
from the 4 active ~ldm/etc/ldmd.conf REQUESTs to the top level IDD
relay that we run.  This was the cause of the data outage that you
are referring to above.
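
As a rough illustration of the check described above, the sketch
below lists the active (uncommented) REQUEST entries in an ldmd.conf
file.  It is not an LDM tool: the function name active_requests and
the sample host names are hypothetical, and it assumes the usual
'REQUEST feedtype pattern host' layout with '#' starting a comment.

```python
import re

# Matches an uncommented line of the form: REQUEST <feedtype> <pattern> <host>
# (assumed layout; the pattern may be double-quoted)
REQUEST_RE = re.compile(
    r'^\s*REQUEST\s+(\S+)\s+("[^"]*"|\S+)\s+(\S+)',
    re.IGNORECASE,
)

def active_requests(conf_text):
    """Return (feedtype, pattern, host) for each uncommented REQUEST line."""
    entries = []
    for line in conf_text.splitlines():
        match = REQUEST_RE.match(line)
        if match is None:
            # Comment lines start with '#', so they never match here;
            # other entry types (EXEC, ALLOW, ...) are skipped too.
            continue
        feedtype, pattern, host = match.groups()
        entries.append((feedtype, pattern.strip('"'), host))
    return entries

# Hypothetical sample content, not the real configuration on weather.
sample = '''
# REQUEST ANY ".*" old-relay.example.edu
REQUEST UNIWISC ".*" idd.example.edu
REQUEST CONDUIT "gfs" idd.example.edu
EXEC "pqact"
'''

for feedtype, pattern, host in active_requests(sample):
    print(feedtype, pattern, host)
```

A listing like this only shows which feeds are being asked for; it
does not show whether data is actually arriving on them.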

A restart of the LDM on weather.rsmas.miami.edu did _not_ clear
the data receipt problem, so I switched my investigations to the
IDD relay cluster backend machine which was servicing the feed
REQUESTs.  The LDM there was also running and not reporting any errors.
In order to see if the existing feed REQUESTs had become "stale"
(no data flowing for some unknown reason), I restarted the LDM
on the IDD relay cluster backend machine.  This seemed to
successfully restore data flow to weather... progress.  I
then fired off an email to Steve Emmerson (LDM developer) and
Mike Schmidt (head Unidata system administrator) to see if either
of them could shed some light on what could have caused the data
flow problem.

Yesterday I met with Steve and Mike Schmidt to continue troubleshooting
the incident.  Steve is looking at the problem now to see if there is
anything that can be gleaned from LDM and system log files both on
weather and on the particular relay cluster backend machine.

re:
> I'll come across town sometime. I sit at Mesa lab this summer.

I would like to meet with you at some point in the not too distant
future to go over the current software configurations on weather
and to map out a plan for the items that were scheduled to have
been done but weren't, and for where to go from here.  I asked
Doug yesterday afternoon if he had been in touch with you recently,
and to relay to you my desire to get together sometime for this talk.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: MRY-727013
Department: Support Datastream
Priority: Normal
Status: Closed