
Re: Top level CONDUIT relay



Pete et al.,

I have noticed 2 particular problems:

1) ldm2.woc.noaa.gov drops the connection and reconnects every 5 minutes,
exactly on the minutes ending in 0 and 5.

2) ldm1 is running significantly slower than ldm2
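
For anyone who wants to check the 5-minute pattern against their own
downstream log, here is a minimal sketch. It assumes syslog-style lines
like the ones quoted further down in this thread (rejoined onto single
lines), an English locale for the month abbreviation, and a year supplied
by hand since the log lines do not carry one. The function names are
hypothetical helpers, not part of the LDM.

```python
import re
from datetime import datetime

# Disconnect messages copied from the log excerpt quoted later in this
# thread, with the wrapped lines rejoined.
LOG_LINES = [
    "Jun 19 18:50:10 daffy ldm2.woc.noaa.gov[10740] ERROR: Disconnecting due to "
    "LDM failure; Connection to upstream LDM closed",
    "Jun 19 18:55:09 daffy ldm2.woc.noaa.gov[10740] ERROR: Disconnecting due to "
    "LDM failure; Connection to upstream LDM closed",
    "Jun 19 19:00:09 daffy ldm2.woc.noaa.gov[10740] ERROR: Disconnecting due to "
    "LDM failure; Connection to upstream LDM closed",
]

# Leading syslog timestamp, e.g. "Jun 19 18:50:10"
TIMESTAMP = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) ")


def disconnect_times(lines, year=2007):
    """Return the timestamps of all disconnect messages, in input order."""
    times = []
    for line in lines:
        if "Disconnecting due to" not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            # The year is an assumption: syslog lines do not include it.
            stamp = f"{year} {match.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def intervals_seconds(times):
    """Return the gap in whole seconds between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print(intervals_seconds(disconnect_times(LOG_LINES)))
```

Gaps that stay within a second or two of 300 would be consistent with
something scheduled, such as a cron job, rather than a random network drop.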

I added redundant request lines to both ldm1 and ldm2 here at Unidata
as of 18Z, and find that virtually all of the data (except the unique
status files generated by ldm1) is coming from ldm2:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?CONDUIT+daffy.unidata.ucar.edu

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_vol_nc?CONDUIT+daffy.unidata.ucar.edu

For the time being, it appears that ldm2 is the preferable route,
but I can't guarantee that the stop/restart of that connection every
5 minutes won't eventually cause the feeds to fall behind there as well.

Justin,
do you know what is triggering the reconnects from my downstream LDM to
ldm2 every 5 minutes?

Steve

------

Steve Chiswell
Unidata User Support



On Tue, 2007-06-19 at 15:43 -0500, Pete Pokrandt wrote:
> As Steve mentioned, feeding CONDUIT from ldm1 helped for a little while, 
> but now it's back to big (>2000 seconds) lags and incomplete files again.
> 
> Pete
> 
> 
> Steve Chiswell wrote:
> > Chi,
> >
> > It seems that the connection to ldm2 breaks every 5 minutes on the 0 &
> > 5's
> > Jun 19 18:50:10 daffy ldm2.woc.noaa.gov[10740] ERROR: Disconnecting due to 
> > LDM failure; Connection to upstream LDM closed 
> > Jun 19 18:55:09 daffy ldm2.woc.noaa.gov[10740] ERROR: Disconnecting due to 
> > LDM failure; Connection to upstream LDM closed 
> > Jun 19 19:00:09 daffy ldm2.woc.noaa.gov[10740] ERROR: Disconnecting due to 
> > LDM failure; Connection to upstream LDM closed 
> >
> > This may be related to a cron I'm not familiar with that Justin
> > mentioned restarting the LDM for some reason.
> >
> > Steve Chiswell
> > Unidata User Support
> >
> > On Tue, 2007-06-19 at 14:49 -0400, Chi.Y.Kang wrote:
> >   
> >> Huh, i thought you guys were on the system.  let me take a look on ldm2
> >> and see what is going on.
> >>
> >>
> >> Justin Cooke wrote:
> >>     
> >>> Chi.Y.Kang wrote:
> >>>       
> >>>> Steve Chiswell wrote:
> >>>>  
> >>>>         
> >>>>> Pete and David,
> >>>>>
> >>>>> I changed the CONDUIT request lines at NSF and Unidata to request data
> >>>>> from ldm1.woc.noaa.gov rather than ncepldm.woc.noaa.gov after seeing
> >>>>> lots of
> >>>>> disconnect/reconnects to the ncepldm virtual name.
> >>>>>
> >>>>> The LDM appears to have caught up here as an interim solution.
> >>>>>
> >>>>> Still don't know the cause of the problem.
> >>>>>
> >>>>> Steve
> >>>>>       
> >>>>>           
> >>>> I know the NCEP was stop and starting the LDM service on the ldm2 box
> >>>> where the VIp address is pointed to at this time.  how is the current
> >>>> connection to LDM1?  is the speed of the conduit feed acceptable?
> >>>>   
> >>>>         
> >>> Chi, NCEP has not restarted the LDM on ldm2 at all today. But looking
> >>> at the logs it appears to be dying and getting restarted by cron.
> >>>
> >>> I will watch and see if I see anything.
> >>>
> >>> Justin
> >>>       
> >>     
> 
> 
-- 
Steve Chiswell <address@hidden>
Unidata