
Re: problem with CONDUIT ETA/GFS since 4 Jan 2005



David et al.,

Thank you for providing your feedback on how CONDUIT can best serve the
community.

The recent change in the queueing mechanism used to supply data to the LDM
host for CONDUIT increased the feed's maximum total throughput from
1.8-2.0 GB per hour to 2.5-2.7 GB per hour during the period of maximum
data availability, that is, the times when RUC, NAM, GFS, and ensemble data
are all posting simultaneously. The largest of these files are generally the
20KM RUC bgrib files, which run approximately 53MB each. With the addition
of the .5 degree GFS, several users remarked that the additional files were
causing data they had previously received promptly to slip in posting. That
prompted the change to parallel insertion, which lets newly posted data
begin inserting without waiting for other data to finish first. The
concurrency of different model runs and forecast times that you are now
seeing is the result; a rough sketch of the effect follows.
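
The sketch below is illustrative only: the model names, file counts, and
timings are hypothetical, not actual CONDUIT parameters. It shows how serial
insertion keeps forecast hours strictly ordered while parallel insertion
interleaves hours from concurrently posting models.

    # toy_insertion.py -- illustrate serial vs. parallel insertion ordering.
    # All model names, file counts, and timings are hypothetical.
    import threading, time, random

    log = []
    lock = threading.Lock()

    def insert_stream(model, hours):
        """Insert one model's files in posting order."""
        for fh in hours:
            time.sleep(random.uniform(0.001, 0.005))  # simulated insert time
            with lock:
                log.append("%s F%03d" % (model, fh))

    models = {"GFS": range(0, 85, 3), "NAM": range(0, 85, 3)}

    # Serial: one stream at a time; forecast hours arrive strictly in order,
    # but the second model waits for the first to finish completely.
    for m, hrs in models.items():
        insert_stream(m, hrs)
    print("serial order: ", log[:4], "...")

    log[:] = []
    # Parallel: one thread per stream; hours from the two models interleave,
    # and each model's early hours now share bandwidth with the other's.
    threads = [threading.Thread(target=insert_stream, args=(m, h))
               for m, h in models.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("parallel order:", log[:8], "...")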

As a blend of these two methods, I have also prepared to insert files as a
separate stream per model, but serially, in the order of file posting, within
each model's output. The result should be a return to the ordering of
forecast hours that you saw previously, with interleaving only at peak times
when multiple models are posting; a sketch of this scheme follows.
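
Again a hypothetical illustration rather than the actual insertion code: one
worker per model drains a per-model FIFO, so forecast hours stay serialized
within a model while different models interleave with each other.

    # toy_blend.py -- one serialized insertion worker per model.
    # Model names and timings are made up for illustration.
    import queue, threading, time

    def model_worker(name, fifo, log, lock):
        while True:
            fh = fifo.get()
            if fh is None:              # sentinel: this model's run is done
                break
            time.sleep(0.002)           # simulated insertion time
            with lock:
                log.append("%s F%03d" % (name, fh))

    log, lock = [], threading.Lock()
    fifos = {"GFS": queue.Queue(), "RUC": queue.Queue()}
    workers = [threading.Thread(target=model_worker, args=(m, q, log, lock))
               for m, q in fifos.items()]
    for w in workers:
        w.start()

    # Files are enqueued in posting order, so each model's forecast hours
    # come out serialized even though the two models insert concurrently.
    for fh in range(0, 13, 3):
        fifos["GFS"].put(fh)
        fifos["RUC"].put(fh)
    for q in fifos.values():
        q.put(None)
    for w in workers:
        w.join()
    print(log)    # e.g. ['GFS F000', 'RUC F000', 'GFS F003', ...]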

I will notify the list when this change in strategy is in place.

Steve Chiswell
Unidata User Support





On Mon, 10 Jan 2005, David Ovens wrote:

> All,
>
> I am in agreement with Pete about the problematic aspects of this
> change.  We, too, rely heavily on timely arrival of the first few
> hours of the GFS forecast to fire up our real-time MM5 runs.  I have
> successfully modified pre-processing to make optimal use of these
> grids arriving in serial fashion -- not only to initialize the model
> but to continue to provide 3-hour boundary conditions for the model
> throughout its 72-hour forecast.  I have noted significant CONDUIT
> delays since Jan 5 and have had to manually start/stop/reboot some ftp
> retrieval scripts.  These backup scripts which ftp the grids from
> tgftp.nws.noaa.gov and ftpprd.ncep.noaa.gov do actually work in a
> serial/parallel fashion, but their performance has degraded and
> continues to degrade with time -- probably due to ever-increasing load
> on those servers.
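
The kind of serial consumption David describes can be sketched roughly as
follows; the directory, file names, and polling interval are hypothetical,
not his actual scripts.

    # toy_feed.py -- consume forecast-hour files strictly in order as they
    # arrive, so pre-processing (and the model start) need not wait for the
    # whole run.  Paths and names are made up for illustration.
    import os, time

    GRID_DIR = "/data/conduit/gfs"                      # hypothetical spool
    NEEDED = ["fh.%04d" % h for h in range(0, 75, 3)]   # F000..F072, 3-hourly

    for tag in NEEDED:
        path = os.path.join(GRID_DIR, tag + ".grib")
        while not os.path.exists(path):                 # wait for this hour
            time.sleep(10)
        print("feeding %s to the pre-processor" % tag)  # e.g. MM5 preproc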
>
> CONDUIT, until this most recent fundamental change, has provided the
> most reliable and fastest method of initializing our runs.  I strongly
> support a return to the serialized version.
>
> Sincerely,
>
> David Ovens
> --
> David Ovens            e-mail: address@hidden
> Research Meteorologist    phone: (206) 685-8108
> Dept of Atm. Sciences      plan: Real-time MM5 forecasting for the
> Box 351640                        Pacific Northwest
> University of Washington          http://www.atmos.washington.edu/mm5rt
> Seattle, WA  98195               Weather Graphics and Loops
>                                   http://www.atmos.washington.edu/~ovens/loops
>
>
> On Mon, Jan 10, 2005 at 03:13:47PM -0600, Pete Pokrandt wrote:
> >
> > All,
> >
> > Apparently these emails were not distributed to the entire conduit list.
> >
> > Here's what's going on.
> >
> > In a Jan 5 email from Steve Chiswell to Jerry, me, and support-conduit:
> >
> >  >Jerry,
> >  >
> >  >Yesterday at 1830Z we implemented a parallel queueing scheme at the
> >  >NWS that we hope will improve the timeliness of data being injected into
> >  >the CONDUIT data stream. Any feedback you can provide on how
> >  >this affects your reception would be greatly appreciated.
> >  >
> >  >Since data will be inserted in parallel, you will notice that multiple
> >  >model runs and forecast times will probably be interspersed where
> >  >previously they had been serialized.
> >  >
> >  >I watched the 00Z GFS last night; the posting gap between f084 and
> >  >f132 was matched on the FTP server at 0422Z, and later, at 0509Z, the
> >  >other grids were posted to the NWS servers, so all appears to be
> >  >behaving correctly on this end.
> >  >
> >  >Steve Chiswell
> >  >Unidata User Support
> >  >
> >
> > An email from me to Steve last Friday (Jan 7):
> >
> > > Steve,
> > >
> > > Since the parallel queueing scheme was implemented, I have noticed two
> > > things, both are problematic.
> > >
> > > 1) The lag goes way up since so much more data is being inserted into
> > > the queue in a shorter amount of time.  I was getting lag times of
> > > 3000-4000 seconds in peak GFS times.  I moved my LDM queue on f5 to a
> > > faster disk, and that helped, but it's still getting up to 300-400
> > > seconds.
> > >
> > > 2) The biggest problem I see is that now it takes MUCH longer for the
> > > early hours of a forecast run to complete.  Most of my model plotting
> > > scripts and all of our real-time mesoscale modelling efforts here
> > > take advantage of the fact that an entire model run doesn't need to be
> > > complete before we can start.  Since the parallel queueing was enabled,
> > > the 00 hour forecast of the ETA model, for example, takes over an hour
> > > to get here, where previously it was taking maybe 5 minutes.
> > >
> > > If this is how it's going to be, I'd much prefer the old serial ingest,
> > > or maybe some kind of blend, so the first 12 hours or so (or 48 or
> > > whatever) of a model run can get priority.
> > >
> > > It really hurts us to have the 00 hour forecast complete around the
> > > same time as the later forecasts, even if the entire model run gets
> > > in faster as a result.
> > >
> > > For what it's worth.
> > >
> > > Pete
> > >
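
A minimal sketch of the kind of blend Pete suggests above, with the early
forecast hours of every run jumping the insertion line. The cutoff, model
names, and queue structure are hypothetical; this is not LDM or NWS code.

    # toy_priority.py -- insertion order when the first hours of every run
    # get priority over later hours.  Cutoff and model names are made up.
    import heapq

    PRIORITY_CUTOFF = 12        # forecast hours <= this jump the line

    def enqueue(pq, seq, model, fhour):
        # Tier 0 for early hours, tier 1 for the rest; seq keeps posting
        # order stable within a tier.
        tier = 0 if fhour <= PRIORITY_CUTOFF else 1
        heapq.heappush(pq, (tier, seq, model, fhour))

    pq, seq = [], 0
    for fh in range(0, 85, 3):              # a full GFS-like run posts...
        enqueue(pq, seq, "GFS", fh)
        seq += 1
    enqueue(pq, seq, "ETA", 0)              # ...then an ETA run's F000 lands

    order = [heapq.heappop(pq) for _ in range(len(pq))]
    print(["%s F%03d" % (m, h) for _, _, m, h in order[:8]])
    # Early GFS hours and the new ETA F000 come out ahead of GFS F015+.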
> >
> > Steve's reply on Friday [again to me, Jerry and support-conduit]
> >
> >  >Pete,
> >  >
> >  >In pushing the queueing lag from the NWS computer queueing the data
> >  >to the NWS computer doing the data insertion, some of the latency is
> >  >now in the LDM queue, since the volume you are receiving is compressed
> >  >into a shorter time window, whereas before it was all in the insertion.
> >  >
> >  >I agree that the parallel insertion comes at some expense to the early
> >  >files, since the slug of simultaneous insertions slows them down, and
> >  >we'll have to see if we can optimize the insertion.
> >  >
> >  >Steve Chiswell
> >  >Unidata User Support
> >
> > FYI.
> >
> > Pete
> >
> >
> >
> >
> > In a previous message to me, you wrote:
> >
> >  > Yes.  We see the same problem.
> >  >
> >  >-----Original Message-----
> >  >From: address@hidden
> >  >To: address@hidden
> >  >Sent: 1/10/2005 2:41 PM
> >  >Subject: problem with CONDUIT ETA/GFS since 4 Jan 2005
> >  >
> >  >Hello,
> >  >  As of 4 Jan 2005, I've noticed that the model products are being
> >  >sent out in a mixed-up time sequence.  Previously, all the F000 fields
> >  >were sent, then F003, and so on.  Now, the times are all mixed up, with
> >  >F000 fields still coming in even though the F072 fields have just
> >  >started.
> >  >In the past, by T+2.5 hours, the ETA212 had at least F048 finished,
> >  >so I was able to start a local WRF run.  Now, I have to wait much
> >  >longer for F048 to complete, substantially delaying the start of my
> >  >real-time WRF runs.  Is anyone else seeing this sort of problem?
> >  >
> >  >Thanks.
> >  >Mike
> >  >
> >  >
> >  >Below is a sample:
> >  >
> >  >Jan 10 20:32:47 pqutil:    23958 20050110203228.114 CONDUIT 078 /afs/.nwstg.nws.noaa.gov/ftp/SL.us008001/ST.opnl/MT.nam_CY.18/RD.20050110/PT.grid_DF.gr1/fh.0060_tl.press_gr.awip3d !grib/ncep/ETA_84/#212/200501101800/F060/VVEL/350_mb! 000078
> >  >Jan 10 20:32:48 pqutil:    23958 20050110203243.766 CONDUIT 169 /afs/.nwstg.nws.noaa.gov/ftp/SL.us008001/ST.opnl/MT.nam_CY.18/RD.20050110/PT.grid_DF.gr1/fh.0045_tl.press_gr.awip3d !grib/ncep/ETA_84/#212/200501101800/F045/TMP/225_mb! 000169
> >  >Jan 10 20:32:48 pqutil:    26942 20050110203239.440 CONDUIT 300 /afs/.nwstg.nws.noaa.gov/ftp/SL.us008001/ST.opnl/MT.nam_CY.18/RD.20050110/PT.grid_DF.gr1/fh.0024_tl.press_gr.awip3d !grib/ncep/ETA_84/#212/200501101800/F024/SPFH/575_mb! 000300
> >  >
> >  >
> >  >--
> >  >Mike Leuthold
> >  >Atmospheric Sciences/Institute of Atmospheric Physics
> >  >University of Arizona
> >  >address@hidden
> >  >520-621-2863
> >  >
> >  >"Is there some difference between the despotism of the monarch and
> >  > the despotism of the multitude?"
> >  >
> >  >Edmund Burke
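
For what it's worth, pulling the forecast hour out of product identifiers
like those in Mike's sample makes the out-of-order arrival easy to see. A
small illustrative helper (not part of pqutil or the LDM):

    # toy_fhours.py -- extract forecast hours, in arrival order, from pqutil
    # log lines carrying CONDUIT identifiers like the sample above.
    import re

    FHOUR = re.compile(r"!grib/ncep/[^/]+/#\d+/\d{12}/F(\d{3})/")

    def forecast_hours(log_lines):
        """Yield each line's forecast hour in arrival order."""
        for line in log_lines:
            m = FHOUR.search(line)
            if m:
                yield int(m.group(1))

    sample = [
        "... !grib/ncep/ETA_84/#212/200501101800/F060/VVEL/350_mb! 000078",
        "... !grib/ncep/ETA_84/#212/200501101800/F045/TMP/225_mb! 000169",
        "... !grib/ncep/ETA_84/#212/200501101800/F024/SPFH/575_mb! 000300",
    ]
    print(list(forecast_hours(sample)))      # [60, 45, 24] -- not monotonic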
> >
> >
> > --
> > +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+
> > ^ Pete Pokrandt                    V 1447  AOSS Bldg  1225 W Dayton St^
> > ^ Systems Programmer               V Madison,         WI     53706    ^
> > ^                                  V      address@hidden       ^
> > ^ Dept of Atmos & Oceanic Sciences V (608) 262-3086 (Phone/voicemail) ^
> > ^ University of Wisconsin-Madison  V       262-0166 (Fax)             ^
> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>+
>