[Support #SBB-325304]: Re: 20110712: CONDUIT request -- fire weather grids



Hi Justin,

re: to be clear: ldm2 is the system on which the fireweather products are being
inserted?

> Yes, ldm2 is the system with the parallel NAM fireweather grids, sorry I 
> didn't clarify that.

Very good.  I just wanted to make sure we were talking about the same things.

re: errors reported by gribinsert on ldm2

> Looking closer at the errors, our process was unable to connect to ldm2 to
> run gribinsert, retries appear to be successful. I wonder if the load was
> just too high?

Hmm... a quick look at the metrics.txt file that you made available did not
show excessively high load averages -- they varied mostly between 1 and 1.8
for the period covered by the metrics file (August 2 - August 18).  The
failure to connect must have been caused by something else.
 
re: parallel NAM references actually refer to fireweather products

> Yes, the fireweather products are being generated within the parallel NAM.

Very good.  Again, I just wanted to make sure that we were/are talking about
the same things.

re: getting the metrics.txt file(s)

> No problem, I've placed the latest metrics.txt file on our server, you can
> pull it at:
> 
> http://www.ftp.ncep.noaa.gov/data1/nccf/com/tmp/metrics.txt

Got it, thanks.  The plots look decent:

- low load averages

- the age of the oldest product in the queue varies between 0.5 and 3.5 hours

  The 0.5 hour minimum for the age of the oldest product is a bit
  concerning... we like to see upstream machines hold at least 1 hour
  of data in the queue at all times.

- memory use is consistent and not out of line for what the machine is
  doing

- the number of downstream connections varies between 1 and 3

  Does this machine feed any machines that are involved in the
  "operational" CONDUIT distribution via the IDD?
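
As a rough sketch of the 1-hour check mentioned above: assuming the ages of
the oldest product have already been extracted from the metrics file as
hours, something like the following would flag the samples that fall short.
The helper name and the sample values are made up for illustration, and
nothing here depends on the actual layout of metrics.txt.

```python
# Hypothetical helper; the ages are assumed to be in hours and already
# parsed out of the metrics file by some other means.
MIN_RESIDENCY_HOURS = 1.0  # the "minimum of 1 hour of data" guideline


def below_minimum(oldest_ages_hours, minimum=MIN_RESIDENCY_HOURS):
    """Return the samples where the oldest product is younger than the minimum."""
    return [age for age in oldest_ages_hours if age < minimum]


# Made-up sample values inside the 0.5 to 3.5 hour range seen in the plots.
samples = [0.5, 1.2, 3.5, 0.9, 2.0]
short = below_minimum(samples)
print(f"{len(short)} of {len(samples)} samples below {MIN_RESIDENCY_HOURS} h: {short}")
```

If the list of short samples is routinely non-empty, that would suggest the
queue is too small for the volume of data being inserted.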

re:
> Sounds good, I'll be sure to put an updated copy of metrics.txt on Monday and
> I'll check the logs to see if we had any connection errors to ldm2.

OK, thanks.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: SBB-325304
Department: Support CONDUIT
Priority: Normal
Status: Closed