
20030521: LDM upgrade on Plymouth State machines (cont.)



>From:  Jim Koermer <address@hidden>
>Organization: Plymouth State
>Keywords: 200305212159.h4LLxclL042738 LDM IDD pqing

Hi Jim,

>I see that you installed the LDM upgrade on pscwx, but there seems to be 
>a problem with the pqing from my 158.136.73.144 machine.

When I left off yesterday evening, there were no errors about "Trying to
re-open connection on port 5000" in the ~ldm/logs/ldmd.log file.  Is
this the problem you are referring to?  If yes, doesn't the line:

May 22 04:28:24 pqing[82292]: NET "158.136.73.144" 5000

indicate that pqing did successfully reconnect?  If yes, then pqing
had a problem with port 5000 from 04:26:44 until 04:28:24 and then
successfully reconnected.

Since I don't know what is supposed to be coming from this port, can
you tell me if you are seeing other indications that something is not
working correctly?

Did you do anything to get this working?

On to what I did on pscwx:

- changed 'ldm' account so its HOME is /home/ldm (it was /home/ldm/ldm-5.1.4)

- moved relevant files from /home/ldm/ldm-5.1.4 up to /home/ldm

- changed the link /usr/local/ldm to point at /home/ldm (it was
  /home/ldm/ldm-5.1.4)

- created the various runtime links in /home/ldm to point at the LDM
  distribution to use

- created the links in /home/ldm for the data and logs directory

- built and installed the latest LDM release, LDM-6.0.11

- modified the 6.0.11 version of ldmadmin to set the queue size to
  match what you were already using and to keep 8 log files.  These were
  the only changes I made/needed to make in ldmadmin.

- stopped the old LDM, changed the runtime link to point at the
  LDM-6.0.11, and then started the new LDM
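
The last step above (repointing the runtime link) can be sketched as
follows.  This is only an illustration under assumptions: the link name
"runtime" and the directory name "ldm-6.0.11" are my guesses at the
layout, and the demonstration runs in a scratch directory, not in the
real /home/ldm.  The LDM must still be stopped before the switch and
started after it, as described above.

```python
import os
import tempfile

def switch_runtime(home, release):
    """Point home/runtime at home/<release>, replacing any existing link."""
    link = os.path.join(home, "runtime")   # assumed link name
    tmp = link + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(release, tmp)   # relative target, resolved inside home
    os.replace(tmp, link)      # rename over the old link (atomic on POSIX)
    return os.readlink(link)

# Scratch directory standing in for /home/ldm.
home = tempfile.mkdtemp()
for release in ("ldm-5.1.4", "ldm-6.0.11"):   # assumed directory names
    os.mkdir(os.path.join(home, release))

switch_runtime(home, "ldm-5.1.4")
current = switch_runtime(home, "ldm-6.0.11")
print(current)
```

Building the new link under a temporary name and renaming it over the
old one means there is never a moment when the link is missing.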

At this point, I monitored the log files to make sure that data was
flowing in from upstream sites (e.g., FNEXRAD radar composites
and NNEXRAD radar data from atm.geo.nsf.gov), and things looked to
be working.

Here are some things that I wanted to talk to you about:

1) with LDM-6 there is no need to create aliases for upstream feed
   hosts to split data feeds.  Given this, the ldmd.conf entries:

request NNEXRAD "/p(N0R|N0Z)..." atm1.geo.nsf.gov
request NNEXRAD "/p(N1P|NTP)..." atm2.geo.nsf.gov
request NNEXRAD "/p(N0V|NVW)..." atm3.geo.nsf.gov

   should be changed to:

request NNEXRAD "/p(N0R|N0Z)..." atm.geo.nsf.gov
request NNEXRAD "/p(N1P|NTP)..." atm.geo.nsf.gov
request NNEXRAD "/p(N0V|NVW)..." atm.geo.nsf.gov

   or, since LDM-6 is much more efficient at moving data than LDM-5
   was, combine the requests into a single line:

request NNEXRAD "/p(N0R|N0Z|N1P|NTP|N0V|NVW)..." atm.geo.nsf.gov

   The change to the "real" name for atm will allow real time
   statistics displays to be consistent (atm reports its statistics
   using atm.geo.nsf.gov).

   I changed the three requests to use atm.geo this morning.  I
   suggest we consolidate the three requests into the single
   one that I listed above.

   By the way, comments in ldmd.conf say that the NNEXRAD feed from
   atm is only to be used when noaaport4 does not have data.  The
   pqing to/from noaaport4 seems to be working, so you might want
   to comment out all requests to atm for NNEXRAD data as this is
   using bandwidth unnecessarily.  If you want to keep the feed
   going, but use less bandwidth, then the single request line
   I would use is:

request NNEXRAD "/p(N0R|N0Z|N1P|NTP|N0V|NVW)..." atm.geo.nsf.gov ALTERNATE

   The ALTERNATE addition to the request line tells the LDM to instruct
   the upstream feeder to ask you if you want a product before sending
   it.  The current three lines, or a single line with the combined
   request, tell the upstream LDM to go ahead and send the products.
   Any duplicate product detection will then be done on your end.
   This uses a lot more network bandwidth than specifying the feed as
   ALTERNATE.

2) I am not sure what the NIMAGE request for "^rad_" products is.  A
   notifyme to atm for the past hour shows no product that matches
   this pattern:

notifyme -vxl- -f NIMAGE -h atm.geo.nsf.gov -o 3600 -p ^rad_
May 22 14:00:44 notifyme[55494]: Starting Up: atm.geo.nsf.gov: 20030522130044.044 TS_ENDT {{NIMAGE,  "^rad_"}}
        NOTIFYME(atm.geo.nsf.gov) returns OK
May 22 14:00:44 notifyme[55494]: NOTIFYME(atm.geo.nsf.gov): OK
        NOTIFYME(atm.geo.nsf.gov) returns OK
May 22 14:06:11 notifyme[55494]: NOTIFYME(atm.geo.nsf.gov): OK

   (the notifyme simply times out).  Given this, I commented the
   request out this morning.

3) real time statistics are not being received by our real time
   statistics machine, rtstats.unidata.ucar.edu, from pscwx.

   Oops, I found a typo of mine in the rtstats line in ldmd.conf.
   I just changed that, and we are now getting real time stats
   from pscwx:

http://www.unidata.ucar.edu/staff/chiz/rtstats/siteindex.shtml?pscwx.plymouth.edu

4) I added an allow line for met-62.oswego.edu since they are moving
   from met-05 to met-62.
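
As a sanity check on the consolidation suggested in item 1, here is a
minimal sketch that compares the three separate NNEXRAD patterns with
the combined one.  It assumes the trailing "..." in each pattern is
three regular expression wildcards, and it uses Python's re module as a
stand-in for the LDM's own pattern matching.  The product identifiers
are made up for the test and are not taken from real data.

```python
import re

separate = [r"/p(N0R|N0Z)...", r"/p(N1P|NTP)...", r"/p(N0V|NVW)..."]
combined = r"/p(N0R|N0Z|N1P|NTP|N0V|NVW)..."

# Hypothetical product identifiers, invented for this check.
samples = [
    "SDUS53 KXXX 221400 /pN0RXXX",
    "SDUS53 KXXX 221400 /pNTPXXX",
    "SDUS33 KXXX 221400 /pNVWXXX",
    "SDUS53 KXXX 221400 /pN0SXXX",   # product type not requested
    "SDUS53 KXXX 221400 /pN0RXX",    # too short for the trailing "..."
]

def matches_any(patterns, ident):
    """True if any of the patterns matches somewhere in ident."""
    return any(re.search(p, ident) for p in patterns)

# Identifiers on which the two forms of the request disagree.
mismatches = [s for s in samples
              if matches_any(separate, s) != bool(re.search(combined, s))]
selected = [s for s in samples if re.search(combined, s)]

print(mismatches)
print(len(selected))
```

If the list of mismatches is empty for identifiers typical of your feed,
the single request selects the same products as the three lines.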

After changing the NNEXRAD requests to use atm.geo, removing the NIMAGE
request for ^rad_ products, and correcting my rtstats typo, I stopped
and restarted your LDM.  The messages in ~ldm/logs/ldmd.log seem to
indicate that things are working, including pqing.  Please let me know
if this is not correct.  Ingestion from noaaport4 seems to be working
since real time stats for NNEXRAD show that the products are coming
from pscwx indicating that the pqing is sticking the products into the
queue as requested.

After the restart, I edited ~ldm/etc/ldmd.conf.cyclone and added
reporting of real time stats and changed the data request for WMO from
the IP address for squall.atmos.uiuc.edu to its fully qualified
hostname.  I also changed the commented out request for WMO data from
the IP of flood.uiuc to the fully qualified hostname in
~/etc/ldmd.conf.  The reason for this is the same as the reason for not
having to use aliases for machines.  LDM-5 accumulated all rpc.ldmd
request lines to a host into one request; LDM-6 does not do this.  This
makes splitting feeds a lot easier, but requires the user to try and
aggregate feed requests where possible to keep down the number of
rpc.ldmd invocations on his machine AND on its upstream host.
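
One way to spot requests worth aggregating is to count request lines
per feed type and upstream host.  The sketch below is only that, a
sketch: the parsing is my simplification of the request line layout
shown earlier in this note, the helper name is invented, and the WMO
pattern ".*" is a placeholder since the real one is not shown here.

```python
import re
from collections import Counter

# Simplified layout: request FEEDTYPE "pattern" host [anything else]
REQUEST = re.compile(r'^\s*request\s+(\S+)\s+"([^"]*)"\s+(\S+)', re.IGNORECASE)

def count_requests(lines):
    """Count active request lines per (feedtype, host) pair."""
    counts = Counter()
    for line in lines:
        m = REQUEST.match(line)   # commented-out lines do not match
        if m:
            feed, _pattern, host = m.groups()
            counts[(feed, host)] += 1
    return counts

conf_lines = [
    'request NNEXRAD "/p(N0R|N0Z)..." atm.geo.nsf.gov',
    'request NNEXRAD "/p(N1P|NTP)..." atm.geo.nsf.gov',
    'request NNEXRAD "/p(N0V|NVW)..." atm.geo.nsf.gov',
    '#request NIMAGE "^rad_" atm.geo.nsf.gov',
    'request WMO ".*" squall.atmos.uiuc.edu',   # placeholder pattern
]

counts = count_requests(conf_lines)
candidates = [key for key, n in counts.items() if n > 1]
print(candidates)
```

Each pair that shows up more than once is a candidate for merging into
a single request line, unless the split is deliberate.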

On to LDM installs on your other machines.  I tried logging onto cyclone
but could not become root, so I couldn't become ldm and do the upgrade.
The logon to snow does work, so I will upgrade it today.  What
about mammatus?

Please let me know if you see anything wrong on pscwx (and exactly what
you are seeing that shows the problem), and let me know when I am added
to the list of folks that are allowed to su on cyclone.  Again, I
recommend combining the three requests for NNEXRAD data to atm into a
single line.  I added the single line to the ldmd.conf file and
commented it out so that switching will be as easy as uncommenting the
single line and commenting out the three and then stopping and
restarting the LDM.

Cheers,

Tom
--
+-----------------------------------------------------------------------------+
* Tom Yoksas                                             UCAR Unidata Program *
* (303) 497-8642 (last resort)                                  P.O. Box 3000 *
* address@hidden                                   Boulder, CO 80307 *
* Unidata WWW Service                             http://www.unidata.ucar.edu/*
+-----------------------------------------------------------------------------+