
[CONDUIT #PGY-808322]: Upcoming NCEP CONDUIT maintenance



Hi,

I just wanted to update everyone on the results of the CONDUIT troubleshooting
that Kyle and I did yesterday (Tuesday, June 24):

- the ability for two Unidata machines to REQUEST data from the Boulder
  CONDUIT top level relay, ncepldm4.woc.noaa.gov, was restored
  at approximately 14 UTC

- the cause of the problem was traced to two things:

  - the final step in the building/installation of ldm-6.8.1 on the
    real server backends of the ncepldm4.woc.noaa.gov cluster (the
    'make install_setuids' run as 'root') had not been performed.
    To wit, the full sequence is:

    <as 'ldm'>
    cd ~ldm/ldm-6.8.1/src
    make distclean
    ./configure && make && make install

    <as 'root'>
    cd ~ldm/ldm-6.8.1/src
    make install_setuids

    The net effect of this was that the LDM server was not able to
    grab the privileged port 388, so it ended up listening on a
    random, non-privileged port assigned by the operating system.

    LVS was/is configured on the cluster director to forward LDM feed REQUESTs
    to port 388 on the real server backend machines,
    ncep-ldm0.boulder.noaa.gov and ncep-ldm1.boulder.noaa.gov.  Since
    the LDMs on ncep-ldm[01] were not listening on port 388, no REQUESTs
    were seen and so none could be honored.
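
For anyone wanting to check for the same condition on their own machines, here is a minimal sketch of two tests. The function names are my own invention and are not part of the LDM distribution, and the path to the LDM server binary is left as an argument because it depends on the installation. The first test only looks at the setuid bit, not at the file's owner; the second reads a socket listing on standard input and looks for a listener on the given port.

```shell
#!/bin/sh
# Sketch only: these helper names are illustrative assumptions,
# not commands shipped with the LDM.

# Succeeds if the given file has its setuid bit set.
# (Ownership is not checked here; inspect it with 'ls -l'.)
has_setuid_bit() {
    [ -u "$1" ]
}

# Reads 'ss -ltn' or 'netstat -ltn' style output on stdin, where the
# local address is the 4th column, and succeeds if any local address
# ends in ":<port>".
listening_on_port() {
    awk -v port="$1" '
        NF >= 4 { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}
```

Typical use would be something like 'netstat -ltn | listening_on_port 388 && echo "listener on 388"', run on each real server backend.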

  - there were no ALLOW(s) in the LDM configuration file (~ldm/etc/ldmd.conf)
    for the cluster director on either of the real server backends of the
    cluster.

    This omission would not, by itself, have prevented feed REQUESTs from
    being received and honored, but it was a problem nonetheless.
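
To review which ALLOW entries a configuration file currently contains, something like the following sketch can be used. The helper name list_allows is hypothetical, and the match is done case-insensitively as a precaution. ALLOW entries have the general form 'ALLOW feedtype hostname-pattern'; the hostname in the comment below is a placeholder, not the actual cluster director.

```shell
#!/bin/sh
# Sketch only: list_allows is a hypothetical helper, not part of the LDM.

# Print the ALLOW entries from an LDM configuration file, skipping
# commented-out lines. An entry for a director might look like:
#   ALLOW ANY ^director\.example\.gov$    (placeholder hostname)
list_allows() {
    grep -Ei '^[[:space:]]*allow[[:space:]]' "$1"
}
```

Typical use would be 'list_allows ~ldm/etc/ldmd.conf' on each real server backend, then checking the output for a pattern that matches the cluster director.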

I want to commend Kyle for doing an excellent job of locating and fixing the
problems found during our hour+ long troubleshooting exercise!  As soon as
he performed the final installation step (shown above), CONDUIT products
started flowing to the Unidata cluster front end (conduit2.unidata.ucar.edu)
and then to our top level relay cluster (idd.unidata.ucar.edu) and then
to the community REQUESTing data from us.  Service was also restored
to other Unidata community top level CONDUIT relay sites; I believe that
Kyle mentioned seeing connections to UWisconsin and UIllinois and
possibly to PennState while we talked. CONDUIT data relay has been working
nicely since the problems were found and fixed.

On behalf of the Unidata community, we want to thank those involved,
especially Kyle, for getting CONDUIT relay working properly/fully again!

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: PGY-808322
Department: Support CONDUIT
Priority: Normal
Status: Closed