
[LDM #MNK-989154]: comparing, contrasting two ldm servers



Hi SJ,

Thanks for the login info.  It has helped me to get a better idea of your
setup AND implement some changes.

Here goes:

- I downloaded the source distribution of ldm-mcidas-2012 to the $HOME
  directory of 'ldm'

- I unpacked the compressed tarball which created the ~ldm/ldm-mcidas-2012
  directory structure

- I created the file ldm-mcidas.cshrc in the ~ldm/ldm-mcidas-2012/src
  directory

  This file defines a variety of environment variables needed to build
  the ldm-mcidas package.  The idea is that the person building the
  ldm-mcidas-2012 decoders sources this file before running configure
  in the ~ldm/ldm-mcidas-2012/src directory.

- I built a good bit of the ldm-mcidas package before running into
  a problem:

  - a couple of ldm-mcidas decoders (not pnga2area) need to use the
    netCDF library.  I tried using the netCDF stuff I found in
    ~ldm/netcdf/(lib|include), but this installation appears to be
    for a different system:

ld: warning: ignoring file /Users/ldm/netcdf/lib/libnetcdf.a, file was built for archive which is not the architecture being linked (x86_64)

    I could finish the build of the ldm-mcidas-2012 package if I knew
    where to find a 64-bit installation of the netCDF package.  I would
    like to do this in order to create a binary distribution that others
    could use directly.
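
    A minimal sketch of how the architecture of a library could be
    checked before linking against it, assuming the Mac OS X 'lipo'
    tool is available and a POSIX sh (not csh).  The function names
    has_arch and check_lib are made up for illustration:

```sh
# has_arch INFO WANTED
# Succeeds if WANTED appears as a whole word in INFO, where INFO is the
# one-line summary printed by 'lipo -info' (the macOS tool that lists the
# architectures contained in a binary or archive).
has_arch() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# check_lib LIBRARY WANTED
# Hypothetical helper: report whether LIBRARY contains the WANTED architecture.
check_lib() {
    info=$(lipo -info "$1" 2>/dev/null) || {
        echo "could not inspect $1 (is lipo available?)" >&2
        return 2
    }
    if has_arch "$info" "$2"; then
        echo "$1: contains $2"
    else
        echo "$1: no $2 slice ($info)"
        return 1
    fi
}
```

    For example, 'check_lib /Users/ldm/netcdf/lib/libnetcdf.a x86_64'
    would be expected to report that the x86_64 slice is missing, which
    is consistent with the linker warning above.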

- I copied the newly built ldm-mcidas decoder 'pnga2area' from the
  ~ldm/ldm-mcidas-2012/src/decode directory to ~ldm/decoders (after
  making the ~ldm/decoders directory)

After doing this, I did a 'rehash' and then stopped/started the LDM.
As far as I can tell, the errors related to 'pnga2area' have gone
away.
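
The copy step above could be sketched as follows.  This is only an
illustration in POSIX sh (an assumption; the account itself uses csh),
and install_decoder is a made-up helper name:

```sh
# install_decoder SRC DEST_DIR
# Hypothetical helper: copy a freshly built decoder into DEST_DIR, creating
# the directory if needed and making sure the copy is executable.
install_decoder() {
    src=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    cp -p "$src" "$dest_dir/" || return 1
    chmod 755 "$dest_dir/$(basename "$src")"
}
```

Usage would look like
'install_decoder ~ldm/ldm-mcidas-2012/src/decode/pnga2area ~ldm/decoders',
followed by a 'rehash' in csh and 'ldmadmin stop' / 'ldmadmin start'.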

Next:

I noticed that you built GEMPAK6.6 in the ~ldm/Gempak directory.  I took
the liberty of making the following changes:

- created the NAWIPS symbolic link in ~ldm/Gempak and pointed it at GEMPAK6.6.0

- edited ~ldm/Gempak/NAWIPS/Gemenviron and changed the references to GEMPAK6.6.0
  to NAWIPS

- stopped the LDM

- logged off and then back on

  I did this so that the environment in which the LDM runs was updated to
  use the mods made by the two steps above.
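
The link and the Gemenviron edit can be sketched in POSIX sh as below.
This is an illustration under assumptions, not a record of the exact
commands used: link_nawips and generic_refs are made-up names, and the
filter simply rewrites every literal occurrence of the version directory
name, which may be broader than what a careful hand edit would change.

```sh
# link_nawips GEMPAK_ROOT VERSION_DIR
# Create (or repoint) GEMPAK_ROOT/NAWIPS as a relative symbolic link to
# VERSION_DIR, e.g. GEMPAK6.6.0.  Fails if VERSION_DIR does not exist.
link_nawips() {
    ( cd "$1" && [ -d "$2" ] && ln -sfn "$2" NAWIPS )
}

# generic_refs VERSION_DIR
# Filter: rewrite every literal occurrence of VERSION_DIR on stdin to NAWIPS.
generic_refs() {
    # Escape the dots so that they match literally in the sed pattern.
    pattern=$(printf '%s\n' "$1" | sed 's/\./\\./g')
    sed "s/$pattern/NAWIPS/g"
}
```

For example: 'link_nawips ~ldm/Gempak GEMPAK6.6.0', then
'generic_refs GEMPAK6.6.0 < Gemenviron > Gemenviron.new' and a review of
the result before replacing the original file.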

Next:

I watched the LDM log file to see if problems continued to be reported.
There is still an error being reported for the dcnldn decoder.  This
needs more investigation.

Next:

While looking at the problems with the dcnldn decoder, I happened to
take a look at the GEMPAK log files which live in:

/Volumes/desert/data/realtime/gempak/logs/

I found that the log files were HUGE.  For instance, the log file for
dcmetr (the GEMPAK METAR decoder) was 781 GB !!!
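
A quick way to spot oversized files in a log directory, sketched in
POSIX sh (an assumption; big_logs is a made-up helper name):

```sh
# big_logs DIR MIN_KB
# List regular files under DIR that are larger than MIN_KB kilobytes,
# with their disk usage in KB, largest first.
big_logs() {
    find "$1" -type f -size +"$2"k -exec du -k {} + | sort -rn
}
```

For instance, 'big_logs /Volumes/desert/data/realtime/gempak/logs 1048576'
would list everything larger than 1 GB.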

I checked the crontab entries for 'ldm' and verified my suspicion that
the GEMPAK log files are not being rotated.  I added a cron entry to
do the rotation:

#
# rotate GEMPAK logs
15 23 * * * util/dcrotatelog.csh >/dev/null 2>&1

This was done after copying dcrotatelog.csh from ~ldm/Gempak/NAWIPS/bin to
~ldm/util and editing the newly copied file to source the Gemenviron
file pertinent to the GEMPAK distribution being used (6.6.0).

The GEMPAK log files should now be rotated once per day.
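
For reference, the general idea behind this kind of rotation can be
sketched as follows.  This is a generic illustration in POSIX sh, not
the contents of dcrotatelog.csh; rotate_log and the default of 7 kept
copies are assumptions made for the example.

```sh
# rotate_log FILE [KEEP]
# Generic illustration: shift FILE.1 -> FILE.2 ... up to KEEP old copies,
# move FILE to FILE.1, and start a new empty FILE.
rotate_log() {
    log=$1
    keep=${2:-7}
    # Nothing to do if the log does not exist yet.
    [ -f "$log" ] || return 0
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$log.$prev" ]; then
            mv -f "$log.$prev" "$log.$i"
        fi
        i=$prev
    done
    mv -f "$log" "$log.1"
    : > "$log"
}
```

Note that a process that already has the old file open keeps writing to
the renamed copy until it reopens its log, so a simple rename like this
is only part of the picture for long-running decoders.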

I also added a cron entry to rotate the LDM log files:

#
# rotate logs
#
00 23 * * * bin/ldmadmin newlog

Comment:

I see that the load average on your machine is very high (over 11), and
that the top culprit is dcmetr:

Processes: 198 total, 10 running, 13 stuck, 175 sleeping, 661 threads       18:48:02
Load Avg: 11.39, 11.31, 11.31  CPU usage: 35.44% user, 5.20% sys, 59.35% idle
SharedLibs: 3124K resident, 3728K data, 0B linkedit.
MemRegions: 34317 total, 2610M resident, 67M private, 1602M shared.
PhysMem: 1466M wired, 4252M active, 10G inactive, 16G used, 29M free.
VM: 533G vsize, 1097M framework vsize, 355966767(0) pageins, 3462083(0) pageouts.
Networks: packets: 2606927909/3558G in, 720084341/84G out.
Disks: 415166778/13T read, 5421107977/41T written.

PID    COMMAND      %CPU  TIME     #TH    #WQ  #POR #MREG RPRVT  RSHRD  RSIZE  VPRVT
94186  dcmetr       58.3  148 hrs  1/1    0    23   38    1060K  1912K  2680K  33M
94185  dcmetr       64.5  190 hrs  1/1    0    23   38    1064K  1912K  2640K  33M

I am willing to bet that the load average will go down after the GEMPAK
log files are rotated (but I could be wrong :-).

Next:

I compared the volumes of data being ingested by your machine with those being
relayed through the Unidata-operated IDD toplevel relay, idd.unidata.ucar.edu.
Since you are requesting portions of several of the larger feeds (by volume),
the only feed I was really able to compare was HDS (WMO == HDS|IDS|DDPLUS).
The comparisons of volumes look good:

Unidata HomePage
http://www.unidata.ucar.edu

  Projects -> Internet Data Distribution
  http://www.unidata.ucar.edu/projects/index.html#idd

    IDD Current Operational Status
    http://www.unidata.ucar.edu/software/idd/rtstats/

      Statistics by host
      http://www.unidata.ucar.edu/cgi-bin/rtstats/siteindex

idd.unidata.ucar.edu
http://www.unidata.ucar.edu/cgi-bin/rtstats/siteindex?idd.unidata.ucar.edu

measwx.meas.ncsu.edu
http://www.unidata.ucar.edu/cgi-bin/rtstats/siteindex?measwx.meas.ncsu.edu

The fact that the volume of HDS data is the same on your MacOS-X machine
and the Unidata idd.unidata.ucar.edu cluster shows that the LDM on your
machine is not experiencing the problems that have been encountered by
other Unidata sites. This is good news to say the least!  It is likely
that we will poke around on your machine to further investigate this.

Things to be done:

- figure out what is going on with the dcnldn problems

- verify that your GEMPAK log files are getting rotated once per day

- do a full build of the ldm-mcidas decoders so that a binary distribution
  can be created

  NB: remember that in order to accomplish this, I need to find a netCDF
  build that is matched to your machine's architecture.

- I didn't really check out data scouring; this may need to be looked into.

Got to run...

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: MNK-989154
Department: Support LDM
Priority: Normal
Status: Closed