
[LDM #QWE-565829]: Large Log files



Hi Heather,

re:
> I just updated my LDM to 6.13.10 a couple of weeks ago.
> 
> It appears to be working just fine, but on my server that is attached
> to our novra receiver, I am getting TONS of warning and error messages:
> 
> 20190401T131409.294916Z     noaaportIngester[21653]         productMaker.c:pmStart() ERROR Missing fragment in sequence, last 18/898600 this 20/898600
> 
> 20190401T131409.300972Z     noaaportIngester[21652]         productMaker.c:pmStart() WARN  Gap in packet sequence: 129230376 to 129230378 [skipped 1]
> 
> 20190401T131409.304927Z     noaaportIngester[21647]         productMaker.c:pmStart() WARN  Gap in packet sequence: 443462068 to 443462070 [skipped 1]
> 
> 20190401T131409.326242Z     noaaportIngester[21647]         productMaker.c:pmStart() WARN  Gap in packet sequence: 443462078 to 443462080 [skipped 1]

These 'Gap' messages indicate that there are errors being seen in your
NOAAPort data ingest.

re:
> The messages are so frequent that my log files are larger than 20G,
> and are filling up my disk!

All LDM-related log files should be rotated.  Log file rotation
is most easily handled by use of the LDM utility 'newlog', or,
in the case of NOAAPort ingest log files, via the use of the
script 'nplog_rotate' that can be found in the ~ldm/bin directory.

I can help you set up log file rotation if you like.

re:
> My C/N and Signal strength look fine, and I am getting all my data
> (except of course, the day my disk filled up!).

High C/N is, of course, very important.  Having high C/N, however, is
not a guarantee that there are no errors in your NOAAPort downlink/ingest.

re:
> Is the newer version just more verbose?

The newer version is a LOT more verbose.

re:
> Is there another problem that could be occurring?

The Gap messages indicate errors/noise in your ingest.  The number of packets
skipped with each Gap message provides an indirect measure of how bad the
noise problem is.  For example, during periods of Solar Interference you
will see Gap messages in which the number of packets skipped is large.

In order to monitor our NOAAPort ingest systems, I wrote a simple Bourne
shell script that goes through the daily log files, counts the number of
Gap messages, sums the skipped values into a total number of packets
missed, and does some very basic binning to show at a glance whether the
Gaps occur in "clumps".  I run the script at the beginning of each new
day for 'yesterday' and save the result in a log file so that the time
history of errors can be easily seen.  I implemented this script on all
of the NOAAPort ingest machines that feed into the IDD.  I then wrote a
second script that goes out to each of those machines and lists the
last 'N' days of summary error stats.  One of my morning routines, done
while drinking coffee at home, is listing the last 7 days of errors on
each of the NOAAPort ingest machines.  Here is an example of what the
output looks like for the previous 7 days:

mistral.srcc.lsu.edu
mistral:: 20190325.235502: nGap:    166 nFrame:        851 nGsec:    21 nGmin:    16
mistral:: 20190326.232502: nGap:    534 nFrame:       3347 nGsec:   108 nGmin:    42
mistral:: 20190327.232051: nGap:     30 nFrame:         80 nGsec:    15 nGmin:    15
mistral:: 20190328.204329: nGap:     11 nFrame:         32 nGsec:    10 nGmin:     9
mistral:: 20190329.235501: nGap:    245 nFrame:       1491 nGsec:    24 nGmin:    17
mistral:: 20190330.235631: nGap:     18 nFrame:         50 nGsec:    12 nGmin:    12
mistral:: 20190331.235702: nGap:     27 nFrame:         74 nGsec:    20 nGmin:    19

np1.ssec.wisc.edu
    np1:: 20190325.225602: nGap:   2579 nFrame:      22541 nGsec:   826 nGmin:   654
    np1:: 20190326.222748: nGap:   1103 nFrame:      13594 nGsec:   577 nGmin:   488
    np1:: 20190327.221358: nGap:   2722 nFrame:      26383 nGsec:   636 nGmin:   488
    np1:: 20190328.213101: nGap:   1800 nFrame:      16717 nGsec:   597 nGmin:   507
    np1:: 20190329.223643: nGap:   2633 nFrame:      20683 nGsec:   813 nGmin:   635
    np1:: 20190330.225230: nGap:   1456 nFrame:      10554 nGsec:   780 nGmin:   672
    np1:: 20190331.212322: nGap:   2180 nFrame:      26199 nGsec:   785 nGmin:   633

np2.ssec.wisc.edu
    np2:: 20190325.192001: nGap:   2479 nFrame:      21966 nGsec:   773 nGmin:   594
    np2:: 20190326.203102: nGap:   1371 nFrame:      11604 nGsec:   634 nGmin:   543
    np2:: 20190327.221358: nGap:   2957 nFrame:      24884 nGsec:   712 nGmin:   553
    np2:: 20190328.162752: nGap:   1968 nFrame:      17096 nGsec:   709 nGmin:   615
    np2:: 20190329.223643: nGap:   2680 nFrame:      32678 nGsec:   830 nGmin:   659
    np2:: 20190330.225304: nGap:   1375 nFrame:       9645 nGsec:   696 nGmin:   593
    np2:: 20190331.212322: nGap:   2355 nFrame:      20368 nGsec:   833 nGmin:   652

leno.unidata.ucar.edu
   leno:: 20190325.191345: nGap:    210 nFrame:       1230 nGsec:    18 nGmin:    12
   leno:: 20190326.225659: nGap:     13 nFrame:         32 nGsec:     9 nGmin:     9
   leno:: 20190327.145654: nGap:     53 nFrame:        476 nGsec:     8 nGmin:     4
   leno:: 20190328.202102: nGap:    240 nFrame:       3071 nGsec:    15 nGmin:    10
   leno:: 20190329.224302: nGap:      8 nFrame:         23 nGsec:     7 nGmin:     6
   leno:: 20190330.234601: nGap:     34 nFrame:         79 nGsec:    11 nGmin:    10
   leno:: 20190331.215902: nGap:    452 nFrame:       3105 nGsec:    20 nGmin:    13

chico.unidata.ucar.edu
  chico:: 20190325.191345: nGap:    212 nFrame:       1362 nGsec:    18 nGmin:    12
  chico:: 20190326.225659: nGap:      9 nFrame:         21 nGsec:     9 nGmin:     9
  chico:: 20190327.194501: nGap:     62 nFrame:        506 nGsec:    15 nGmin:    11
  chico:: 20190328.204329: nGap:    239 nFrame:       3187 nGsec:    15 nGmin:     9
  chico:: 20190329.204247: nGap:      9 nFrame:         17 nGsec:     7 nGmin:     6
  chico:: 20190330.192400: nGap:     39 nFrame:        104 nGsec:     5 nGmin:     4
  chico:: 20190331.170103: nGap:    495 nFrame:       3533 nGsec:    34 nGmin:    17
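
A minimal sketch of this kind of daily summary is shown here.  It is not
the actual script we run.  It assumes that each log entry is a single line
whose first field is a timestamp of the form YYYYMMDDThhmmss.ffffffZ, as in
the messages quoted at the top, and it reads nGsec and nGmin as the number
of distinct seconds and minutes in which at least one Gap was logged, which
is only one plausible interpretation of the binning.  The function name
summarize_gaps is made up for illustration.

```sh
#!/bin/sh
# Sketch only: summarize "Gap in packet sequence" messages in one or more
# noaaportIngester log files.  Assumes one log entry per line with the
# timestamp (YYYYMMDDThhmmss.ffffffZ) in the first field.
summarize_gaps() {
    awk '
        /Gap in packet sequence/ {
            ngap++                               # one more Gap message
            for (i = 1; i < NF; i++) {
                if ($i == "[skipped") {          # next field is e.g. "1]"
                    n = $(i + 1)
                    sub(/\]/, "", n)
                    nframe += n                  # total packets missed
                }
            }
            sec = substr($1, 1, 15)              # YYYYMMDDThhmmss
            min = substr($1, 1, 13)              # YYYYMMDDThhmm
            if (!(sec in secs)) { secs[sec] = 1; ngsec++ }
            if (!(min in mins)) { mins[min] = 1; ngmin++ }
        }
        END {
            printf "nGap: %6d nFrame: %10d nGsec: %5d nGmin: %5d\n",
                   ngap, nframe, ngsec, ngmin
        }
    ' "$@"
}
```

Calling it as 'summarize_gaps /data/tmp/nwstg.log' (adjust the path to your
own log file) prints a single line of counters.  When nGap is large while
nGsec and nGmin are small, the Gaps are concentrated in a few short
bursts; values that are close to each other mean the errors are spread
throughout the day.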

Thinking back, I seem to recall mentioning the need for log file rotation and
giving a general, high-level overview of how we monitor NOAAPort ingest stats.

re:
> Also,
> 
> What is the easiest way to change the location of my log files?  My ldm home is
> /usr/local/ldm and they are being written to /usr/local/ldm/var/logs.
> 
> I only have 50G in my root space, but have 500G available in /home.  I would like
> to have them written to /home where they shouldn't fill the disk.

The logging facility in newer versions of the LDM allows one to
easily specify where to put log files, including NOAAPort ingest
log files.  For example, here is the portion of our NOAAPort LDM
configuration file (~ldm/etc/ldmd.conf) that shows how we are doing
our NOAAPort ingest:

# 20170313 - changed set of noaaportIngester instances to match:
#            http://www.nws.noaa.gov/noaaport/document/Multicast%20Addresses%201.0.pdf
#            CHANNEL PID MULTICAST ADDRESS Port DETAILS
#            NMC     101     224.0.1.1     1201 NCEP / NWSTG
#            GOES    102     224.0.1.2     1202 GOES / NESDIS
#            NMC2    103     224.0.1.3     1203 NCEP / NWSTG2
#            NOPT    104     224.0.1.4     1204 Optional Data - OCONUS Imagery / Model
#            NPP     105     224.0.1.5     1205 National Polar-Orbiting Partnership / POLARSAT
#            EXP     106     224.0.1.8     1208 Experimental
#            GRW     107     224.0.1.9     1209 GOES-R Series West
#            GRE     108     224.0.1.10    1210 GOES-R Series East
#            NWWS    201     224.1.1.1     1201 Weather Wire
#
exec    "noaaportIngester -n -m 224.0.1.1  -l /data/tmp/nwstg.log"
exec    "noaaportIngester -n -m 224.0.1.2  -l /data/tmp/goes.log"
exec    "noaaportIngester -n -m 224.0.1.3  -l /data/tmp/nwstg2.log"
exec    "noaaportIngester -n -m 224.0.1.4  -l /data/tmp/oconus.log"
exec    "noaaportIngester -n -m 224.0.1.5  -l /data/tmp/nother.log"
exec    "noaaportIngester -n -m 224.0.1.8  -l /data/tmp/nother.log"
exec    "noaaportIngester -n -m 224.0.1.9  -l /data/tmp/nother.log"
exec    "noaaportIngester -n -m 224.0.1.10 -l /data/tmp/nother.log"

As you can see, we are putting our log files in the /data/tmp directory.
We do this since /data is where we have the most disk space on our
NOAAPort ingest machines.

We also rotate our NOAAPort ingest log files via the following
cron entry:

#
# Rotate NOAAPort ingest logs
#
0 0 * * * bin/nplog_rotate 30 > /dev/null 2>&1

We can use the copy of 'nplog_rotate' that is included in the
LDM distribution (~ldm/bin/nplog_rotate) unmodified, since /data/tmp
is the directory in which it expects to find the NOAAPort log files.
Here is the default script:

#!/bin/csh -f
#
# This is the log directory defined in syslog.conf
cd /data/tmp

set LOGS=("nwstg.log" "goes.log" "nwstg2.log" "oconus.log" "nother.log")

if ( $#argv > 0 ) then
  set KEEP_LOG=$1
else
  set KEEP_LOG=14
endif

foreach LOG ($LOGS)
   echo rotate $LOG
   ~ldm/bin/newlog ./$LOG $KEEP_LOG
end

sh ~ldm/bin/refresh_logging
# The following can't hurt and will accommodate old, unmodified 
# noaaportIngester(1) EXEC-entries that use the "-u" option.
~/ldm/bin/hupsyslog


In your case, I would do the following:

- copy ~ldm/bin/nplog_rotate to the ~ldm/util directory

- modify ~ldm/util/nplog_rotate and change the directory
  to match where you will be writing your NOAAPort ingest
  log files

- modify your LDM configuration file to parallel ours while
  changing the log directory to match what you want

- if you rotate via cron as we do, have the cron entry run
  util/nplog_rotate instead of bin/nplog_rotate

  Make sure that the directory location in your ~ldm/util/nplog_rotate
  matches the directory you set up for NOAAPort ingest logs in
  your LDM configuration file!
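
The copy-and-edit step and the directory cross-check can be sketched
roughly as follows.  This is an illustration under stated assumptions,
not part of the LDM distribution: the function names are made up, the
rotation script is assumed to contain a line reading exactly
'cd /data/tmp' as in the default script shown above, and the log files
are assumed to be named by '-l' options in noaaportIngester exec entries.
The directory /home/ldm/logs used in the usage note is only a placeholder
for wherever you decide to put your logs.

```sh
#!/bin/sh
# Sketch only: helper functions for moving NOAAPort ingest logs.

# Copy a rotation script, rewriting its "cd /data/tmp" line to a new
# directory.  The new directory must not contain a "|" character.
relocate_nplog_rotate() {
    src=$1 dst=$2 new_dir=$3
    sed "s|^cd /data/tmp[ ]*\$|cd $new_dir|" "$src" > "$dst" && chmod +x "$dst"
}

# Print the directory that a rotation script changes into.
rotate_script_dir() {
    awk '$1 == "cd" { print $2; exit }' "$1"
}

# Print the distinct directories named by "-l" options in the
# noaaportIngester exec entries of an LDM configuration file.
ingest_log_dirs() {
    awk '
        tolower($1) == "exec" && /noaaportIngester/ {
            gsub(/"/, " ")                      # drop quotes, re-split fields
            for (i = 1; i < NF; i++) {
                if ($i == "-l") {
                    d = $(i + 1)
                    sub("/[^/]*$", "", d)       # strip the file name
                    print d
                }
            }
        }
    ' "$1" | sort -u
}

# Succeed only if the configuration file and the rotation script agree
# on a single log directory.
check_log_dirs() {
    conf_dirs=$(ingest_log_dirs "$1")
    script_dir=$(rotate_script_dir "$2")
    if [ "$conf_dirs" = "$script_dir" ]; then
        echo "OK: $script_dir"
    else
        echo "MISMATCH: config uses [$conf_dirs], script uses [$script_dir]" >&2
        return 1
    fi
}
```

For example, after running
'relocate_nplog_rotate ~ldm/bin/nplog_rotate ~ldm/util/nplog_rotate /home/ldm/logs'
and editing the '-l' paths in ~ldm/etc/ldmd.conf to match,
'check_log_dirs ~ldm/etc/ldmd.conf ~ldm/util/nplog_rotate' should print
an OK line.  A MISMATCH line means the rotation script would be looking
in a directory that the ingesters are not writing to.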

re:
> Thank you,

No worries.

If you are interested in gathering daily NOAAPort summary stats like
the ones shown above, I can send you the scripts that we use.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: QWE-565829
Department: Support NOAAPORT
Priority: Normal
Status: Closed
===================