[NOAAPORT #ZLV-851048]: LDM misses data on NOAAPORT via satellite
- Subject: [NOAAPORT #ZLV-851048]: LDM misses data on NOAAPORT via satellite
- Date: Fri, 04 Dec 2020 14:59:04 -0700
Hi,
re:
> Resending since your server blocked my email because of the attached
> zip file.
It must have been unusually large since we routinely receive emails with
attachments that are > 10 MB.
re:
> I cannot add the one ldmd.log file because of size.
OK. No worries, I think that what you have provided below is enough to
zero in on the problem...
re:
> I will paste a general idea of what I am seeing at the end of my previous
> reply.
re:
> Sorry for the delay. I wanted to give time for the changes I made to see for
> sure if anything changed.
OK, good move!
re:
> I changed the UDP buffer setting up to 25MB for both in /etc/sysctl.conf:
>
> net.core.rmem_max
> net.core.rmem_default
>
> In reference to the error connecting to rtstatat.unidata.ucar.edu yes, that
> was
> a typo and I meant to put rtstat. I commented that line out of the ldmd.conf
> and the error is now gone.
Very good.
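If you want to double check that the new buffer sizes are actually in effect
after the reboot, the values can be read back from /proc. The following is
only a sketch: it assumes Linux, it assumes that "25MB" means 25 * 1024 * 1024
bytes, and the function names are made up for illustration.

```python
from pathlib import Path

# Assumption: "25MB" in sysctl.conf was entered as 25 * 1024 * 1024 bytes.
TARGET_BYTES = 25 * 1024 * 1024


def read_sysctl(name, root="/proc/sys"):
    """Return the integer value of a sysctl such as 'net.core.rmem_max'."""
    path = Path(root, *name.split("."))
    return int(path.read_text().split()[0])


def undersized(values, target=TARGET_BYTES):
    """Return the names (sorted) whose current value is below the target."""
    return sorted(name for name, value in values.items() if value < target)


def report(names=("net.core.rmem_max", "net.core.rmem_default")):
    """Print each setting and flag any that is smaller than the target."""
    current = {name: read_sysctl(name) for name in names}
    too_small = undersized(current)
    for name, value in current.items():
        flag = "TOO SMALL" if name in too_small else "ok"
        print(f"{name} = {value} ({flag})")
    return too_small
```

Calling report() on each machine should list both settings; anything flagged
as too small means the sysctl.conf change did not take effect.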
re:
> I added the cron tasks and even though the one computer gives an error message
> it does produce the metrics.txt file.
It issues an error message when attempting to write the metrics.txt file? I
have never seen this before. Can you send the error message that is issued?
re:
> The other day it was giving the same error message as when the crontab would
> run.
> Now I am not getting any error message so this seems to have fixed itself
> or the rebooting the boxes after the sysctl.conf cleaned something up that
> got rid of the error.
OK.
re:
> But with all of these changes I am still not seeing any difference. I am
> missing a lot of the NBM text data on the Z820 compared to the slower Z400
> machine.
NOAA has been running a test where the NBM files are being sent on
port 1206, multicast address 224.0.1.6. A quick review of the LDM
configuration files that you sent for both of your machines showed
that you are not processing data on this port. For reference, here
are the LDM configuration EXEC lines that we are using:
EXEC "keep_running noaaportIngester -n -m 224.0.1.1 -l /data/tmp/nwstg.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.2 -l /data/tmp/goes.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.3 -l /data/tmp/nwstg2.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.4 -l /data/tmp/oconus.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.5 -l /data/tmp/polar-orbiter.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.6 -l /data/tmp/nbm.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.8 -l /data/tmp/experimental.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.9 -l /data/tmp/goes-west.log"
EXEC "keep_running noaaportIngester -n -m 224.0.1.10 -l /data/tmp/goes-east.log"
Comments on our actions:
- 'keep_running' is a simple shell script that will start the process passed
as the first argument, and restart that process if it exits.
This script is included in the newest LDM releases like v6.13.13.
- we are logging the receipt of every product into channel-specific log
files
We mine these log files to keep track of the number of Gaps and associated
missed frames that we are experiencing.
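To illustrate the restart behavior described in the first comment, here is a
rough sketch of the idea. This is NOT the keep_running script that ships with
the LDM (that one is a shell script); the 'delay' and 'max_restarts'
parameters are hypothetical and exist only to make the example easy to try.

```python
import subprocess
import time


def keep_running(argv, delay=1.0, max_restarts=None):
    """Run argv and start it again every time it exits.

    Hypothetical helper for illustration only. With max_restarts=None it
    loops forever; otherwise it returns the list of exit codes once the
    command has been restarted max_restarts times.
    """
    exit_codes = []
    while True:
        # Blocks until the child process exits, then records its exit code.
        exit_codes.append(subprocess.call(argv))
        if max_restarts is not None and len(exit_codes) > max_restarts:
            return exit_codes
        # Short pause so that a command that fails instantly does not spin.
        time.sleep(delay)
```

For example, keep_running(["noaaportIngester", "-n", "-m", "224.0.1.6"])
would mimic what the EXEC lines above do for the NBM channel.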
re:
> I am attaching the metrics files from both computers along with the ldmd.log
> from both.
The metrics.txt files show that both of your machines are basically idling
(i.e., the load averages are very low), so system overload is clearly not
an issue.
re:
> I did notice the ldmd.log file from the Z400 is 22 MB in size. It has a
> lot of errors that the other machine does not have. I am not sure if this
> is from the slightly different versions of software, but I also wonder if
> since it is getting errors on those grib files does it spend less time on
> those (even though I do not save them anyway) so it has more processing
> time for the files I do save??? Just throwing out random ideas..
>
>
> --- ldmd.log from Z400 example---
>
> 20201204T173026.868109Z noaaportIngester[24566]
> grib2name.c:grib2name() ERROR Couldn't decode GRIB2 message. WMO
> header="YAUL02 KWNR 041726"
> 20201204T173212.366265Z noaaportIngester[24566]
> gb2param.c:gb2_param() WARN Couldn't get parameter info: iver=255, disc=209,
> cat=2, id=5, pdtn=0, center=nssl, lclver=1, file=g2varsnssl1.tbl
> 20201204T173212.366319Z noaaportIngester[24566]
> gb22gem.c:gb2_2gem() ERROR [GB 1] Couldn't get parameter values
> 20201204T173212.366368Z noaaportIngester[24566]
> grib2name.c:grib2name() ERROR Couldn't decode GRIB2 message. WMO
> header="YAUL02 KWNR 041728"
> 20201204T173413.283900Z noaaportIngester[24566]
> gb2param.c:gb2_param() WARN Couldn't get parameter info: iver=255, disc=209,
> cat=2, id=5, pdtn=0, center=nssl, lclver=1, file=g2varsnssl1.tbl
> 20201204T173413.283951Z noaaportIngester[24566]
> gb22gem.c:gb2_2gem() ERROR [GB 1] Couldn't get parameter values
> 20201204T173413.283982Z noaaportIngester[24566]
> grib2name.c:grib2name() ERROR Couldn't decode GRIB2 message. WMO
> header="YAUL02 KWNR 041730"
> 20201204T173613.145506Z noaaportIngester[24566]
> gb2param.c:gb2_param() WARN Couldn't get parameter info: iver=255, disc=209,
> cat=2, id=5, pdtn=0, center=nssl, lclver=1, file=g2varsnssl1.tbl
> 20201204T173613.145558Z noaaportIngester[24566]
> gb22gem.c:gb2_2gem() ERROR [GB 1] Couldn't get parameter values
> 20201204T173613.145589Z noaaportIngester[24566]
> grib2name.c:grib2name() ERROR Couldn't decode GRIB2 message. WMO
> header="YAUL02 KWNR 041732"
> 20201204T173759.517353Z noaaportIngester[24566]
> gb2param.c:gb2_param() WARN Couldn't get parameter info: iver=255, disc=209,
> cat=2, id=5, pdtn=0, center=nssl, lclver=1, file=g2varsnssl1.tbl
> 20201204T173759.517406Z noaaportIngester[24566]
> gb22gem.c:gb2_2gem() ERROR [GB 1] Couldn't get parameter values
> 20201204T173759.517446Z noaaportIngester[24566]
> grib2name.c:grib2name() ERROR Couldn't decode GRIB2 message. WMO
> header="YAUL02 KWNR 041734"
> 20201204T174000.876995Z noaaportIngester[24566]
> gb2param.c:gb2_param() WARN Couldn't get parameter info: iver=255, disc=209,
> cat=2, id=5, pdtn=0, center=nssl, lclver=1, file=g2varsnssl1.tbl
...
The 'ERROR Couldn't decode GRIB2 message' messages are caused by GRIB2
definitions not being in the various GRIB2 ('*.tbl') files that get installed
in the ~ldm/etc directory. When the information for a GRIB2 field is not
found, a full Product ID cannot be created. This does not, however, affect
the product being inserted into the LDM queue: all products are inserted into
the queue unless they are corrupt. It will affect the processing of products
IF the pattern-action file action(s) are written to match Product IDs.
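To see which table entries are missing on a given machine, the messages in
ldmd.log can be tallied. The sketch below assumes that each log entry is on a
single line in the file (the excerpt quoted above was wrapped by email) and
that the entries have the same wording as in that excerpt; the function name
is made up for illustration.

```python
import re
from collections import Counter

# Patterns follow the wording of the ldmd.log entries quoted in this email.
PARAM_RE = re.compile(
    r"get parameter info: .*?disc=(\d+),\s*cat=(\d+),\s*id=(\d+),.*?file=(\S+)"
)
HEADER_RE = re.compile(r'decode GRIB2 message\. WMO\s+header="(\S+) (\S+) \d+"')


def summarize(lines):
    """Count missing table entries and undecoded WMO headers.

    Returns two Counters:
      missing   keyed by (table file, discipline, category, parameter id)
      undecoded keyed by (WMO header ID, originating station)
    """
    missing = Counter()
    undecoded = Counter()
    for line in lines:
        match = PARAM_RE.search(line)
        if match:
            disc, cat, pid, table = match.groups()
            missing[(table, int(disc), int(cat), int(pid))] += 1
            continue
        match = HEADER_RE.search(line)
        if match:
            undecoded[(match.group(1), match.group(2))] += 1
    return missing, undecoded
```

Running summarize(open("ldmd.log")) on both machines and comparing the two
results should show whether they are missing the same table entries.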
FYI: the GRIB2 table files that I am referring to in the previous paragraph are
updated with each new LDM release, and up-to-date versions can be downloaded
from GitHub.
After seeing these messages, it is my belief that the cause of the difference
on your machines is different LDM versions being run, and the reason that
the processing is different is the machines are using different (and out of
date) sets of GRIB2 tables.
I think that the simplest thing for you to do at this point is upgrade
to the latest version of the LDM, v6.13.13, on both of your machines,
as that will have GRIB2 tables that are quite a bit newer and more
complete than the ones you are currently using. I also suggest keeping
up to date with the LDM release on both machines in lockstep.
Cheers,
Tom
--
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
address@hidden Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: ZLV-851048
Department: Support NOAAPORT
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata
inquiry tracking system and then made publicly available through the web. If
you do not want to have your interactions made available in this way, you must
let us know in each email you send to us.