
Re: [noaaport] Assistance requested for "Gap in packet sequence" log entries from noaaportIngester



Good afternoon Gregg,

Do you use the Linux CMCS as part of the Novra diagnostic toolset? If you do (and you should!), log in and, at the command line, run these two commands and send me their output:

show sat
show lnb

Also, did you do the preinstallation steps found in the section at:

https://unidata.ucar.edu/software/ldm/ldm-current/utilities/noaaport/index.html

Ignore the "configuration" section on that page; it needs to be updated, and I have made UNIDATA aware of that. The logging and performance considerations sections on that page are also correct.

Also, upgrade the LDM that is running version 6.13.6 to at least 6.13.11. Additionally, a serious NOAAport ingest bug, which crashes the LDM when garbled GRIB data (caused by a garbled file or poor SBN reception) is received, was fixed in the 6.13.12.42 beta. College of DuPage, AllisonHouse, WISC and others have been using that beta with no issues. It's available at:

ftp://ftp.unidata.ucar.edu/pub/ldm/beta 

I would recommend having it on just one of the two servers, not both, with the other having LDM 6.13.11, the only current version UNIDATA officially supports.

Gilbert

On Jun 17, 2020, at 12:34 PM, Gregory Grosshans via noaaport <address@hidden> wrote:


We are replacing legacy SBN ingest software and spinning up the Unidata noaaportIngester (i.e. LDM version 6.13.11) on RHEL7 / RHEL6.  Unfortunately, we continue to receive "Gap in packet sequence" entries in various log files (e.g. nwstg, nwstg2, nother, goes and polarsat), and in particular the polarsat.log file has many of these entries.  Can you please review the information below, ask clarifying questions, and hopefully offer suggestions on how to determine the cause of the Gap entries and steps to eliminate them?

Thank you for your time,
Gregg


The NOVRA firmware being used is: 
V2R15

LDM Version and Server Information:
6GB tmpfs for LDM (6.13.11) queue on a Dell R410, 64 GB, dual Intel Xeon X5667 @3.07GHz, RHEL7.8
2GB tmpfs for LDM (6.13.6)  queue on a Dell R410, 32 GB, dual Intel Xeon X5667 @3.07GHz, RHEL6.10  


The SBN Dish is a typical NWS SBN dish, feeding a splitter with multiple NOVRA boxes on the other side of the splitter.  

I've checked with another National Center and they are not receiving these Gap entries, and we have tried their noaaportIngester executable on our SPC system and still receive the Gap entries.

We have tried multiple NOVRA boxes on the RHEL7 server and continue to receive Gap entries.  These same NOVRA boxes work on other systems at SPC with no issues.  Different NOVRA boxes have been tried with no success (i.e. in terms of having no Gap entries in the log files).  

There are very infrequent instances of Gaps in packets appearing on other systems at the same time, for example in the LDM noaaportIngester log file on the AWIPS cpsbn1 server. This indicates that those Gaps occur on multiple systems using different NOVRA boxes and connections, so they perhaps originate on the SBN uplink, or on the downlink if local weather (e.g. lightning) is causing interference.


I've worked with several of my IT colleagues at SPC, and we have ruled out the NOVRA box and the connections before and after it as the cause of the Gaps in packets. This leads us to believe the errors result from something on the RHEL6 and RHEL7 servers, possibly including noaaportIngester.


noaaportIngester is being invoked via LDM with the following entries from ldmd.conf:


EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.1  -n -c -u 3 -r 1 -s NMC"

EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.2  -n -c -u 4 -r 1 -s GOES -f"

EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.3  -n -c -u 5 -r 1 -s NMC2"

EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.4  -n -c -u 6 -r 1 -s NOAAPORT_OPT"

EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.5  -n -c -u 7 -r 1 -s NMC3"

EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.6  -n -c -u 4 -r 1 -s ADD"

EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.7  -n -c -u 7 -r 1 -s ENC"

EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.8  -n -c -u 7 -r 1 -s EXP"

EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.9  -n -c -u 4 -r 1 -s GRW"

EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.10 -n -c -u 4 -r 1 -s GRE"
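Several of the EXEC entries above share the same -u value. As a rough aid for reading the logs, the following Python sketch groups the channel names (-s) by their -u value. It assumes that -u selects the syslog local facility an ingester logs to, which is my understanding of the option and not something stated in this thread; the function name group_by_u_option is hypothetical.

```python
import re
from collections import defaultdict

# EXEC entries copied from the ldmd.conf excerpt in this message.
EXEC_LINES = """\
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.1  -n -c -u 3 -r 1 -s NMC"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.2  -n -c -u 4 -r 1 -s GOES -f"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.3  -n -c -u 5 -r 1 -s NMC2"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.4  -n -c -u 6 -r 1 -s NOAAPORT_OPT"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.5  -n -c -u 7 -r 1 -s NMC3"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.6  -n -c -u 4 -r 1 -s ADD"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.7  -n -c -u 7 -r 1 -s ENC"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.8  -n -c -u 7 -r 1 -s EXP"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.9  -n -c -u 4 -r 1 -s GRW"
EXEC "noaaportIngester -I 10.0.5.50 -m 224.0.1.10 -n -c -u 4 -r 1 -s GRE"
"""


def group_by_u_option(conf_text):
    """Map each -u value to the list of -s channel names that use it."""
    groups = defaultdict(list)
    for line in conf_text.splitlines():
        if "noaaportIngester" not in line:
            continue  # not an ingester entry
        u_match = re.search(r"-u\s+(\d+)", line)
        s_match = re.search(r"-s\s+(\w+)", line)
        if u_match and s_match:
            groups[int(u_match.group(1))].append(s_match.group(1))
    return dict(groups)


if __name__ == "__main__":
    for u_value, channels in sorted(group_by_u_option(EXEC_LINES).items()):
        print(f"-u {u_value}: {', '.join(channels)}")
```

For the entries above this puts GOES, ADD, GRW and GRE under -u 4 and NMC3, ENC and EXP under -u 7. If the assumption about -u holds, that may be why a single log file contains Gap entries from several different noaaportIngester PIDs.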



LDM settings:

[ldmcp@sbn1 ~]$ regutil

/delete-info-files : 0

/hostname : sbn1.spc.noaa.gov

/insertion-check-interval : 300

/oess-pathname : /home/ldmcp/etc/OESS-account.yaml

/reconciliation-mode : do nothing

/check-time/enabled : 1

/check-time/limit : 10

/check-time/warn-if-disabled : 1

/check-time/ntpdate/command : /usr/sbin/ntpdate

/check-time/ntpdate/servers : ntp.spc.noaa.gov ntp1.spc.noaa.gov ntp2.spc.noaa.gov

/check-time/ntpdate/timeout : 5

/metrics/count : 4

/metrics/file : /home/ldmcp/logs/metrics.txt

/metrics/files : /home/ldmcp/logs/metrics.txt*

/metrics/netstat-command : /bin/netstat -A inet -t -n

/metrics/top-command : /bin/top -b -n 1

/log/count : 7

/log/file : /home/ldmcp/logs/ldmd.log

/log/rotate : 1

/pqsurf/config-path : /home/ldmcp/etc/pqsurf.conf

/pqsurf/datadir-path : /home/ldmcp/data

/scour/config-path : /home/ldmcp/etc/scour.conf

/surf-queue/path : /home/ldmcp/queues/pqsurf.pq

/surf-queue/size : 2M

/server/config-path : /home/ldmcp/etc/ldmd.conf

/server/enable-anti-DOS : TRUE

/server/ip-addr : 0.0.0.0

/server/max-clients : 256

/server/max-latency : 3600

/server/port : 388

/server/time-offset : 3600

/queue/path : /ldmcp/data/queues/ldm.pq

/queue/size : 6000M

/queue/slots : default

/pqact/config-path : /home/ldmcp/etc/pqact.conf

/pqact/datadir-path : /home/ldmcp/data/data

[ldmcp@sbn1 ~]$ 



Gap entries in the various log files, beginning when new logs were started at ~1537Z on Monday, June 15:


[ldmcp@sbn1 ~/logs]$ grep Gap *log | more

goes.log:Jun 15 15:37:55 sbn1 noaaportIngester[3457]: productMaker.c:439:pmStart() Gap in packet sequence: 513870098 to 514102655 [skipped 232556]

goes.log:Jun 15 15:38:01 sbn1 noaaportIngester[3456]: productMaker.c:439:pmStart() Gap in packet sequence: 551431874 to 551639014 [skipped 207139]

goes.log:Jun 15 15:38:08 sbn1 noaaportIngester[3449]: productMaker.c:439:pmStart() Gap in packet sequence: 72489 to 72516 [skipped 26]

goes.log:Jun 15 15:38:08 sbn1 noaaportIngester[3453]: productMaker.c:439:pmStart() Gap in packet sequence: 72489 to 72516 [skipped 26]


nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469313177 to 1469734315 [skipped 421137]

nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469734334 to 1469734336 [skipped 1]

nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469734343 to 1469734345 [skipped 1]

nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469734353 to 1469734355 [skipped 1]

nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469734355 to 1469734361 [skipped 5]

nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469734362 to 1469734367 [skipped 4]

nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]: productMaker.c:439:pmStart() Gap in packet sequence: 1469734367 to 1469734377 [skipped 9]


nwstg.log:Jun 15 15:37:54 sbn1 noaaportIngester[3448]: productMaker.c:439:pmStart() Gap in packet sequence: 560043371 to 560207084 [skipped 163712]


polarsat.log:Jun 15 15:37:56 sbn1 noaaportIngester[3452]: productMaker.c:439:pmStart() Gap in packet sequence: 51060086 to 51060113 [skipped 26]

polarsat.log:Jun 15 15:37:58 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175690361 to 175761926 [skipped 71564]

polarsat.log:Jun 15 15:38:07 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175761926 to 175761990 [skipped 63]

polarsat.log:Jun 15 15:38:11 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175761990 to 175761992 [skipped 1]

polarsat.log:Jun 15 15:38:41 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175761992 to 175762207 [skipped 214]

polarsat.log:Jun 15 15:38:44 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175762207 to 175762269 [skipped 61]

polarsat.log:Jun 15 15:39:11 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175762269 to 175762624 [skipped 354]

polarsat.log:Jun 15 15:39:17 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175762624 to 175762626 [skipped 1]

polarsat.log:Jun 15 15:39:47 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175762626 to 175763154 [skipped 527]

polarsat.log:Jun 15 15:39:50 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175763154 to 175763349 [skipped 194]

polarsat.log:Jun 15 15:39:50 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175763349 to 175763351 [skipped 1]

polarsat.log:Jun 15 15:40:17 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175763351 to 175763945 [skipped 593]

polarsat.log:Jun 15 15:40:19 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175763945 to 175763947 [skipped 1]

polarsat.log:Jun 15 15:40:30 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175763947 to 175764006 [skipped 58]

polarsat.log:Jun 15 15:40:30 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175764006 to 175764037 [skipped 30]

polarsat.log:Jun 15 15:40:38 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175764037 to 175764251 [skipped 213]

polarsat.log:Jun 15 15:40:49 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 175764251 to 175764283 [skipped 31]


... 2200+ lines of polarsat Gap entries are not listed; the last few, at ~2140Z, are:


Jun 15 21:39:05 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177780466 to 177780804 [skipped 337]

Jun 15 21:39:18 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177780804 to 177780838 [skipped 33]

Jun 15 21:39:22 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177780838 to 177780976 [skipped 137]

Jun 15 21:39:31 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177780976 to 177781214 [skipped 237]

Jun 15 21:39:31 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177781214 to 177781260 [skipped 45]

Jun 15 21:39:45 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177781260 to 177781537 [skipped 276]

Jun 15 21:39:45 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177781537 to 177781629 [skipped 91]

Jun 15 21:39:52 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177781630 to 177781632 [skipped 1]

Jun 15 21:40:05 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177781632 to 177782030 [skipped 397]

Jun 15 21:40:09 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177782030 to 177783125 [skipped 1094]

Jun 15 21:40:09 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart() Gap in packet sequence: 177783125 to 177783202 [skipped 76]

[ldmcp@sbn1 ~/logs]$ !! |wc

grep Gap po*log | wc

   2254   33810  302849

[ldmcp@sbn1 ~/logs]$ 
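To see which ingester process accounts for most of the loss, the Gap entries can be summarized per PID instead of being counted with wc. The Python sketch below is one way to do that. The regular expression is based only on the log lines shown in this message, and the relation skipped = last - first - 1 is inferred from those lines, not taken from the noaaportIngester source; the function name summarize_gaps is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the log entries quoted in this message.
GAP_RE = re.compile(
    r"noaaportIngester\[(?P<pid>\d+)\]: .*Gap in packet sequence: "
    r"(?P<first>\d+) to (?P<last>\d+) \[skipped (?P<skipped>\d+)\]"
)


def summarize_gaps(lines):
    """Return (entries per PID, skipped packets per PID, inconsistent lines)."""
    entries = Counter()
    skipped = Counter()
    inconsistent = []
    for line in lines:
        match = GAP_RE.search(line)
        if match is None:
            continue  # not a Gap entry
        pid = int(match.group("pid"))
        first = int(match.group("first"))
        last = int(match.group("last"))
        count = int(match.group("skipped"))
        # Assumed relation, inferred from the entries in this message.
        if last - first - 1 != count:
            inconsistent.append(line)
        entries[pid] += 1
        skipped[pid] += count
    return entries, skipped, inconsistent


if __name__ == "__main__":
    import sys

    # Usage: python gap_summary.py polarsat.log goes.log ...
    all_lines = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            all_lines.extend(handle)
    entries, skipped, inconsistent = summarize_gaps(all_lines)
    for pid, n in entries.most_common():
        print(f"PID {pid}: {n} Gap entries, {skipped[pid]} packets skipped")
    print(f"{len(inconsistent)} entries did not match last - first - 1")
```

Comparing the per-PID totals between the RHEL7 and RHEL6 servers, or against the AWIPS cpsbn1 log, might help show whether the same channels are affected at the same times.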



Have you seen Gaps in packet log entries in the past, and if so, how did you mitigate them? What do you suggest to eliminate these extraneous entries?


--
=====================================================================
Email seems to be generating increasing inefficiencies in organizations.  I learned from a manager a Stanford Computer Science professor no longer uses email for communication, but uses SNAIL mail, telephone calls, and person to person visits.  I'm considering the same. 

Storm Prediction Center
120 David L. Boren Blvd, Suite 2330
Norman, OK 73072

_______________________________________________
NOTE: All exchanges posted to Unidata maintained email lists are
recorded in the Unidata inquiry tracking system and made publicly
available through the web.  Users who post to any of the lists we
maintain are reminded to remove any personal information that they
do not want to be made public.


noaaport mailing list
address@hidden
For list information or to unsubscribe, visit: https://www.unidata.ucar.edu/mailing_lists/