Re: [ldm-users] [noaaport] Assistance requested for "Gap in packet sequence" log entries from noaaportIngester

  • To: Gregory Grosshans <gregory.grosshans@xxxxxxxx>
  • Subject: Re: [ldm-users] [noaaport] Assistance requested for "Gap in packet sequence" log entries from noaaportIngester
  • From: Daniel Vietor - NOAA Affiliate <dan.vietor@xxxxxxxx>
  • Date: Wed, 17 Jun 2020 13:20:30 -0500
We used to get a lot of lost packets unless these kernel parameters were set:

sysctl -w net.ipv4.ipfrag_max_dist=4096
sysctl -w net.ipv4.conf.default.rp_filter=2

This is in the LDM NOAAPort documentation.
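
A minimal sketch for checking whether a host already has the two values shown above. The function name check_param and the PROC_ROOT override are illustrative assumptions, not part of LDM or sysctl; the function only reads the same /proc/sys files that sysctl reads.

```shell
#!/bin/sh
# Compare a kernel parameter's current value with the value quoted in this
# message. PROC_ROOT is a hypothetical override so the function can be pointed
# at a fake tree; by default it reads /proc/sys.
check_param() {
    name=$1
    want=$2
    # net.ipv4.ipfrag_max_dist -> net/ipv4/ipfrag_max_dist
    path="${PROC_ROOT:-/proc/sys}/$(printf '%s' "$name" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "$name: unreadable"
        return 2
    fi
    have=$(cat "$path")
    if [ "$have" = "$want" ]; then
        echo "$name: ok ($have)"
    else
        echo "$name: $have (message suggests $want)"
        return 1
    fi
}

# Values are copied from the sysctl commands in this message.
check_param net.ipv4.ipfrag_max_dist 4096 || mismatch=1
check_param net.ipv4.conf.default.rp_filter 2 || mismatch=1
[ -z "${mismatch:-}" ] || echo "at least one value differs from the message"
```

Values set with sysctl -w do not survive a reboot. To keep them, the same name = value pairs can be placed in /etc/sysctl.conf and loaded with sysctl -p.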

Dan.


On Wed, Jun 17, 2020 at 12:35 PM Gregory Grosshans via noaaport <
noaaport@xxxxxxxxxxxxxxxx> wrote:

> We are replacing legacy SBN ingest software and spinning up the Unidata
> noaaportIngester (i.e. LDM version 6.13.11) on RHEL7 / RHEL6.
> Unfortunately, we continue to receive "Gap in packet sequence" entries in
> various log files (e.g. nwstg, nwstg2, nother, goes and polarsat), and in
> particular the polarsat.log file has many of these entries.  Can you please
> review the information below, ask clarifying questions, and hopefully offer
> suggestions on how to determine the cause of the Gap entries and steps to
> eliminate them?
>
> Thank you for your time,
> Gregg
>
>
> *The NOVRA firmware being used is: *
>
> *V2R15*
>
>
> *LDM Version and Server Information:*
>
> *6GB tmpfs for LDM (6.13.11) queue on a Dell R410, 64 GB, dual Intel Xeon
> X5667 @3.07GHz, RHEL7.8*
> *2GB tmpfs for LDM (6.13.6)  queue on a Dell R410, 32 GB, dual Intel Xeon
> X5667 @3.07GHz, RHEL6.10  *
>
>
>
> The SBN Dish is a typical NWS SBN dish, feeding a splitter with multiple
> NOVRA boxes on the other side of the splitter.
>
> I've checked with another National Center and they are not receiving these
> Gap entries, and we have tried their noaaportIngester executable on our SPC
> system and still receive the Gap entries.
>
> We have tried multiple NOVRA boxes on the RHEL7 server and continue to
> receive Gap entries.  These same NOVRA boxes work on other systems at SPC
> with no issues.  Different NOVRA boxes have been tried with no success
> (i.e. in terms of having no Gap entries in the log files).
>
> There are very infrequent instances of Gaps in packets appearing on other
> systems at the same time, for example on the AWIPS cpsbn1 server in the LDM
> noaaportIngester log file.  This indicates that those Gaps affect multiple
> systems using different NOVRA boxes and connections, so they may originate
> on the SBN uplink, or on the downlink if local weather (e.g. lightning)
> causes interference.
>
>
> I've worked with several of my IT colleagues at SPC and we have ruled out
> the NOVRA box and the connections before and after it as the cause of the
> Gaps.  This leads us to believe the errors result from something on the
> RHEL6 and RHEL7 servers, possibly including noaaportIngester itself.
>
>
> noaaportIngester is being invoked via LDM with the following entries from
> ldmd.conf:
>
>
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.1  -n -c -u 3 -r 1 -s NMC"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.2  -n -c -u 4 -r 1 -s GOES -f"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.3  -n -c -u 5 -r 1 -s NMC2"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.4  -n -c -u 6 -r 1 -s NOAAPORT_OPT"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.5  -n -c -u 7 -r 1 -s NMC3"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.6  -n -c -u 4 -r 1 -s ADD"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.7  -n -c -u 7 -r 1 -s ENC"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.8  -n -c -u 7 -r 1 -s EXP"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.9  -n -c -u 4 -r 1 -s GRW"
> EXEC    "noaaportIngester -I 10.0.5.50 -m 224.0.1.10 -n -c -u 4 -r 1 -s GRE"
>
>
>
> LDM settings:
>
> [ldmcp@sbn1 ~]$ regutil
>
> /delete-info-files : 0
>
> /hostname : sbn1.spc.noaa.gov
>
> /insertion-check-interval : 300
>
> /oess-pathname : /home/ldmcp/etc/OESS-account.yaml
>
> /reconciliation-mode : do nothing
>
> /check-time/enabled : 1
>
> /check-time/limit : 10
>
> /check-time/warn-if-disabled : 1
>
> /check-time/ntpdate/command : /usr/sbin/ntpdate
>
> /check-time/ntpdate/servers : ntp.spc.noaa.gov ntp1.spc.noaa.gov
> ntp2.spc.noaa.gov
>
> /check-time/ntpdate/timeout : 5
>
> /metrics/count : 4
>
> /metrics/file : /home/ldmcp/logs/metrics.txt
>
> /metrics/files : /home/ldmcp/logs/metrics.txt*
>
> /metrics/netstat-command : /bin/netstat -A inet -t -n
>
> /metrics/top-command : /bin/top -b -n 1
>
> /log/count : 7
>
> /log/file : /home/ldmcp/logs/ldmd.log
>
> /log/rotate : 1
>
> /pqsurf/config-path : /home/ldmcp/etc/pqsurf.conf
>
> /pqsurf/datadir-path : /home/ldmcp/data
>
> /scour/config-path : /home/ldmcp/etc/scour.conf
>
> /surf-queue/path : /home/ldmcp/queues/pqsurf.pq
>
> /surf-queue/size : 2M
>
> /server/config-path : /home/ldmcp/etc/ldmd.conf
>
> /server/enable-anti-DOS : TRUE
>
> /server/ip-addr : 0.0.0.0
>
> /server/max-clients : 256
>
> /server/max-latency : 3600
>
> /server/port : 388
>
> /server/time-offset : 3600
>
> /queue/path : /ldmcp/data/queues/ldm.pq
>
> /queue/size : 6000M
>
> /queue/slots : default
>
> /pqact/config-path : /home/ldmcp/etc/pqact.conf
>
> /pqact/datadir-path : /home/ldmcp/data/data
>
> [ldmcp@sbn1 ~]$
>
>
>
> Gap entries in various log files, beginning with the new logs started at
> ~1537Z on Monday, June 15:
>
>
> [ldmcp@sbn1 ~/logs]$ grep Gap *log | more
>
> goes.log:Jun 15 15:37:55 sbn1 noaaportIngester[3457]:
> productMaker.c:439:pmStart() Gap in packet sequence: 513870098 to 514102655
> [skipped 232556]
>
> goes.log:Jun 15 15:38:01 sbn1 noaaportIngester[3456]:
> productMaker.c:439:pmStart() Gap in packet sequence: 551431874 to 551639014
> [skipped 207139]
>
> goes.log:Jun 15 15:38:08 sbn1 noaaportIngester[3449]:
> productMaker.c:439:pmStart() Gap in packet sequence: 72489 to 72516
> [skipped 26]
>
> goes.log:Jun 15 15:38:08 sbn1 noaaportIngester[3453]:
> productMaker.c:439:pmStart() Gap in packet sequence: 72489 to 72516
> [skipped 26]
>
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469313177 to
> 1469734315 [skipped 421137]
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469734334 to
> 1469734336 [skipped 1]
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469734343 to
> 1469734345 [skipped 1]
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469734353 to
> 1469734355 [skipped 1]
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469734355 to
> 1469734361 [skipped 5]
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469734362 to
> 1469734367 [skipped 4]
>
> nwstg2.log:Jun 15 15:37:53 sbn1 noaaportIngester[3450]:
> productMaker.c:439:pmStart() Gap in packet sequence: 1469734367 to
> 1469734377 [skipped 9]
>
>
> nwstg.log:Jun 15 15:37:54 sbn1 noaaportIngester[3448]:
> productMaker.c:439:pmStart() Gap in packet sequence: 560043371 to 560207084
> [skipped 163712]
>
>
> polarsat.log:Jun 15 15:37:56 sbn1 noaaportIngester[3452]:
> productMaker.c:439:pmStart() Gap in packet sequence: 51060086 to 51060113
> [skipped 26]
>
> polarsat.log:Jun 15 15:37:58 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175690361 to 175761926
> [skipped 71564]
>
> polarsat.log:Jun 15 15:38:07 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175761926 to 175761990
> [skipped 63]
>
> polarsat.log:Jun 15 15:38:11 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175761990 to 175761992
> [skipped 1]
>
> polarsat.log:Jun 15 15:38:41 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175761992 to 175762207
> [skipped 214]
>
> polarsat.log:Jun 15 15:38:44 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175762207 to 175762269
> [skipped 61]
>
> polarsat.log:Jun 15 15:39:11 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175762269 to 175762624
> [skipped 354]
>
> polarsat.log:Jun 15 15:39:17 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175762624 to 175762626
> [skipped 1]
>
> polarsat.log:Jun 15 15:39:47 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175762626 to 175763154
> [skipped 527]
>
> polarsat.log:Jun 15 15:39:50 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175763154 to 175763349
> [skipped 194]
>
> polarsat.log:Jun 15 15:39:50 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175763349 to 175763351
> [skipped 1]
>
> polarsat.log:Jun 15 15:40:17 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175763351 to 175763945
> [skipped 593]
>
> polarsat.log:Jun 15 15:40:19 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175763945 to 175763947
> [skipped 1]
>
> polarsat.log:Jun 15 15:40:30 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175763947 to 175764006
> [skipped 58]
>
> polarsat.log:Jun 15 15:40:30 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175764006 to 175764037
> [skipped 30]
>
> polarsat.log:Jun 15 15:40:38 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175764037 to 175764251
> [skipped 213]
>
> polarsat.log:Jun 15 15:40:49 sbn1 noaaportIngester[3455]:
> productMaker.c:439:pmStart() Gap in packet sequence: 175764251 to 175764283
> [skipped 31]
>
>
> ... 2200+ lines of polarsat Gap entries not listed and the last few at
> ~2140Z:
>
>
> Jun 15 21:39:05 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177780466 to 177780804 [skipped 337]
>
> Jun 15 21:39:18 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177780804 to 177780838 [skipped 33]
>
> Jun 15 21:39:22 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177780838 to 177780976 [skipped 137]
>
> Jun 15 21:39:31 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177780976 to 177781214 [skipped 237]
>
> Jun 15 21:39:31 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177781214 to 177781260 [skipped 45]
>
> Jun 15 21:39:45 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177781260 to 177781537 [skipped 276]
>
> Jun 15 21:39:45 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177781537 to 177781629 [skipped 91]
>
> Jun 15 21:39:52 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177781630 to 177781632 [skipped 1]
>
> Jun 15 21:40:05 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177781632 to 177782030 [skipped 397]
>
> Jun 15 21:40:09 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177782030 to 177783125 [skipped 1094]
>
> Jun 15 21:40:09 sbn1 noaaportIngester[3455]: productMaker.c:439:pmStart()
> Gap in packet sequence: 177783125 to 177783202 [skipped 76]
>
> [ldmcp@sbn1 ~/logs]$ !! |wc
>
> grep Gap po*log | wc
>
>    *2254*   33810  302849
>
> [ldmcp@sbn1 ~/logs]$
>
>
>
> Have you seen Gaps in packet log entries in the past and if so how did you
> mitigate them and what do you suggest to eliminate these extraneous entries?
>
> --
> *=====================================================================*
>
> *Email seems to be generating increasing inefficiencies in organizations.
> I learned from a manager a Stanford Computer Science professor no longer
> uses email for communication, but uses SNAIL mail, telephone calls, and
> person to person visits.  I'm considering the same.  *
> *Storm Prediction Center*
>
> *120 David L. Boren Blvd, Suite 2330, Norman, OK 73072*
> _______________________________________________
> NOTE: All exchanges posted to Unidata maintained email lists are
> recorded in the Unidata inquiry tracking system and made publicly
> available through the web.  Users who post to any of the lists we
> maintain are reminded to remove any personal information that they
> do not want to be made public.
>
>
> noaaport mailing list
> noaaport@xxxxxxxxxxxxxxxx
> For list information or to unsubscribe, visit:
> https://www.unidata.ucar.edu/mailing_lists/
>
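
The Gap entries quoted above can be tallied per log file to see which channels lose the most packets. The following is a rough sketch, assuming each entry sits on a single line in the actual log files (the wrapping in this message comes from the mail client). The function name gap_summary is hypothetical. It recomputes the skipped count as "to - from - 1", which agrees with the logged values (e.g. 72489 to 72516 gives 26).

```shell
#!/bin/sh
# Summarize "Gap in packet sequence" entries per log file.
# Output columns: file, number of Gap entries, total packets skipped,
# largest single gap. Lines are sorted by file name.
gap_summary() {
    awk '
        match($0, /Gap in packet sequence: [0-9]+ to [0-9]+/) {
            # Matched text ends with "<from> to <to>"; take the two numbers.
            n = split(substr($0, RSTART, RLENGTH), w, " ")
            skipped = w[n] - w[n - 2] - 1
            entries[FILENAME]++
            total[FILENAME] += skipped
            if (skipped > largest[FILENAME]) largest[FILENAME] = skipped
        }
        END {
            for (f in entries)
                printf "%s %d %.0f %d\n", f, entries[f], total[f], largest[f]
        }
    ' "$@" | sort
}

# Example (run from the log directory used in the message):
#   cd ~/logs && gap_summary *.log
```

Separating the handful of very large gaps (hundreds of thousands of packets) from the many small ones may help tell a gap at ingester startup apart from ongoing loss, though the logs alone do not establish the cause.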


-- 
*Dan Vietor*
*Senior Research Meteorologist*
CIRA, Colorado State Univ
Aviation Weather Center
Kansas City, MO
816.584.7211