
[Support #HDQ-517625]: Assistance requested for "Gap in packet sequence" log entries from noaaportIngester



Hi Gregg,

re:
> Some more information.

Information is good!  :-)

re:
> First attached are noaaport log files for ~ the
> last 24 hours, with over 7000 Gap errors, most in polarsat, but 60 Gap
> errors in other log files.   *NOTE: NOAA email blocked the email going out
> because of the attachment.  Do you have a place I can ftp this file?*

I received the log tar file you included as an attachment to the email you
sent to me and Steve and CCed to Unidata Support.  Here is a listing of
that file's contents; please verify that this is the file you are talking
about:

-rw-r--r-- 1 mcidas Unidata    23772 Jun 18 10:23 polarsat.log.gz
-rw-r--r-- 1 mcidas Unidata   146075 Jun 18 10:23 goes.log.gz
-rw-r--r-- 1 mcidas Unidata   529866 Jun 18 10:23 nwstg2.log.gz
-rw-r--r-- 1 mcidas Unidata  1905601 Jun 18 10:23 nwstg.log.gz
-rw-r--r-- 1 mcidas Unidata   233656 Jun 18 10:23 oconus.log.gz
-rw-r--r-- 1 mcidas Unidata     3739 Jun 18 08:22 ldmd.log.gz
-rw-r--r-- 1 mcidas Unidata  3120885 Jun 18 08:22 nwstg2.log.1.gz
-rw-r--r-- 1 mcidas Unidata 14627476 Jun 18 08:22 nwstg.log.1.gz
-rw-r--r-- 1 mcidas Unidata  1960282 Jun 18 08:22 oconus.log.1.gz
-rw-r--r-- 1 mcidas Unidata   100296 Jun 18 08:22 polarsat.log.1.gz
-rw-r--r-- 1 mcidas Unidata  1047350 Jun 18 08:21 goes.log.1.gz
-rw-r--r-- 1 mcidas Unidata      742 Jun 17 14:35 ldmd.log.1.gz
-rw-r--r-- 1 mcidas Unidata       31 May 28 12:35 nother.log.gz

re:
> My coworker increased the satellite signal strength for all of the Novra's
> and now the output for the Novra in question is below, and after this
> change the Gap errors continued (i.e. the *.log files are those after the
> signal was tweaked and the log files rotated).
> 
> CMCS 10.0.5.10> show tuner
> 
> Satellite Interface Settings:
> 
> Receiver MAC Address: 00-06-76-05-00-d8
> Receiver Mode: DVBS2
> Frequency: 1110.0 MHz
> Symbol Rate: 30.000 Msps
> ModCod: 2/3 16PSK
> Gold code: 0
> Input Stream Filter: On
> Input Stream ID: 18
> Signal Lock: On
> Data Lock: On
> Uncorrectable Rate: 0/Second
> Packet Error Rate: 0.0000e+00
> Carrier to Noise C/N: 15.1dB
> Signal Strength: -34 dBm

Good, a signal strength of -34 dBm is right in what we found to be
the sweet spot for our setup.

Note, however, that your C/N did not change much if any.  We
observed the same lack of change in our setup here in Boulder.

Important aside:

We learned from comments made by community users and our own
observations that aligning the dish should be done to maximize the
C/N (in the negative sense, of course; -18 dBm is much better than
-15 dBm), not to maximize the signal strength.

re:
> You mentioned the way Unidata invokes noaaportIngester is different and
> listed what you ldmd.conf file has.

Correct.

re:
> Is it correct the only differences are the arguments?

The differences are the arguments, and this, in turn, indicates using
the newer logging facility, AND it reflects that we set up multicast
routing so that 'noaaportIngester' can read from the system default
multicast address.  This may or may not have any effect on the quality
of data ingest, but I wanted to let you know how we are doing things.

re:
> when I do show PIDS I get the following:
> 
> CMCS 10.0.5.10> show pids
> 
> MPE PIDs being processed: 101 102 103 104 105
> 106 107 108 151
> 
> PIDs being forwarded raw:
> 
> CMCS 10.0.5.10>


This is what we have also:

CMCS 192.168.1.20> show pids

        MPE PIDs being processed:       101     102     103     104     105
                                        106     107     108     150     151

        PIDs being forwarded raw:

re:
> Another coworker said the sysctl settings are similar to the ones suggested.

The biggest difference is in the DVBS multicast fragment reassembly.  We
use a value of '0' (zero), not 096, 8192, etc.  We do this because our
Novra S300N is directly connected to the Ethernet interface that we are
using for data ingestion.  This interface is also private, and the only
traffic on it is from the Novra S300N.

re:
> I'm curious about your comment, below, in regards to the additional log
> files I've sent?
> 
> > I note this because the number of missed frames in the 'Gap' messages you
> > included in your first email did _NOT_ match this pattern.  The 'Gap'
> > messages that you reported showed large numbers of missed frames, and this
> > is a telltale sign of a noisy feed or malfunctioning S300N receiver.

The Gap messages being logged on the three NOAA/GSL systems that we have
been able to see information from (the extracted Gap messages from the
ingest log files are sent to us each night; we do not have logon access)
showed a _very_ interesting pattern of number of missed frames per Gap
message.  My immediate guess, and this was confirmed by experimentation,
was that this pattern was a result of something in the data path to the
ingest machines.  The pattern I am referring to is a predominance of
exactly 1 missed frame for each Gap message.  The Gap messages you included
in your original email were showing a LOT of missed frames for each
Gap message, and this does NOT match the pattern that I was/am referring
to, so my conclusion was/is that your situation is different from the one
found in NOAA/GSL.
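
The comparison described above (mostly 1 missed frame per Gap message
versus many missed frames per Gap message) can be checked against a log
file with a short script.  What follows is only a sketch: the regular
expression assumes that each Gap line looks like
"Gap in packet sequence: 100 to 102 [skipped 1]", which should be
verified against your own log files, and the function names are made up
for illustration.

```python
import gzip
import re
from collections import Counter

# ASSUMPTION: Gap lines end in "[skipped N]"; adjust to match your logs.
GAP_RE = re.compile(r"Gap in packet sequence: (\d+) to (\d+) \[skipped (\d+)\]")


def skipped_histogram(lines):
    """Count how many Gap messages reported each number of skipped frames."""
    counts = Counter()
    for line in lines:
        match = GAP_RE.search(line)
        if match:
            counts[int(match.group(3))] += 1
    return counts


def single_frame_fraction(counts):
    """Fraction of Gap messages that reported exactly one skipped frame."""
    total = sum(counts.values())
    return counts[1] / total if total else 0.0


def histogram_for_file(path):
    """Build the histogram for a plain or gzip-compressed log file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as handle:
        return skipped_histogram(handle)
```

Calling histogram_for_file("polarsat.log.gz") and then
single_frame_fraction() on the result gives one number per log file.  A
fraction close to 1 would resemble the NOAA/GSL pattern; a small
fraction with large skip counts would resemble the messages in your
first email.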

re:
> The information about VMs and NOAA/GSL data path problem are interesting.

When we were working with Raytheon, we were very surprised that the
RedHat hypervisor would be mucking up the UDP stream from (one of) their
Novra S300N.  A number of workarounds were tried, but the eventual 
solution was for Raytheon to get RedHat to fix their (RedHat's) problem.

re:
> SPC doesn't use VMs for the ingest and since the server is connected to the
> NOVRA with a short ethernet cable

Excellent.  If in addition you are using a separate Ethernet interface
for your NOAAPort ingestion and the output from your Novra S300N is
the only traffic seen on the interface, it should/will mean that you
can use the same DVBS multicast fragment reassembly setting that we
are using:

net.ipv4.ipfrag_max_dist = 0
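
To confirm what a machine is actually running with, the value can be
read back from the kernel.  This is a minimal sketch that assumes a
Linux system exposing the setting at /proc/sys/net/ipv4/ipfrag_max_dist;
the function names are illustrative only.

```python
from pathlib import Path

# Location where Linux exposes the net.ipv4.ipfrag_max_dist sysctl.
PROC_PATH = Path("/proc/sys/net/ipv4/ipfrag_max_dist")


def read_ipfrag_max_dist(path=PROC_PATH):
    """Return the current ipfrag_max_dist value as an integer."""
    return int(Path(path).read_text().strip())


def matches_recommended(value, recommended=0):
    """True when the value equals the setting quoted above (0)."""
    return value == recommended


if __name__ == "__main__":
    if PROC_PATH.exists():
        current = read_ipfrag_max_dist()
        state = "matches" if matches_recommended(current) else "differs from"
        print(f"net.ipv4.ipfrag_max_dist = {current} ({state} 0)")
    else:
        print(f"{PROC_PATH} not found; is this a Linux host?")
```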

We also turn off IPv6 on our Ethernet interface, and we thought that
was important/needed, but experimentation with two of three machines
at NOAA/GSL showed that this was not the case (the third machine,
cpsbn1-a2d7 is not yet benefiting from a data path that is not introducing
errors; it is also not configured to receive all channels (PIDs) from
the Novra that it is getting its feed from, AND it is the same Novra
S300N that is feeding all three machines).

re:
> the NOAA/GSL observations aren't
> applicable while the server is in the satellite farm.  Correct?

I do not know what "in the satellite farm" might entail.  I can
say that your setup appears to be much different than NOAA/GSL's.

re:
> What would you recommend for next attempts?

Since the data path from the Novra S300N to your machine is direct
(i.e., not going through routers or switches), I would do the following:

1) replace the Ethernet cable that connects your Novra S300N to your
   machine

2) check the output of 'ifconfig' to see if there are errors being
   reported on the Ethernet interface that your Novra S300N is 
   connected to

   I would like to see the results of 'ifconfig -a' on your machine!

3) I think you said that the signal from your NOAAPort dish is split

   If this is the case, try moving the connection from the splitter to
   the Novra S300N you are using to a different port on the splitter.

4) re-aligning the satellite dish may be in order

   Then again, doing a re-alignment may not be in order especially if
   there are other Novra S300Ns being fed from the same dish, and the
   NOAAPort ingest systems that they are feeding are not showing high
   numbers of Gap messages.
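
For step 2, the same receive-error counters that 'ifconfig' reports can
also be read directly from sysfs, which makes it easy to record them
before and after a cable swap.  This is a sketch only: it assumes a
Linux host, and the interface name "eth1" is a placeholder for whichever
interface your Novra S300N is connected to.

```python
from pathlib import Path

# Receive-side counters that Linux exposes per interface under sysfs.
RX_COUNTERS = (
    "rx_errors", "rx_dropped", "rx_crc_errors",
    "rx_frame_errors", "rx_fifo_errors", "rx_missed_errors",
)


def read_rx_counters(iface, base="/sys/class/net"):
    """Return {counter_name: value} for the counters that exist."""
    stats_dir = Path(base) / iface / "statistics"
    counters = {}
    for name in RX_COUNTERS:
        counter_file = stats_dir / name
        if counter_file.is_file():
            counters[name] = int(counter_file.read_text().strip())
    return counters


def nonzero(counters):
    """Keep only the counters that are not zero."""
    return {name: value for name, value in counters.items() if value != 0}


if __name__ == "__main__":
    import sys
    # "eth1" is a hypothetical default; pass the real interface name.
    iface = sys.argv[1] if len(sys.argv) > 1 else "eth1"
    for name, value in sorted(read_rx_counters(iface).items()):
        print(f"{iface} {name} = {value}")
```

Counters that keep increasing between two runs while Gap messages are
being logged would point at the cable or the interface rather than at
the downlink.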

It is hard for me/us to recommend what to do next since we do not
fully understand your downlink setup.  The more information we get
on your setup, the more probable it is that we can recommend something
to try.

re:
> What would be a good way to determine if there is a "noisy feed or
> malfunctioning S300N receiver"?

I strongly recommend that you implement our version of Stonie Cooper's
'novramonitor' monitoring routine, and set it up to log as frequently
as possible.  Running 'cmcs' every so often to monitor how well your
Novra S300N is doing really won't work: as sampling theory tells us,
infrequent checks will miss short-lived events.
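
To illustrate what frequent logging of the receiver state involves, the
sketch below turns captured 'show tuner' text (in the form quoted
earlier in this message) into fields that can be written to a log at a
fixed interval.  It is NOT 'novramonitor' and does not talk to the
receiver; how the text is captured is left out on purpose, and the
function names are made up for illustration.

```python
import re

# First number in a value such as "15.1dB", "-34 dBm" or "0.0000e+00".
NUMBER_RE = re.compile(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?")


def parse_tuner(text):
    """Parse 'Name: value' lines from 'show tuner' output into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()  # tolerate email quote prefixes
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


def numeric(value):
    """Return the first number found in a field value, or None."""
    match = NUMBER_RE.search(value)
    return float(match.group(0)) if match else None


def summary(fields):
    """Pick out the fields discussed in this thread."""
    wanted = ("Signal Lock", "Data Lock", "Uncorrectable Rate",
              "Packet Error Rate", "Carrier to Noise C/N", "Signal Strength")
    return {name: fields[name] for name in wanted if name in fields}
```

Logging summary(parse_tuner(text)) with a timestamp every few seconds
would show whether drops in C/N or bursts of uncorrectable packets line
up in time with the Gap messages.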

re:
> Do you suggest we try the BETA version of LDM?

You can, but, like I said in my email yesterday evening, I don't
believe that will have any effect.

re:
> I'm open for the Google Meet.

I'm ready whenever you want.  My objective would be to learn as much
as is possible about your setup so that I have a better mental picture
of how things work/should work.  One of the NOAA/GSL guys and I did
a LOT of Meets during our troubleshooting of his setup.

One last comment:

I just removed the CCs to Steve and my UCAR email addresses.  We are
both monitoring all transactions we are having in our inquiry tracking
system, so CCing us in addition is not adding anything.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: HDQ-517625
Department: Support NOAAPORT
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.