
20030627: HDS feed to/from seistan (cont.)



>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD

Hi Bob,

re: ULM rerouted their traffic from I2 to "I1"

>I did not know this happened, but it explains why ULM is able to communicate
>with rainbow.al.noaa.gov.

The ULM folks told us that during a total outage at LSU at some point in
the past they fed from thelma.ucar.edu and experienced no problems.  This
predated either your or ULM's upgrade to LDM-6 by quite a bit.

Here is a portion of the original note we received on problems ULM was
having feeding from srcc.lsu.edu:

  "For more than a year, we have been having serious data feed problems
  when our upstream site is at LSU (sirocco).  We have tried everything
  that we can, including contacting LSU repeatedly, but cannot seem to
  resolve the situation satisfactorily.  We have worked extensively with
  our network people and believe that the problem is at LSU.  We are
  basing this conclusion on the fact that, while sirocco was down and we
  were feeding from Unidata's thelma machine, everything was fine.  We
  received all data without significant losses.  However, once sirocco
  came on-line again and we switched over to them, we began to experience
  substantial losses of data.  Our fallback site is OU's stokes machine
  and we have used them in the past, but they are feeding so many sites
  that we tend to fall significantly behind in the data feed.
  
  Can you help us resolve this problem?"

>It would be interesting to also force an I1 connection to LSU and repeat
>the test. 

I agree, running feed tests using a different route to/from LSU would
certainly be welcome.

re: "I1"

>Internet one?

That is what we asked.

>A better question in this case is, what is I2 in the context
>to  the LANET sonnet connecting ULM to LANET?

Here is the route from ULM to seistan.srcc.lsu.edu:

                           Matt's traceroute  [v0.49]
tornado.geos.ulm.edu                                   Fri Jun 27 10:56:14 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. 10.16.0.1                              0%   18   18     1    1    1      1
 2. 10.1.1.1                               0%   18   18     0    0    0      1
 3. 198.232.231.1                          0%   18   18     0    0    0      1
 4. laNoc-ulm.LEARN.la.net                 0%   17   17    13   13   19     76
 5. lsubr-laNoc.LEARN.la.net               0%   17   17    14   14   15     26
 6. howe-e241a-4006-dsw-1.g1.lsu.edu       0%   17   17    18   15   22     50
 7. seistan.srcc.lsu.edu                   0%   17   17    15   14   19     42


This can be compared with LSU's route from seistan to tornado.geos.ulm.edu:


                           Matt's traceroute  [v0.49]
seistan.srcc.lsu.edu                                   Fri Jun 27 10:58:56 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. 130.39.188.1                           0%   11   11     4    1    2      5 
 2. lsubr1-118-6509-dsw-1.g2.lsu.edu       0%   11   11     1    0    1      1 
 3. laNoc-lsubr.LEARN.la.net               0%   11   11     2    1    2      4 
 4. ulm-laNoc.LEARN.la.net                 0%   11   11    14   14   36     91 
 5. 198.232.231.2                          0%   11   11    29   14   41    127 
 6. dynip422.nat.ulm.edu                   0%   11   11    16   15   25     61
 7. tornado.geos.ulm.edu                   0%   10   10    15   14   16     23
Resolver: Received error response 2. (server failure)


>My limited understanding of
>what I2 is, is that  traffic is I2 if it passes through Abilene's system.

I believe that is correct.

>That being the case, unless ULM is passing through Abilenes routers, ULM
>is really on I1 anyway.

Please see the route above.  This, at least, reflects ULM's current
connection to LSU.  UCAR's connection to ULM, however, traverses I2
until Houston where the bridge is made to LEARN.La.Net:

zero.unidata.ucar.edu -> tornado.geos.ulm.edu:

                           Matt's traceroute  [v0.44]
zero.unidata.ucar.edu                                  Fri Jun 27 12:02:58 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. flra-n140.unidata.ucar.edu             0%   71   71     0    0    0     29
 2. gin-n243-80.ucar.edu                   0%   71   71     0    0    0      6
 3. frgp-gw-1.frgp.net                     0%   71   71     1    1    2     25
 4. 198.32.11.105                          0%   71   71     1    1    1      6
 5. kscyng-dnvrng.abilene.ucaid.edu        0%   71   71    12   12   13     26
 6. hstnng-kscyng.abilene.ucaid.edu        0%   71   71    27   27   27     27
 7. laNoc-abileneHou.LEARN.La.Net          0%   71   71    33   32   33     36
 8. ulm-laNoc.LEARN.La.Net                 0%   70   70    45   45   46     71
 9. ???

tornado.geos.ulm.edu -> zero.unidata.ucar.edu

                           Matt's traceroute  [v0.49]
tornado.geos.ulm.edu                                   Fri Jun 27 13:04:05 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. 10.16.0.1                              0%    4    4     1    1    1      1
 2. 10.1.1.1                               0%    4    4     0    0    0      0
 3. 198.232.231.1                          0%    4    4     0    0    0      0
 4. laNoc-ulm.LEARN.la.net                 0%    4    4    13   13   13     13
 5. abileneHou-laNoc.LEARN.la.net          0%    4    4    18   18   25     45
 6. kscyng-hstnng.abilene.ucaid.edu        0%    3    3    34   34   34     34
 7. dnvrng-kscyng.abilene.ucaid.edu        0%    3    3    44   44   44     44
 8. 198.32.11.106                          0%    3    3    44   44   44     45
 9. gin.ucar.edu                           0%    3    3    46   45   45     46
10. flrb.ucar.edu                          0%    3    3    45   45   46     46
11. zero.unidata.ucar.edu                  0%    3    3    56   45   49     56


re: ULM rerouted away from the problematic I2 connection

>LANET indicated this trouble ticket
>has been open for "some time". We do not know what "some time" means in terms
>of  days or months.

It would be useful to know how long that trouble ticket has been open.

>CRC, and retransmission errors are consistent with delays
>in network traffic.

I agree.

re: are the CRC and retransmission errors (trouble ticket at LANET) affecting LSU also

>I  think the communication issue will require resolving before we will
>know.

The really strange part is the asymmetry in the problem.  We are
feeding seistan.srcc.lsu.edu the HDS stream from emo.unidata.ucar.edu
with no latencies, while at the same time we are _unable_ to feed the
data back to a different machine here at the UPC, zero.unidata.ucar.edu
(zero and emo are in the same room on the same subnet).  Perhaps a look
at the route from Unidata to seistan and back again would be
instructive:

zero.unidata.ucar.edu -> seistan.srcc.lsu.edu

                           Matt's traceroute  [v0.44]
zero.unidata.ucar.edu                                  Fri Jun 27 10:16:40 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. flra-n140.unidata.ucar.edu             0%    8    8    10    0    1     10
 2. gin-n243-80.ucar.edu                   0%    8    8     0    0    0      0
 3. frgp-gw-1.frgp.net                     0%    8    8     1    1    1      2
 4. 198.32.11.105                          0%    8    8     1    1    1      1
 5. kscyng-dnvrng.abilene.ucaid.edu        0%    8    8    22   12   13     22
 6. hstnng-kscyng.abilene.ucaid.edu        0%    8    8    27   27   27     27
 7. laNoc-abileneHou.LEARN.La.Net          0%    8    8    33   33   33     33
 8. lsubr-laNoc.LEARN.La.Net               0%    8    8    34   34   34     34
 9. howe-e241a-4006-dsw-1.g2.lsu.edu       0%    8    8    39   35   37     42
10. seistan.srcc.lsu.edu                   0%    7    7    34   34   34     35


seistan.srcc.lsu.edu -> zero.unidata.ucar.edu

                           Matt's traceroute  [v0.49]
seistan.srcc.lsu.edu                                   Fri Jun 27 11:15:53 2003
Keys:  D - Display mode    R - Restart statistics    Q - Quit
                                           Packets               Pings
Hostname                                %Loss  Rcv  Snt  Last Best  Avg  Worst
 1. 130.39.188.1                           0%   14   14     1    1    3     16
 2. lsubr1-118-6509-dsw-1.g2.lsu.edu       0%   14   14     0    0    1      6
 3. laNoc-lsubr.LEARN.la.net               0%   14   14     2    1    2      5
 4. abileneHou-laNoc.LEARN.la.net          0%   14   14     8    7   16     46
 5. kscyng-hstnng.abilene.ucaid.edu        0%   14   14    23   22   22     23
 6. dnvrng-kscyng.abilene.ucaid.edu        0%   14   14    33   33   36     71
 7. 198.32.11.106                          0%   14   14    34   33   36     59
 8. gin.ucar.edu                           0%   14   14    35   34   35     45
 9. flrb.ucar.edu                          0%   14   14    34   34   35     45
10. zero.unidata.ucar.edu                  0%   13   13    34   34   36     57


The major difference in routes that I notice is that the route from
zero to seistan goes through howe-e241a-4006-dsw-1.g2.lsu.edu, but the
route from seistan to zero goes through lsubr1-118-6509-dsw-1.g2.lsu.edu.

Perhaps this is a big clue that we are overlooking?  Could it be
that there is something amiss on the howe-e241a-4006-dsw-1.g2.lsu.edu
gateway/router?
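
For anyone who wants to repeat this comparison on other routes, here is
a minimal Python sketch of the idea.  It is not something we ran as part
of these tests, and the helper names (parse_hops, hops_in_domain) are
made up for illustration.  It assumes the hop lines are pasted in from
mtr listings like the two above, and it simply lists which lsu.edu
devices show up as intermediate hops in each direction:

```python
import re

# A hop line in the mtr listing starts with a hop number and a period,
# followed by the hostname (or IP address) of the responding device.
HOP_RE = re.compile(r"^\s*(\d+)\.\s+(\S+)")

def parse_hops(mtr_text):
    """Return the hop hostnames, in order, from mtr-style listing text."""
    hops = []
    for line in mtr_text.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append(match.group(2))
    return hops

def hops_in_domain(hops, domain, exclude=()):
    """Return hops whose name ends in `domain`, skipping names in `exclude`."""
    return [h for h in hops if h.lower().endswith(domain) and h not in exclude]

# Hop lines copied from the zero -> seistan listing above.
ZERO_TO_SEISTAN = """\
 1. flra-n140.unidata.ucar.edu             0%    8    8    10    0    1     10
 2. gin-n243-80.ucar.edu                   0%    8    8     0    0    0      0
 3. frgp-gw-1.frgp.net                     0%    8    8     1    1    1      2
 4. 198.32.11.105                          0%    8    8     1    1    1      1
 5. kscyng-dnvrng.abilene.ucaid.edu        0%    8    8    22   12   13     22
 6. hstnng-kscyng.abilene.ucaid.edu        0%    8    8    27   27   27     27
 7. laNoc-abileneHou.LEARN.La.Net          0%    8    8    33   33   33     33
 8. lsubr-laNoc.LEARN.La.Net               0%    8    8    34   34   34     34
 9. howe-e241a-4006-dsw-1.g2.lsu.edu       0%    8    8    39   35   37     42
10. seistan.srcc.lsu.edu                   0%    7    7    34   34   34     35
"""

# Hop lines copied from the seistan -> zero listing above.
SEISTAN_TO_ZERO = """\
 1. 130.39.188.1                           0%   14   14     1    1    3     16
 2. lsubr1-118-6509-dsw-1.g2.lsu.edu       0%   14   14     0    0    1      6
 3. laNoc-lsubr.LEARN.la.net               0%   14   14     2    1    2      5
 4. abileneHou-laNoc.LEARN.la.net          0%   14   14     8    7   16     46
 5. kscyng-hstnng.abilene.ucaid.edu        0%   14   14    23   22   22     23
 6. dnvrng-kscyng.abilene.ucaid.edu        0%   14   14    33   33   36     71
 7. 198.32.11.106                          0%   14   14    34   33   36     59
 8. gin.ucar.edu                           0%   14   14    35   34   35     45
 9. flrb.ucar.edu                          0%   14   14    34   34   35     45
10. zero.unidata.ucar.edu                  0%   13   13    34   34   36     57
"""

# The endpoints themselves are not intermediate devices, so leave them out.
ENDPOINTS = ("seistan.srcc.lsu.edu", "zero.unidata.ucar.edu")

forward_lsu = hops_in_domain(parse_hops(ZERO_TO_SEISTAN), ".lsu.edu", ENDPOINTS)
reverse_lsu = hops_in_domain(parse_hops(SEISTAN_TO_ZERO), ".lsu.edu", ENDPOINTS)

print("LSU devices, zero -> seistan:", forward_lsu)
print("LSU devices, seistan -> zero:", reverse_lsu)
print("Seen in one direction only:", sorted(set(forward_lsu) ^ set(reverse_lsu)))
```

Note that this compares hostnames only.  A router usually answers with
a different interface name in each direction (compare lsubr-laNoc and
laNoc-lsubr above), and hops that report only an IP address (such as
130.39.188.1) are not matched by domain, so a name that differs between
directions is a hint about where to look, not proof of a different
physical path.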

re: What did the telecomm folks have to say about the asymmetry seen moving
data to/from srcc.lsu.edu from zero.unidata.ucar.edu?

>The issue of asymmetry was not the paramount issue with telecom. Again, the
>telecom guys want to wait and see  the communications issues are fixed, as
>they believe the errors in the circuit are causing the problems between LSU
>and ULM.  

The problem is not _just_ between LSU and ULM.  We (zero.unidata.ucar.edu)
are seeing the exact same problem that ULM was seeing when trying to
feed HDS from seistan.srcc.lsu.edu.  Moreover, we saw the exact same
problem during our test of feeding the HDS stream from
seistan.srcc.lsu.edu to the University of South Florida machine,
metlab.cas.usf.edu.  The problem most likely exists between seistan
and Jackson State, but we can't verify this because they are not reporting
stats AND we do not have current contact information for them.

If the LSU telecomm folks are under the impression that the only
problem is between LSU and ULM, then they need to be contacted and made
aware of the problems going to such diverse sites as UCAR and USF.

Tom