Re: [ldm-users] new and unusual problem

  • To: Patrick Finnegan <vax@xxxxxxxxxx>
  • Subject: Re: [ldm-users] new and unusual problem
  • From: Karen Cooper - NOAA Affiliate <karen.cooper@xxxxxxxx>
  • Date: Tue, 16 Feb 2021 15:20:34 -0600
Yes, the "ping -s 1500" test fails, as you suggested.  I'm not sure why or
where the failure is happening, though.
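
For reference, the size given to "ping -s" is only the ICMP payload, so the packet on the wire is larger than that number. A minimal sketch of the arithmetic, assuming IPv4 with no IP or TCP options (20-byte headers) and the 8-byte ICMP echo header; the function names are made up for illustration:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header
TCP_HEADER = 20   # bytes, assuming no TCP options


def ping_packet_size(payload: int) -> int:
    """Total IPv4 packet size produced by 'ping -s payload'."""
    return payload + ICMP_HEADER + IPV4_HEADER


def max_ping_payload(mtu: int) -> int:
    """Largest 'ping -s' value that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def max_tcp_segment(mtu: int) -> int:
    """Largest TCP payload per packet (the MSS) for a given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


print(ping_packet_size(1500))  # 1528
print(max_ping_payload(1500))  # 1472
print(max_tcp_segment(1500))   # 1460
```

So a 1500-byte ping payload does not fit in a standard 1500-byte MTU; it has to be fragmented, or it is dropped if fragmentation is not allowed. 1472 is the largest payload that fits unfragmented on such a link.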

I tried that ping against several destinations, and it failed to some but
not to others -- which indicates that something along the path is causing
the problem.  I wish I knew how to determine where that problem lies.  Still
investigating, though.
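
One way to put a number on the limit is to search for the largest ping payload that still gets through with fragmentation disabled. The following is a rough sketch, not a tested tool: it assumes the Linux iputils ping (where "-M do" sets Don't Fragment, "-W" is the reply timeout in seconds), and both function names are hypothetical.

```python
import subprocess
from typing import Callable


def ping_fits(host: str, payload: int) -> bool:
    """Send one ping with Don't Fragment set; True if a reply came back.

    Assumes Linux iputils ping: -c 1 (one packet), -W 2 (2 s timeout),
    -M do (forbid fragmentation), -s (payload size in bytes).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", "-M", "do", "-s", str(payload), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def largest_passing_payload(
    probe: Callable[[int], bool], low: int = 0, high: int = 8972
) -> int:
    """Binary-search the largest size in [low, high] for which probe is True.

    Assumes every size up to some limit passes and every larger size fails.
    Returns -1 if even the smallest size fails. The default upper bound of
    8972 is a 9000-byte jumbo MTU minus 28 bytes of IPv4 and ICMP headers.
    """
    if not probe(low):
        return -1
    while low < high:
        mid = (low + high + 1) // 2
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Example use (not run here, since it needs network access):
#   size = largest_passing_payload(
#       lambda n: ping_fits("dontpanic.nssl.noaa.gov", n))
#   print("estimated path MTU:", size + 28)
```

Running the same search against each hop that traceroute reports might narrow down where the limit starts, though routers do not always answer pings, and a packet lost for any other reason looks the same as one that was too large.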

Thanks for the help.

On Tue, Feb 16, 2021 at 11:47 AM Patrick Finnegan <vax@xxxxxxxxxx> wrote:

> This sounds like a link MTU size problem.  Try a "ping -s 1500" between
> the two machines, and see if that works.
>
> It's likely that something is set to do jumbo frames (>1500 byte MTU) and
> something in the middle is limiting that to the standard 1500 byte MTU
> size.  (or something like 1500 byte MTU size on a link that has VLAN tags
> or VPLS headers).
>
> Patrick Finnegan
> Data Center Architect
> Research Computing
> Purdue University
>
>
> On Tue, Feb 16, 2021 at 11:52 AM Karen Cooper - NOAA Affiliate via
> ldm-users <ldm-users@xxxxxxxxxxxxxxxx> wrote:
>
>> tl;dr  -- LDM setup which worked fine last week, now will not
>> send/receive any files larger than 1292 bytes.
>>
>> Full story:
>> We get data via LDM from another system.   This setup/connection was
>> working fine until last week.  As far as we know, no changes were made --
>> but obviously something has changed, because now we can't get data.
>>
>> Both ends are running ldm-6.13.11, which is recent and has been working
>> well (except for pqact issues, which don't apply here).
>>
>> I see connectivity at both ends, and I have restarted and rebuilt the
>> queues on both ends multiple times during troubleshooting.
>>
>> I have enabled traffic both ways, and can ldmping and run notifyme
>> against the other machine's queue(s).
>>
>> Interestingly enough the issue seems to have something to do with
>> filesize.  In my testing I tried using ldmsend to send files to the
>> downstream server.  I have an "accept" line there, and I *AM* able to send
>> files *IF* they are <1293 bytes.   The downstream server receives data
>> from many other servers, and many of the files it receives are larger than
>> 1293 bytes.
>>
>> Interestingly, smaller files do make it through, but they take a very
>> long time.  For instance, a file of 1274 bytes can take more than a
>> minute.
>>
>> When trying to send the larger file, there is nothing in the downstream
>> logs, but the upstream logs show:
>>
>> 20210216T163901.154847Z dontpanic.nssl.noaa.gov(feed)[20925]
>> up6.c:up6_run:445NOTE  Starting Up(6.13.11/6): 20210216162900.110949
>> TS_ENDT {{EXP, "/home/operator/ALAtest"}},
>> SIG=d40ffc815fd74a96c2d7c726dc7012d3, Primary
>> 20210216T163901.154950Z dontpanic.nssl.noaa.gov(feed)[20925]
>> up6.c:up6_run:448NOTE  topo:  dontpanic.nssl.noaa.gov {{EXP, (.*)}}
>> 20210216T164000.271093Z 140.172.25.37[20982]ldmd.c:cleanup:192NOTE
>>  Exiting
>> 20210216T164001.213937Z dontpanic.nssl.noaa.gov(feed)[20925]
>> ldmd.c:cleanup:192NOTE  Exiting
>>
>> I tried setting up a second downstream system, but had the same results.
>>
>> I have also tried using ldmsend to send data, but again, the small files
>> make it through, but larger packets fail.  In verbose mode for ldmsend I
>> see:
>>
>> ldmsend -xxx -h dontpanic.nssl.noaa.gov ALAtestfile7
>> 20210216T164634.300292Z ldmsend[21540]              error.c:err_log:236
>>               INFO  Resolving dontpanic.nssl.noaa.gov to 140.172.25.37
>> took 0.000755 seconds
>> 20210216T164634.329557Z ldmsend[21540]              ldmsend.c:main:437
>>                DEBUG version 6
>> 20210216T164634.359151Z ldmsend[21540]              ldmsend.c:ldmsend:281
>>               INFO  Sending ALAtestfile7, 1293 bytes
>> 20210216T164634.359234Z ldmsend[21540]
>>  LdmProxy.c:my_hereis_6:549          DEBUG Sending file via HEREIS_6
>> 20210216T164734.361874Z ldmsend[21540]
>>  LdmProxy.c:getStatus:68             ERROR NULLPROC_6 failure to host "
>> dontpanic.nssl.noaa.gov": RPC: Unable to receive; errno = Connection reset by peer
>> 20210216T164734.361940Z ldmsend[21540]              ldmsend.c:ldmsend:309
>>               ERROR Couldn't flush connection
>> 20210216T164734.362006Z ldmsend[21540]              ldmsend.c:cleanup:82
>>                ERROR Message-queue isn't empty
>>
>>
>>
>> --
>> *"Outside of a dog, a book is a man's best friend.  Inside of a dog, it's
>> too dark to read."*
>> *--Groucho Marx*
>>
>> -------------------------------------------
>> Karen.Cooper@xxxxxxxx
>>
>> Phone#:  405-325-6456
>> Cell:   405-834-8559
>> National Severe Storms Laboratory
>>
>> _______________________________________________
>> NOTE: All exchanges posted to Unidata maintained email lists are
>> recorded in the Unidata inquiry tracking system and made publicly
>> available through the web.  Users who post to any of the lists we
>> maintain are reminded to remove any personal information that they
>> do not want to be made public.
>>
>>
>> ldm-users mailing list
>> ldm-users@xxxxxxxxxxxxxxxx
>> For list information or to unsubscribe,  visit:
>> https://www.unidata.ucar.edu/mailing_lists/
>>
>
