
[LDM #BQD-554715]: ldm restart



Carol,

> On a related note, I've been having issues keeping a consistent
> connection from amrc.ssec.wisc.edu to aws.ssec.wisc.edu (these servers
> are on the same network). In this case, amrc is the downstream server
> again. The only way I can seem to reset the connection is to restart LDM
> on amrc.
> 
> Here's a link to the log file from amrc.ssec.wisc.edu. I know this is a
> mess, but you'll be able to see where the aws connection is refused.
> ftp://amrc.ssec.wisc.edu/pub/requests/semmerson/ldmd.amrc.log

The connections from the LDM on Amrc to Aws are initially refused because
the LDM on Aws isn't listening on the LDM's well-known port, 388. Instead,
the LDM on Aws listens on ephemeral port 45273. I suspect the LDM program
on Aws (file ~ldm/bin/ldmd) isn't owned by root and setuid.
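
One way to check the ownership and setuid state of that file is sketched below in Python. The path comes from the paragraph above; the helper names `privileges` and `check_binary` are invented for illustration and are not part of the LDM.

```python
import os
import stat


def privileges(uid, mode):
    """Return (owned_by_root, setuid) for a file's owner uid and st_mode."""
    return uid == 0, bool(mode & stat.S_ISUID)


def check_binary(path):
    """Stat `path` and report whether it is owned by root and setuid."""
    st = os.stat(os.path.expanduser(path))
    return privileges(st.st_uid, st.st_mode)


# Example use (hypothetical session on Aws):
#   owned_by_root, setuid = check_binary("~ldm/bin/ldmd")
#   if not (owned_by_root and setuid):
#       print("ldmd probably cannot bind to privileged port 388")
```

If either value comes back False, that would be consistent with the LDM falling back to an ephemeral port.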

> I don't think the 2 servers are connected for the whole duration of this
> log file.

Not so. Downstream LDM process 10168 on Amrc received many products.

I noticed that the REQUESTs from Amrc to Aws have the potential to overlap:

    20170414T120004.095829Z aws.ssec.wisc.edu[10168] NOTE ldm_config_file.c:733:requester_exec() Starting Up(6.13.2): aws.ssec.wisc.edu:388 20170414110004.095766 TS_ENDT {{EXP, "(USAP|ANT)"}}
    ...
    20170414T120004.099277Z aws.ssec.wisc.edu[10175] NOTE ldm_config_file.c:733:requester_exec() Starting Up(6.13.2): aws.ssec.wisc.edu:388 20170414110004.099226 TS_ENDT {{EXP, "ANT.AMRC.syn_chart.*"}}
    ...
    20170414T120004.099387Z aws.ssec.wisc.edu[10176] NOTE ldm_config_file.c:733:requester_exec() Starting Up(6.13.2): aws.ssec.wisc.edu:388 20170414110004.099340 TS_ENDT {{EXP, "SSEC*"}}
    ...
    20170414T120004.099545Z aws.ssec.wisc.edu[10177] NOTE ldm_config_file.c:733:requester_exec() Starting Up(6.13.2): aws.ssec.wisc.edu:388 20170414110004.099499 TS_ENDT {{EXP, "ANT.AMRC.OWL."}}

The pattern in the first entry ("(USAP|ANT)") matches any
product-identifier containing "ANT", so it overlaps the second
("ANT.AMRC.syn_chart.*") and fourth ("ANT.AMRC.OWL."). This can cause
problems with the LDM and might be responsible for the problems you're
seeing. Note the number of duplicate products received by process 10176.
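
The overlap can be illustrated with a short Python sketch that applies the four patterns to a couple of product-identifiers. The identifiers are invented for illustration, and Python's `re` module stands in here for the LDM's own pattern matching, which I'd expect to agree for patterns this simple.

```python
import re

# Patterns copied from the four REQUEST entries in the log above.
PATTERNS = ["(USAP|ANT)", "ANT.AMRC.syn_chart.*", "SSEC*", "ANT.AMRC.OWL."]


def matching_patterns(product_id, patterns=PATTERNS):
    """Return every pattern that matches somewhere in product_id."""
    return [p for p in patterns if re.search(p, product_id)]


# Hypothetical product-identifiers, not taken from the log.
for product_id in ("ANT.AMRC.syn_chart.2017041412", "ANT.AMRC.OWL.20170414"):
    hits = matching_patterns(product_id)
    if len(hits) > 1:
        print(f"{product_id} is requested {len(hits)} times: {hits}")
```

Any identifier matched by more than one pattern would be requested over more than one connection to the same upstream LDM.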

Additionally, the ".*" suffix in the second and fourth patterns adds
nothing and might slow down pattern matching.

Also, the third pattern, as short as it is, risks matching every product
with "SSEC" anywhere in its product-identifier.

You might try using regular expression anchors (i.e., "^" for beginning of
string and "$" for end of string) and/or fleshing-out the patterns to
eliminate the overlaps.
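
As a rough illustration of what an anchor changes, the Python sketch below compares the third pattern with an anchored variant. The anchored pattern and the product-identifiers are hypothetical and would need adjusting to the real identifiers. Note that, as a regular expression, "SSEC*" means "SSE" followed by zero or more "C" characters, so it matches even more than it appears to.

```python
import re

UNANCHORED = "SSEC*"   # third pattern from the log
ANCHORED = "^SSEC"     # hypothetical replacement; adjust to the real identifiers


def matches(pattern, product_id):
    """True if pattern matches anywhere in product_id (search semantics)."""
    return re.search(pattern, product_id) is not None


# Hypothetical product-identifiers, invented for this example.
for product_id in ("SSEC.image.001", "GOES.SSEC.image", "MSSE.data"):
    print(product_id, matches(UNANCHORED, product_id), matches(ANCHORED, product_id))
```

Only the identifier that begins with "SSEC" matches the anchored form, while all three match the unanchored one.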

In general, 1) requests to the same upstream LDM *must* be disjoint; and
2) requests to different upstream LDM-s *must either* be disjoint or
identical.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: BQD-554715
Department: Support LDM
Priority: Normal
Status: Closed
===================