[TIGGE #LGY-600646]: Re: Missing fields from CMA



Hi YangXin,

re: separate the ingest of data from upstream sites (i.e., ECMWF and NCAR)
from the sending of data to downstream sites (currently ECMWF and NCAR)

> This is our initial design that was based on the LDM runtime structure
> which is described within the LDM Documents.

Yes.  We like your original design.

> Although the computer systems have been available to implement the
> separation of the workload since their construction on January 28th, we
> still use only one node to do the TIGGE Data exchange.

Yes, we are aware of this.

> One reason is, we'd been kept on testing to find out the exact cause of
> the network problem (port 388 vs port 8080), so the implementation has
> been delayed.

OK.

> Now that we can use port 8080 to exchange data in a relative stable manner,

I hope your statement doesn't imply that you want to continue to use the
"hack" that we implemented to show that port 388 traffic is being 'packet
shaped' (artificially rate limited).  We _strongly_ recommend/request that a
concerted effort be made to find the cause of the packet shaping on port 388
and have it removed!

> therefore, it's probably the time for us to do it.

Moving on to the use of multiple LDMs to improve the data exchange among CMA,
ECMWF, and NCAR is a good move.  Finding the source of packet shaping on port
388 and removing it would be a _very_ good move.

> The other reason is, technically, I am still not very sure about how to
> implement it, do we need to consider the balance program that is running
> on each site?

I believe that balance was installed to run at your site simply to demonstrate
that port 388 traffic was being packet shaped.  Having to run balance is not
how the LDM should work.

> Moreover, ECMWF's way to do the "offload data processing" seems not exactly
> the same way as the "LDM cluster" which is running at NCAR. Then, based on
> our current system configuration illustrated in my PPT file, I think about
> a way to do it in CMA with a little bit variation to ECMWF, as illustrated
> in a new attached PPT file.

The slide in your latest PPT file shows 'send' traffic from two cluster nodes
going through the LDM on the LDM Director.  Does this imply that your design
is to have nodes B and C send data to A (the Director) which, in turn, sends
the data to ECMWF and/or NCAR?

> Since at CMA, we can receive nearly 100GBytes/day, while the volume of
> sending data is only 28GBytes/day. If we adopt ECMWF's way, Node A for
> "relay", B for "outgoing", C for "incoming", the load for B and C is not
> very balanced. So, is it possible to perform "receiving" and "sending" for
> both B and C?

Yes, BUT putting both send and receive traffic through the same node shortens
the residency time for products in the queue.  Separating the sending and
receiving functions onto different machines is a way of increasing the
residency time of products in the LDM queue so that duplicate product
detection and rejection works as designed.
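
As a rough illustration of the residency argument above, the sketch below
estimates how long a product stays in a fixed-size queue that is filled at a
steady rate.  The 100 and 28 GBytes/day figures are the ones quoted above; the
8 GByte queue size and the function name are assumptions chosen only for the
example.  The model (residency = queue size / insertion rate, with data to be
sent also passing through the same queue) is a simplification, not a
description of LDM internals.

```python
def residency_hours(queue_gbytes, inserted_gbytes_per_day):
    """Approximate hours a product stays in a full, fixed-size queue.

    Simplified model: the oldest product is overwritten once a queue's
    worth of newer data has been inserted.
    """
    return queue_gbytes / inserted_gbytes_per_day * 24.0


QUEUE_GBYTES = 8.0   # hypothetical queue size, not a CMA setting
RECEIVED = 100.0     # GBytes/day received at CMA (from the message)
SENT = 28.0          # GBytes/day sent from CMA (from the message)

receive_only = residency_hours(QUEUE_GBYTES, RECEIVED)
combined = residency_hours(QUEUE_GBYTES, RECEIVED + SENT)

print(f"receive only     : {receive_only:.2f} h")
print(f"send and receive : {combined:.2f} h")
```

Under these assumptions the residency time drops from about 1.92 hours to
1.50 hours when both streams share one queue.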

> In this case, does the node A need a big RAM configuration?

The node doing the data reception would need to have the largest LDM queue.
Again, this is so that duplicate products are rejected when they are received.
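
To show why queue size matters for duplicate rejection, here is a toy model
of a byte-bounded queue that rejects a product only while a copy with the same
signature is still held.  It is a sketch, not LDM code: the class, the
function, the product sizes, and the two capacities are all hypothetical, and
the signature is simply an MD5 digest of the product bytes.

```python
import hashlib
from collections import OrderedDict


class ToyQueue:
    """Byte-bounded FIFO that rejects products whose signature is still held."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.products = OrderedDict()  # signature -> size in bytes

    def insert(self, data):
        """Return True if the product was accepted, False if a duplicate."""
        signature = hashlib.md5(data).hexdigest()
        if signature in self.products:
            return False
        # Evict the oldest products until the new one fits.
        while self.products and self.used_bytes + len(data) > self.capacity_bytes:
            _, size = self.products.popitem(last=False)
            self.used_bytes -= size
        self.products[signature] = len(data)
        self.used_bytes += len(data)
        return True


def count_accepted_duplicates(capacity_bytes):
    """Insert one product, five others, then the first one a second time."""
    queue = ToyQueue(capacity_bytes)
    first = b"product-0" * 100  # 900 bytes
    queue.insert(first)
    for i in range(1, 6):
        queue.insert(f"product-{i}".encode() * 100)
    # The same product arriving again, e.g. from a second upstream feed.
    return 1 if queue.insert(first) else 0


print("small queue:", count_accepted_duplicates(3000))
print("large queue:", count_accepted_duplicates(10000))
```

With the small queue the first product has already been evicted when its copy
arrives, so the duplicate is accepted; with the large queue it is rejected.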

> Are there any LDM instance (processes) running in Node A?

I guess I don't know enough about the setup shown in your second slide to
comment yet.  I will get together with others here at Unidata tomorrow to
discuss your question and see if we can answer appropriately.

re: running uptime.tcl
> I would like to have this 'uptime' script run to monitor our machines.

OK.

> May I know is there any instructions that I can follow to install and use
> it? Such as an URL.

Unfortunately, no.  This tool is something we cooked up in-house to use for
ourselves.  It is not documented (other than in my head), so there is no URL
I can point you to.

> If yes and I can do it myself step by step, I might have a better
> understanding of the tools and the better using it.

We would be happy to install it for you and then explain what it does.  Again,
it simply runs several other utilities, captures relevant output from those
utilities, formats the information, and writes it to a disk file.  It is run
once-per-minute out of cron, so it should add little overhead to your machine.
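
The general pattern described above (run utilities, capture output, format,
append to a file) can be sketched as follows.  This is not the uptime.tcl
script itself, which is written in Tcl and is undocumented; the choice of
utilities, the record format, the log file name, and every function name here
are assumptions made for illustration.

```python
import os
import subprocess
import tempfile
from datetime import datetime, timezone

# Hypothetical choice of utilities; the real script's list is not documented.
COMMANDS = {
    "uptime": ["uptime"],
    "disk": ["df", "-k", "/"],
}
# Hypothetical output file in the system temporary directory.
LOG_PATH = os.path.join(tempfile.gettempdir(), "uptime_sketch.log")


def run_utility(argv):
    """Run one utility and return its last output line, or 'unavailable'."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.SubprocessError):
        return "unavailable"
    lines = result.stdout.strip().splitlines()
    return lines[-1].strip() if lines else "unavailable"


def format_record(timestamp, outputs):
    """Format one sample as a single line: timestamp, then name=value pairs."""
    fields = [f"{name}={value!r}" for name, value in sorted(outputs.items())]
    return " ".join([timestamp] + fields)


def main():
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d.%H%M")
    outputs = {name: run_utility(argv) for name, argv in COMMANDS.items()}
    # Append one line per run so the file grows by one record per minute.
    with open(LOG_PATH, "a") as log:
        log.write(format_record(timestamp, outputs) + "\n")


if __name__ == "__main__":
    main()
```

A crontab entry of the form `* * * * * python3 /path/to/uptime_sketch.py`
(the path is a placeholder) would run such a script once per minute.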

> Thanks.

No worries.  I will send back a better reply tomorrow after several of us have
read your email and discussed what we believe to be your proposed setup.

Cheers,

Tom
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: LGY-600646
Department: Support IDD TIGGE
Priority: Normal
Status: Closed