Re: [thredds] improved performances through GPFS

Hi John, Robert,

Thanks very much for your replies.

From what I understand now, TDS would benefit from GPFS when processing requests from several users at once (different threads reading files in parallel), provided GPFS is configured to work like Hadoop (using IBM's FPO). For a single user request, performance would be roughly equivalent (if TDS reads the files sequentially).
We can test this. We'll let you know.

In addition, we'll investigate reading a long list of netcdf files in parallel in different threads (5, 10, more?) and see what we gain. We can do this in a standalone benchmark or in one of our application servers (oceanotron). We'll let you know about this as well.
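
A minimal sketch of what such a standalone benchmark could look like. It assumes plain byte reads through the standard Java file API as a stand-in for opening the files with the netcdf library, and the class and method names (ParallelReadBenchmark, readAll, timeSeconds) are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelReadBenchmark {

    /** Streams one file from start to end and returns the number of bytes read. */
    public static long readFully(Path file) throws IOException {
        byte[] buffer = new byte[1 << 20];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    /** Reads all files using a fixed pool of worker threads; returns total bytes read. */
    public static long readAll(List<Path> files, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (Path file : files) {
                // Assumption: a raw byte read stands in for a real netcdf read.
                Callable<Long> task = () -> readFully(file);
                pending.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : pending) {
                total += result.get();
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException("parallel read failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Wall-clock seconds needed to read all files with the given thread count. */
    public static double timeSeconds(List<Path> files, int threads) {
        long start = System.nanoTime();
        readAll(files, threads);
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        for (int threads : new int[] {1, 5, 10}) {
            System.out.printf("%d threads: %.3f s%n", threads, timeSeconds(files, threads));
        }
    }
}
```

The file paths are passed on the command line. Note that the operating system may cache file contents after the first pass, so later passes can look faster for reasons unrelated to the thread count; using a different set of files per pass is one way to limit that effect.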

Thomas





On 03/01/2016 06:36 PM, Robert Casey wrote:

Hi Thomas and John-

From what I have been able to gather, GPFS is a parallel cluster file system that behaves according to POSIX standards and looks to an OS just like any other file mount. You should be able to use all of the same file I/O commands you already use. I am not aware of any specialized enhancements. All of the I/O libraries for GPFS appear to be very low level. It's optimized for fast parallel reads and writes and parallelizes the metadata servers to each disk node as well, which is much more capable than even Parallel NFS. Based on this article, it looks like a good alternative to HDFS:

http://www.datanami.com/2014/02/18/what_can_gpfs_on_hadoop_do_for_you_/

As they suggest, you can get Hadoop-like behavior on GPFS by using IBM's File Placement Optimization (FPO), mapping compute cycles to each of the data nodes in parallel.

-Rob

On Mar 1, 2016, at 8:57 AM, John Caron <jcaron1129@xxxxxxxxx> wrote:

Hi Thomas:

TDS uses standard Java interfaces to the filesystem, so it wouldn't take advantage of anything that needs special commands. Both the netcdf library and TDS are thread-safe, so they can scale up to a large number of simultaneous requests, and it seems likely that a clustered Tomcat environment would work well.

Perhaps significant improvements are possible by distributing data correctly over the data nodes. So much depends on access patterns that a good way to proceed would be to create a synthetic load (e.g. script a bunch of requests to the TDS) that mimics what you expect users to need, and measure performance as you modify your system.
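
As a rough illustration of scripting such a load, here is a sketch that assumes Java 11 or later (for java.net.http.HttpClient) and a list of request URLs you supply yourself. The class name SyntheticLoad, its Sample type and the concurrency of 10 are hypothetical choices, not anything provided by TDS:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SyntheticLoad {

    /** Outcome of one request: HTTP status code and elapsed milliseconds. */
    static final class Sample {
        public final int status;
        public final double millis;

        Sample(int status, double millis) {
            this.status = status;
            this.millis = millis;
        }
    }

    /** Sends every request, at most `concurrency` at a time, and times each one. */
    public static List<Sample> run(List<URI> requests, int concurrency) {
        HttpClient client = HttpClient.newHttpClient();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Future<Sample>> pending = new ArrayList<>();
            for (URI uri : requests) {
                Callable<Sample> task = () -> {
                    HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
                    long start = System.nanoTime();
                    // The response body is discarded rather than parsed.
                    HttpResponse<Void> response =
                            client.send(request, HttpResponse.BodyHandlers.discarding());
                    double millis = (System.nanoTime() - start) / 1e6;
                    return new Sample(response.statusCode(), millis);
                };
                pending.add(pool.submit(task));
            }
            List<Sample> samples = new ArrayList<>();
            for (Future<Sample> result : pending) {
                samples.add(result.get());
            }
            return samples;
        } catch (Exception e) {
            throw new RuntimeException("load run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Each command-line argument is one request URL (assumed to point at your server).
        List<URI> requests = new ArrayList<>();
        for (String arg : args) {
            requests.add(URI.create(arg));
        }
        List<Sample> samples = run(requests, 10);
        double slowest = 0;
        for (Sample s : samples) {
            slowest = Math.max(slowest, s.millis);
        }
        System.out.printf("%d requests, slowest %.1f ms%n", samples.size(), slowest);
    }
}
```

Running the same request list before and after each change to the system gives comparable numbers, as long as the list reflects the mix of datasets and subsets that users are expected to ask for.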

I don't know enough about GPFS to know what features could be used to go beyond what you get from the POSIX API. Anyone else?

John

On Thu, Feb 25, 2016 at 2:27 AM, Thomas LOUBRIEU <thomas.loubrieu@xxxxxxxxxx> wrote:

    Dear all,

    In our data center, the new high-performance clustered file
    system we're going to use is GPFS (General Parallel File System).
    I am wondering whether the java-netcdf library or the thredds data
    server can benefit from this high-performance file system if the
    netcdf files are stored on it?

    Are you aware of work being done or systems working with GPFS or
    with similar high-performance systems (HDFS, moosefs, ...)? I am
    definitely not an expert, and any information regarding reading
    netcdf in Java on these clustered file systems (preferably GPFS)
    would help us very much.

    Thanks,

    Thomas

    _______________________________________________
    thredds mailing list
    thredds@xxxxxxxxxxxxxxxx <mailto:thredds@xxxxxxxxxxxxxxxx>
    For list information or to unsubscribe,  visit:
http://www.unidata.ucar.edu/mailing_lists/

