Re: [thredds] ncml aggregation 200 times slower in thredds 4.6.1

  • To: John Caron <caron@xxxxxxxx>
  • Subject: Re: [thredds] ncml aggregation 200 times slower in thredds 4.6.1
  • From: Heiko Klein <Heiko.Klein@xxxxxx>
  • Date: Tue, 26 May 2015 15:44:36 +0200
Hi John,

you are right, there are only 104 files, so the number of files in the cache files matches the total files. Sorry about adding to the confusion.

Heiko

On 2015-05-25 20:35, John Caron wrote:
Hi Heiko:

Are you sure there are 4500 files? Is it possible there are 100 files
with 49 time coordinates each? If there are ~4500 files with ~50 coords
each, you would see ~225,000 time coordinates.

John

On Fri, May 22, 2015 at 4:34 AM, Heiko Klein <Heiko.Klein@xxxxxx
<mailto:Heiko.Klein@xxxxxx>> wrote:

    Hi John,

    I read now also David Robertsons mail and tried a

    ncdump -v time
    'http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'

    to cache the coordinate values. Performance after that get's
    slightly better, from 20s to 17s for a
    $ time ncdump -h
    'http://vis-m8.met.no/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml'


    I've had now a closer look at the 'cache' entries as described by
    David. Even after a 'ncdump -v time...' only 100 files (of 4575) are
    cached. The thredds 4.3 cache had also only 100 files cached.

    Questions are now:

    a) Why does 4.6 need all the coordinate values even if not
    requested? (Or why didn't 4.3 need those values?)


    b) Is there a setting to increase the amount of cached aggregation
    files? (I followed more or less blindly
    
http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html,
    so settings with '100' are:
    <NetcdfFileCache>
       <minFiles>100</minFiles>
    ...
    <TimePartition>
       <minFiles>100</minFiles>
    ...
    )

    I attach my threddsConfig.xml and the mbr000_all.ncml file.

    Heiko



    On 2015-05-22 04:39, John Caron wrote:

        ok, so looking closer, i see that there is a single file to
        cache the
        results, so .1 seconds is more likely.

        make sure all files have been cached, by requesting the time
        coordinate
        values in a dods request (see post earlier today).

        if thats the case, and subsequent accesses are still slow, send
        me the
        mbr000_all.ncml file.

        ps, i am gone until monday, so we can resume then...



        On Thu, May 21, 2015 at 6:58 PM, John Caron <caron@xxxxxxxx
        <mailto:caron@xxxxxxxx>
        <mailto:caron@xxxxxxxx <mailto:caron@xxxxxxxx>>> wrote:

             reading 4575 files in .1 seconds seems a bit too fast. Im
        guessing
             that the dataset is actually getting cached in memory, and
        you are
             seeing that performance. Then the question might be why
        isnt that
             happening as fast in 4.6?

             If you have "remote management" enabled, you can see what
        files are
             in the cache, and also clear the cache.

        
http://www.unidata.ucar.edu/software/thredds/v5.0/tds/reference/RemoteManagement.html

             so, how long does it take in each version:

             1) the first time that the aggregation is called, when theres
             nothing in the disk cache (eg
        lustre/mnt/heikok/FAUNA/mbr000_all.ncml)
             2) the first time that the aggregation is called after the TDS
             starts up, when the disk cache is populated, but the
        dataset is not
             in memory (or it has been cleared from memroy)
             3) how long it takes after its in memory.

             i believe that 2) could take 7 secs, and 3) takes .1
        second, and
             maybe 1) takes 10-120 secs.

             its possible that for 4.6, 2) has slowed down to 20 secs,
        and maybe
             its not getting memory cached so 3) never happens. I will
             investigate that possibility.

             if you get a chance to experiment with checking/clearing memory
             cache with the 2 versions, let me know the results.

             John

             On Wed, May 20, 2015 at 10:41 AM, John Caron
        <caron@xxxxxxxx <mailto:caron@xxxxxxxx>
             <mailto:caron@xxxxxxxx <mailto:caron@xxxxxxxx>>> wrote:

                 ok, we'll see if we can reproduce the problem.



                 On Wed, May 20, 2015 at 10:25 AM, Heiko Klein
                 <heiko.klein@xxxxxx <mailto:heiko.klein@xxxxxx>
        <mailto:heiko.klein@xxxxxx <mailto:heiko.klein@xxxxxx>>> wrote:

                     Hi John,

                     in 4.6.1, request-time stays at ~20s each time I
        try it.
                     Only in 4.3.23 I see a huge perfomance-gain (from 7s to
                     0.1s) after the first fetch.

                     Heiko

                     ----- Original Message -----
                      > Hi Heiko:
                      >
                      > Can you see whether the second time you access the
                     dataset, if the times
                      > are fast again?
                      >
                      > thanks,
                      > John
                      >
                      > On Wed, May 20, 2015 at 2:07 AM, Heiko Klein
                     <Heiko.Klein@xxxxxx <mailto:Heiko.Klein@xxxxxx>
        <mailto:Heiko.Klein@xxxxxx <mailto:Heiko.Klein@xxxxxx>>> wrote:
                      >
                      > > Hi,
                      > >
                      > > I have some performance problems after
        upgrading to
                     thredds 4.6.1.
                      > >
                      > > I'm aggregating a large dataset with a
        joinExisting
                     aggregation. Reading
                      > > the metadata from the aggregation took, with
        thredds
                     4.3.23, about 0.1s
                      > > (first time up to 7s). After upgrading to
        4.6.1, the
                     same request takes 20s
                      > > (second time, first time not measured) and is
        unusable
                     slow. A 'ncview' of
                      > > the aggregated dataset is no longer possible.
                      > >
                      > > Suspecting some caching problems, I followed the
                     guidelines in
                      > >
        
http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html
                      > > The aggregation-cache contains two files
                      > >
                      > > $ ls -l
        file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
                      > > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
                      > > -rw-r--r-- 1 tomcat7 tomcat7 74339 May 20 09:43
                      > > file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
                      > > -rw-r--r-- 1 tomcat7 tomcat7 74131 May 20 10:01
                      > > lustre/mnt/heikok/FAUNA/mbr000_all.ncml
                      > >
                      > > (I guess the first one is thredds 4.3, while
        the second
                     one is thredds 4.6)
                      > >
                      > >
                      > > The ncml-file is:
                      > >
                      > > <netcdf

        xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2";>
                      > >   <aggregation dimName="time" type="joinExisting">
                      > >     <scan location="." regExp=".*snapMet000\.nc$"
                     subdirs="true"/>
                      > >   </aggregation>
                      > > </netcdf>
                      > >
                      > >
                      > > It is aggregating 4575 files, each about 1.3G.
                      > >
                      > >
                      > > The requests according to the logs are:
                      > > 2015-05-20T09:49:17.111 +0200 [    372240][
          281]
                     INFO  -
                      > > threddsServlet - Remote host: 157.249.113.42 -
        Request:
                     "GET
                      > >

        /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
                     HTTP/1.1"
                      > > 2015-05-20T09:49:17.116 +0200 [    372245][
          281]
                     INFO  -
                      > > threddsServlet - Request Completed - 200 - -1 - 5
                      > > 2015-05-20T09:49:17.117 +0200 [    372246][
          282]
                     INFO  -
                      > > threddsServlet - Remote host: 157.249.113.42 -
        Request:
                     "GET
                      > >

        /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.das
                     HTTP/1.1"
                      > > 2015-05-20T09:49:17.131 +0200 [    372260][
          282]
                     INFO  -
                      > > threddsServlet - Request Completed - 200 - -1 - 14
                      > > 2015-05-20T09:49:17.134 +0200 [    372263][
          283]
                     INFO  -
                      > > threddsServlet - Remote host: 157.249.113.42 -
        Request:
                     "GET
                      > >

        /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds
                     HTTP/1.1"
                      > > 2015-05-20T09:49:17.138 +0200 [    372267][
          283]
                     INFO  -
                      > > threddsServlet - Request Completed - 200 - -1 - 4
                      > >
                      > >
                      > > Best regards,
                      > >
                      > > Heiko
                      > >
                      > >
                      > > --
                      > > Dr. Heiko Klein                   Norwegian
                     Meteorological Institute
                      > > Tel. + 47 22 96 32 58
        <tel:%2B%2047%2022%2096%2032%2058>
                     <tel:%2B%2047%2022%2096%2032%2058>             P.O.
        Box 43
                     Blindern
                      > > http://www.met.no                 0313 Oslo NORWAY
                      > >
                      > > _______________________________________________
                      > > thredds mailing list
                      > > thredds@xxxxxxxxxxxxxxxx
        <mailto:thredds@xxxxxxxxxxxxxxxx>
        <mailto:thredds@xxxxxxxxxxxxxxxx <mailto:thredds@xxxxxxxxxxxxxxxx>>
                      > > For list information or to unsubscribe,  visit:
                      > > http://www.unidata.ucar.edu/mailing_lists/
                      > >
                      >






--
Dr. Heiko Klein                   Norwegian Meteorological Institute
Tel. + 47 22 96 32 58             P.O. Box 43 Blindern
http://www.met.no                 0313 Oslo NORWAY



  • 2015 messages navigation, sorted by:
    1. Thread
    2. Subject
    3. Author
    4. Date
    5. ↑ Table Of Contents
  • Search the thredds archives: