Re: [thredds] ncml aggregation 200 times slower in thredds 4.6.1

  • To: Heiko Klein <heiko.klein@xxxxxx>
  • Subject: Re: [thredds] ncml aggregation 200 times slower in thredds 4.6.1
  • From: John Caron <caron@xxxxxxxx>
  • Date: Thu, 21 May 2015 18:58:22 -0600
Reading 4575 files in 0.1 seconds seems a bit too fast. I'm guessing that the
dataset is actually getting cached in memory, and you are seeing that
performance. Then the question might be why that isn't happening as fast in
4.6.

If you have "remote management" enabled, you can see what files are in the
cache, and also clear the cache.

http://www.unidata.ucar.edu/software/thredds/v5.0/tds/reference/RemoteManagement.html

So, how long does it take in each version:

1) the first time that the aggregation is called, when there's nothing in
the disk cache (e.g. lustre/mnt/heikok/FAUNA/mbr000_all.ncml)
2) the first time that the aggregation is called after the TDS starts up,
when the disk cache is populated, but the dataset is not in memory (or it
has been cleared from memory)
3) after the dataset is in memory.

I believe that 2) could take 7 secs, 3) takes 0.1 seconds, and maybe 1)
takes 10-120 secs.

It's possible that for 4.6, 2) has slowed down to 20 secs, and maybe it's not
getting memory cached, so 3) never happens. I will investigate that
possibility.

If you get a chance to experiment with checking/clearing the memory cache with
the 2 versions, let me know the results.
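
A rough way to collect those timings is sketched below in Python. It is only
an illustration, not part of the TDS: the host and port are placeholders, the
path is the one from the request log quoted further down, and the helper names
(fetch_dds, time_requests, format_report) are made up for this example. Run it
once against each version, and again after clearing the caches, to separate
cases 1), 2) and 3).

```python
import time
import urllib.request

# Assumption: host and port are placeholders for your own server.
BASE_URL = "http://localhost:8080"
# Path taken from the request log quoted in this thread.
DDS_PATH = "/thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds"


def fetch_dds(url):
    """Fetch the DDS (metadata) response and return its size in bytes."""
    with urllib.request.urlopen(url, timeout=300) as response:
        return len(response.read())


def time_requests(fetch, url, repeats=3):
    """Call fetch(url) `repeats` times; return elapsed seconds per call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch(url)
        timings.append(time.perf_counter() - start)
    return timings


def format_report(label, timings):
    """Show the first request separately from the later (cached?) ones."""
    later = ", ".join(f"{t:.2f}s" for t in timings[1:])
    return f"{label}: first={timings[0]:.2f}s later=[{later}]"


# Example use (hits the network, so it is left commented out):
# print(format_report("4.6.1", time_requests(fetch_dds, BASE_URL + DDS_PATH)))
```

If the later requests never get faster than the first one, the dataset is
probably not being held in the memory cache.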

John

On Wed, May 20, 2015 at 10:41 AM, John Caron <caron@xxxxxxxx> wrote:

> ok, we'll see if we can reproduce the problem.
>
>
>
> On Wed, May 20, 2015 at 10:25 AM, Heiko Klein <heiko.klein@xxxxxx> wrote:
>
>> Hi John,
>>
>> in 4.6.1, the request time stays at ~20s each time I try it. Only in 4.3.23 do
>> I see a huge performance gain (from 7s to 0.1s) after the first fetch.
>>
>> Heiko
>>
>> ----- Original Message -----
>> > Hi Heiko:
>> >
>> > Can you check whether the times are fast again the second time you
>> > access the dataset?
>> >
>> > thanks,
>> > John
>> >
>> > On Wed, May 20, 2015 at 2:07 AM, Heiko Klein <Heiko.Klein@xxxxxx> wrote:
>> >
>> > > Hi,
>> > >
>> > > I have some performance problems after upgrading to thredds 4.6.1.
>> > >
>> > > I'm aggregating a large dataset with a joinExisting aggregation.
>> > > Reading the metadata from the aggregation took, with thredds 4.3.23,
>> > > about 0.1s (first time up to 7s). After upgrading to 4.6.1, the same
>> > > request takes 20s (second time, first time not measured) and is
>> > > unusably slow. A 'ncview' of the aggregated dataset is no longer
>> > > possible.
>> > >
>> > > Suspecting some caching problems, I followed the guidelines in
>> > >
>> > > http://www.unidata.ucar.edu/software/thredds/current/tds/reference/ThreddsConfigXMLFile.html
>> > > The aggregation-cache contains two files
>> > >
>> > > $ ls -l file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml lustre/mnt/heikok/FAUNA/mbr000_all.ncml
>> > > -rw-r--r-- 1 tomcat7 tomcat7 74339 May 20 09:43 file-lustre-mnt-heikok-FAUNA-mbr000_all.ncml
>> > > -rw-r--r-- 1 tomcat7 tomcat7 74131 May 20 10:01 lustre/mnt/heikok/FAUNA/mbr000_all.ncml
>> > >
>> > > (I guess the first one is thredds 4.3, while the second one is
>> > > thredds 4.6)
>> > >
>> > >
>> > > The ncml-file is:
>> > >
>> > > <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
>> > >   <aggregation dimName="time" type="joinExisting">
>> > >     <scan location="." regExp=".*snapMet000\.nc$" subdirs="true"/>
>> > >   </aggregation>
>> > > </netcdf>
>> > >
>> > >
>> > > It is aggregating 4575 files, each about 1.3G.
>> > >
>> > >
>> > > The requests according to the logs are:
>> > > 2015-05-20T09:49:17.111 +0200 [    372240][     281] INFO  - threddsServlet - Remote host: 157.249.113.42 - Request: "GET /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds HTTP/1.1"
>> > > 2015-05-20T09:49:17.116 +0200 [    372245][     281] INFO  - threddsServlet - Request Completed - 200 - -1 - 5
>> > > 2015-05-20T09:49:17.117 +0200 [    372246][     282] INFO  - threddsServlet - Remote host: 157.249.113.42 - Request: "GET /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.das HTTP/1.1"
>> > > 2015-05-20T09:49:17.131 +0200 [    372260][     282] INFO  - threddsServlet - Request Completed - 200 - -1 - 14
>> > > 2015-05-20T09:49:17.134 +0200 [    372263][     283] INFO  - threddsServlet - Remote host: 157.249.113.42 - Request: "GET /thredds/dodsC/lustreMnt/heikok/FAUNA/mbr000_all.ncml.dds HTTP/1.1"
>> > > 2015-05-20T09:49:17.138 +0200 [    372267][     283] INFO  - threddsServlet - Request Completed - 200 - -1 - 4
>> > >
>> > >
>> > > Best regards,
>> > >
>> > > Heiko
>> > >
>> > >
>> > > --
>> > > Dr. Heiko Klein                   Norwegian Meteorological Institute
>> > > Tel. + 47 22 96 32 58             P.O. Box 43 Blindern
>> > > http://www.met.no                 0313 Oslo NORWAY
>> > >
>> > >
>> >
>>
>
>