[thredds] Non-linear growth of query time for NCSS point request

  • To: "thredds@xxxxxxxxxxxxxxxx" <thredds@xxxxxxxxxxxxxxxx>
  • Subject: [thredds] Non-linear growth of query time for NCSS point request
  • From: Marcelo Andrioni <marceloandrioni@xxxxxxxxxxxxxxxx>
  • Date: Tue, 7 Jan 2020 16:46:31 +0000

I am seeing a performance problem when running NCSS requests that cover long
time ranges at a single location.
My dataset has 40 years of monthly files with hourly data (from ERA5). The 
dataset dimensions are: 
time 355728 X latitude 176 X longitude 172

When requesting one year of data for four variables for a single location (NCSS
Grids As Point Data) it takes 5.2 seconds. I would expect that two years would
take 10.4s and so forth, but I am seeing non-linear growth in the time. I ran
a few tests using wget to retrieve the data and got:

query (years)   real time (s)   expected time (s)
 1                5.2             5.2
 2               13              10.4
 3               23              15.6
 4               37              20.8
 5               53              26
 6               70              31.2
 7               91              36.4
 8              117              41.6
 9              142              46.8
10              175              52
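
For anyone who wants to reproduce the test, here is a minimal Python sketch of how the requests could be scripted instead of using wget. The server address, dataset path, variable names and coordinates are all placeholders (assumptions, not my real setup); only the query parameters (var, latitude, longitude, time_start, time_end, accept) are the standard NCSS ones.

```python
import time
import urllib.request
from urllib.parse import urlencode

# Hypothetical values: replace with the real server, dataset path and variables.
BASE_URL = "http://localhost:8080/thredds/ncss/era5/aggregation.ncml"
VARIABLES = ["u10", "v10", "t2m", "msl"]   # four placeholder variable names
LAT, LON = -23.5, -41.0                    # placeholder location
FIRST_YEAR = 1979                          # placeholder first year of the dataset


def build_point_url(base_url, variables, lat, lon, start_year, n_years):
    """Build an NCSS point request covering n_years whole years."""
    query = {
        "var": ",".join(variables),
        "latitude": lat,
        "longitude": lon,
        "time_start": f"{start_year}-01-01T00:00:00Z",
        "time_end": f"{start_year + n_years - 1}-12-31T23:00:00Z",
        "accept": "csv",
    }
    # Keep commas and colons readable in the query string.
    return base_url + "?" + urlencode(query, safe=",:")


def time_request(url):
    """Download the full response and return the elapsed wall-clock seconds."""
    t0 = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        while resp.read(1 << 16):
            pass
    return time.perf_counter() - t0


def run_benchmark(max_years=10):
    """Time requests for 1..max_years years and print one row per request."""
    for n in range(1, max_years + 1):
        url = build_point_url(BASE_URL, VARIABLES, LAT, LON, FIRST_YEAR, n)
        print(n, round(time_request(url), 1))
```

Calling run_benchmark() against a real server would print one line per request length, in the same layout as the first two columns of the table above.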

The query for the whole dataset (40 years), expected to take 208s (5.2 x 40),
took 53 minutes (3180s). Has anyone faced a similar problem? Maybe I need to
make some changes to the cache configuration?
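
To put a number on the growth, one option is a least-squares fit of a power law, time = a * years^b, to the measured times in log-log space. The sketch below does that with the ten measurements from the table; the power-law form is only an assumption for describing the trend, not a diagnosis of the cause. An exponent b of 1 would correspond to the linear scaling I expected, and anything above 1 is superlinear.

```python
import math

# Measured times from the table: query length (years) -> real time (s).
years = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
measured = [5.2, 13, 23, 37, 53, 70, 91, 117, 142, 175]


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b, done as a straight line in log-log space."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # Slope of the regression line is the exponent b.
    b = sum((p - mean_x) * (q - mean_y) for p, q in zip(lx, ly)) / sum(
        (p - mean_x) ** 2 for p in lx
    )
    # Intercept of the regression line is log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


a, b = fit_power_law(years, measured)
print(f"time ~ {a:.2f} * years^{b:.2f}")
# Extrapolation to the full 40 years, for comparison with the measured 3180 s.
print(f"extrapolated 40-year time: {a * 40 ** b:.0f} s")
```

The fit only uses the 1 to 10 year measurements, so the 40-year extrapolation is a rough comparison point and not a prediction.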

The results are for a dataset aggregated with '<aggregation dimName="time" 
type="joinExisting">', but similar results were found for this dataset when 
using FMRC (a little slower actually).

The system configuration is:
THREDDS 4.6.14
Apache Tomcat/8.5.45
Linux Kernel 4.15.0-72-generic

Thank you.

Marcelo Andrioni
