[thredds] THREDDS/NCSS/Open Files

  • To: THREDDS community <thredds@xxxxxxxxxxxxxxxx>
  • Subject: [thredds] THREDDS/NCSS/Open Files
  • From: "Tyle, Kevin R" <ktyle@xxxxxxxxxx>
  • Date: Wed, 6 May 2020 19:10:12 +0000
Hi,

Our THREDDS server (http://thredds.atmos.albany.edu:8080/thredds, still 
running 4.6.13 at this time) serves both a current-week and a longer-term 
archive of GEMPAK-formatted METAR files as Feature Collections. Conveniently, 
THREDDS invokes netcdf-java to handle the conversion of GEMPAK to NetCDF. The 
archive is accessed especially frequently at this time of year, when my 
co-instructor and I have the students do a case study of their choice and use 
MetPy and Siphon to access, subset, and display surface maps and meteograms for 
their event of interest.

Typically, I soon run into issues where the THREDDS server fails with 500 
server errors when an arbitrary GEMPAK surface file gets accessed via NCSS. I 
have traced this to our NCSS and Random Access caches having max values set too 
low.

I see messages in the content/thredds/logs/cache.log file that look like this:

[2020-05-06T00:25:01.089+0000] FileCache NetcdfFileCache  cleanup couldnt 
remove enough to keep under the maximum= 150 due to locked files; currently at 
= 905
[2020-05-06T00:25:44.105+0000] FileCache RandomAccessFile cleanup couldnt 
remove enough to keep under the maximum= 500 due to locked files; currently at 
= 905
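
For anyone wanting to track how often these warnings occur, here is a minimal Python sketch that pulls the cache name, configured maximum, and current count out of messages shaped like the two above. The regular expression is derived only from those two sample lines, so other THREDDS versions may word the message differently; the names CLEANUP_RE and find_cache_overflows are my own, not part of THREDDS.

```python
import re

# Pattern built from the two sample messages only (an assumption, not a
# documented log format). \s+ between words tolerates mail-client wrapping.
CLEANUP_RE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\s+FileCache\s+(?P<cache>\S+)\s+cleanup\s+"
    r"couldnt\s+remove\s+enough\s+to\s+keep\s+under\s+the\s+"
    r"maximum=\s*(?P<maximum>\d+)\s+due\s+to\s+locked\s+files;\s+"
    r"currently\s+at\s*=\s*(?P<current>\d+)"
)


def find_cache_overflows(log_text):
    """Return one dict per cleanup warning found in log_text."""
    return [
        {
            "timestamp": m.group("timestamp"),
            "cache": m.group("cache"),
            "maximum": int(m.group("maximum")),
            "current": int(m.group("current")),
        }
        for m in CLEANUP_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    # Path as given in this message; adjust to the local content directory.
    with open("content/thredds/logs/cache.log") as log_file:
        for entry in find_cache_overflows(log_file.read()):
            print(entry)
```

Each returned entry can then be compared against the limits in the server configuration to see which cache overflows and by how much.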

No problem, I have now raised these limits. But those "locked files" references 
made me do some poking around on the machine that runs THREDDS. I notice 
that when I run the lsof command and grep for one of the GEMPAK files that has 
been accessed, I see a really large number of matches.

For example, just now I picked one particular file, ran my Jupyter notebook on 
it that queries and returns the subsetted data via Siphon, and then ran lsof 
and grepped specifically for that one file.

Not surprisingly, it was listed in the lsof output. But surprisingly, lsof had 
it listed 89 times! Why might that be the case?
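
To check whether those 89 rows are distinct open descriptors or the same ones listed repeatedly, one option is to count unique (PID, file descriptor) pairs. The sketch below is only an illustration: it relies on lsof's field output mode (-F with the p and f fields), and the function names and the example path are hypothetical.

```python
import subprocess


def parse_lsof_fields(field_output):
    """Parse `lsof -F pf` output into a set of (pid, fd) pairs.

    In field mode each line starts with a one-character tag:
    'p' introduces a process ID, 'f' a file descriptor of that process.
    """
    handles = set()
    pid = None
    for line in field_output.splitlines():
        if not line:
            continue
        tag, value = line[0], line[1:]
        if tag == "p":
            pid = int(value)
        elif tag == "f" and pid is not None:
            handles.add((pid, value))
    return handles


def open_handles(path):
    """Run lsof on one file and return its unique (pid, fd) pairs."""
    # lsof exits non-zero when nothing matches, so check=True is not used.
    result = subprocess.run(
        ["lsof", "-F", "pf", "--", path],
        capture_output=True,
        text=True,
    )
    return parse_lsof_fields(result.stdout)


if __name__ == "__main__":
    # Hypothetical path; substitute the GEMPAK file being investigated.
    pairs = open_handles("/path/to/some_gempak_surface_file.gem")
    print(len(pairs), "unique (pid, fd) pairs")
```

If the unique count is much smaller than the number of rows in the plain lsof listing, the large number reflects repeated rows rather than that many separate open descriptors.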

Multiply this by a dozen or so students and co-instructors, and 1-4 individual 
GEMPAK files per case, and now I'm seeing why I consistently run into issues, 
particularly with these types of datasets. Once the notebook instance is 
closed, the open files disappear from lsof, but students (and even I) often 
forget to close and halt their Jupyter notebooks.
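
As a rough back-of-the-envelope check of that multiplication, here is a small Python sketch. It assumes exactly 12 users and that every accessed file produces the same 89 lsof matches I saw for one file; it also treats lsof matches as comparable to cache entries, which may not hold.

```python
# Figures taken from this message; USERS = 12 is an assumed reading of
# "a dozen or so", and 89 matches per file is a single observation.
MATCHES_PER_FILE = 89
USERS = 12
FILES_PER_CASE_RANGE = (1, 4)
OLD_LIMITS = {"NetcdfFileCache": 150, "RandomAccessFile": 500}


def estimate_matches(matches_per_file, users, files_per_user):
    """Estimate total lsof matches if every user holds their files open."""
    return matches_per_file * users * files_per_user


low = estimate_matches(MATCHES_PER_FILE, USERS, FILES_PER_CASE_RANGE[0])
high = estimate_matches(MATCHES_PER_FILE, USERS, FILES_PER_CASE_RANGE[1])

print(f"estimated matches: {low} to {high}")
for cache_name, maximum in OLD_LIMITS.items():
    print(f"{cache_name}: low estimate exceeds maximum of {maximum}: {low > maximum}")
```

Even the low end of that range is well above both of the old maximums quoted in the log messages.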

Curiously, when I look into my content/thredds/cache/ncss directory, I don't 
see anything.

So my two questions are:


  1.  Why does lsof return such a large number of duplicate references for a 
single file that's being accessed via NCSS?
  2.  Why do I not see files appear in the cache directory, even though the 
cache scouring script clearly detects them at times?

Thanks,

Kevin

_____________________________________________
Kevin Tyle, M.S.; Manager of Departmental Computing
NSF XSEDE Campus Champion
Dept. of Atmospheric & Environmental Sciences
University at Albany
Earth Science 228, 1400 Washington Avenue
Albany, NY 12222
Email: ktyle@xxxxxxxxxx
Phone: 518-442-4578
_____________________________________________
