Re: [netcdfgroup] nc_open takes a long time to open a big file

Thank you all for your helpful comments. Please see my replies inline below ...

On Sep 9, 2010, at 10:12 PM, J Glassy wrote:

Hello Jennifer,

   If your chunk parameters and dataset dimensions remain as you've
originally stated (1,1,160,320,etc), I suspect these could be playing
an indirect role in the long file open times reported. In HDF5 (upon
which NetCDF is built) these are typically set to a more 'square' I/O
geometry, and rarely assigned to be an even multiple of the dataset's
dimensions.

I would not agree that I have chunked my 4D grid in an unusual way. I have optimized the chunk size for the way that my software (GrADS) does I/O. A lat/lon grid (a single chunk, based on the two inner varying dimensions) is the basic element of the data set and the most common way users interact with the data.
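
As a rough illustration of the chunk geometry discussed here, the sketch below computes the size of one chunk and how many such chunks fit in a chunk cache. The chunk shape 1 x 1 x 160 x 320 is taken from the quoted message; the 4-byte element size is an assumption (the thread does not state the variable type), and the helper names chunk_bytes and chunks_in_cache are hypothetical, not part of the NetCDF or HDF5 APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one chunk: product of the chunk dimensions times the element size. */
size_t chunk_bytes(const size_t *dims, size_t ndims, size_t elem_size) {
  size_t n = elem_size;
  for (size_t i = 0; i < ndims; i++)
    n *= dims[i];
  return n;
}

/* Number of whole chunks that fit in a cache of cache_bytes. */
size_t chunks_in_cache(size_t cache_bytes, size_t one_chunk_bytes) {
  assert(one_chunk_bytes > 0);  /* guard against division by zero */
  return cache_bytes / one_chunk_bytes;
}
```

Under that 4-byte assumption, a 1 x 1 x 160 x 320 chunk is 204800 bytes, so the 4096000-byte cache size passed to nc_set_chunk_cache in the test program later in this message would hold 20 chunks.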

     This is a fairly large file (18404502496), and we don't know how
much memory (or swap) you have available nor anything about I/O bus
contention; any of these things can play a role in responsiveness.
      Opening any file triggers a lot of downstream activity by the
OS, including inode traverses, queries on current state of cache
condition, lock checking, wait-states for re-tries on corrupted
sectors, etc.  If there are a very large number of physical files
present in the directory where this file resides, or there is a lot of
contention from concurrent processes on the inode table for the
filesystem, these factors can also translate to longer file open
times.

I have done testing on several different but similarly configured boxes, sometimes with no other users logged in and no competition for access to the files. We use the Gluster file system. The available memory on these boxes is 24 GB. I don't think it's fair to blame the OS or the hardware for this.

      Closing a file can be even more involved, since buffer flushes
have to be performed through several layers of cache, VMM tables --
how long does it take to close the same file (e.g. does this also take
longer than expected?).

Closing the file takes no time at all, especially in my testing program that doesn't do anything other than open the file.


      Last-- I haven't checked the internals of nc_open() lately, but
sometimes multi-tier, wrapped APIs (like NetCDF) bundle other
'look ahead' initialization actions along with the 'opening' of a
file, such as building or traversing large in-memory indices of
internal node trees, skipping over gaps in the internal structures...
if nc_open() triggers this sort of extra-curricular mechanism, wall
clock time to perform the open could increase somewhat linearly with
the size of the file.

Based on the suggestion of another post in this thread, I have tried a test where I open two similar files, neither one having been touched in a while, one with the HDF5 API and the other with the NetCDF4 API. The HDF5 open takes a second or less; the NetCDF4 open still takes ~20 seconds. Here's the C program:

#include "hdf5.h"
#include "netcdf.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
void get_time(void);
int main(int argc, char *argv[]) {
  hid_t h5id,fapl;
  int ncid;
  size_t a,b;
  float c;

  /* argv[1] is opened with NetCDF4, argv[2] with HDF5 */
  if (argc < 3) {
    fprintf(stderr,"usage: %s netcdf4_file hdf5_file\n",argv[0]);
    return EXIT_FAILURE;
  }
  get_time();
  /* open a file with HDF5 library */
  if ((fapl = H5Pcreate(H5P_FILE_ACCESS))>=0) {
    if ((H5Pset_fclose_degree(fapl,H5F_CLOSE_STRONG))>=0) {
      if ((h5id = H5Fopen(argv[2], H5F_ACC_RDONLY, fapl))>=0) {
        printf("H5Fopen succeeded: h5id=%ld\n",(long)h5id);
      }
    }
    /* release the property list whether or not the open succeeded */
    H5Pclose(fapl);
  }
  get_time();
  /* open a file with NetCDF4 library */
  /* chunk cache: size in bytes, number of slots, preemption policy */
  a=4096000; b=51203; c=0.75;
  nc_set_chunk_cache(a,b,c);
  if ((nc_open(argv[1], NC_NOWRITE, &ncid))==NC_NOERR)
    printf("nc_open succeeded: ncid=%d\n",ncid);
  get_time();
  return 0;
}

/* print the current time (UTC) in hr:min:sec */
void get_time(void) {
  time_t tt;
  struct tm *tm;

  tt = time(NULL);
  tm = gmtime(&tt);
  printf("%02i:%02i:%02i\n",tm->tm_hour,tm->tm_min,tm->tm_sec);
}

Here's a sample of the output:

14:04:28
H5Fopen succeeded: h5id=16777216
14:04:29
nc_open succeeded: ncid=65536
14:04:52
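
The timestamps above have one-second resolution: they show a gap of 1 second around H5Fopen and 23 seconds around nc_open, but cannot resolve anything finer. A minimal sketch of sub-second timing follows, assuming a POSIX system that provides clock_gettime with CLOCK_MONOTONIC. The helper names elapsed_seconds and monotonic_now are hypothetical and were not part of the test program.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime() in strict -std= modes */
#endif
#include <assert.h>
#include <time.h>

/* Seconds elapsed between two timestamps, including the nanosecond part. */
double elapsed_seconds(const struct timespec *start, const struct timespec *end) {
  assert(start != NULL && end != NULL);
  return (double)(end->tv_sec - start->tv_sec)
       + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Read the monotonic clock, which is unaffected by wall-clock adjustments. */
struct timespec monotonic_now(void) {
  struct timespec ts;
  int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
  assert(rc == 0);
  (void)rc;  /* silence the unused-variable warning when NDEBUG is defined */
  return ts;
}
```

To use it, one would call monotonic_now() immediately before and after each open call and print the result of elapsed_seconds() with a format such as "%.3f".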

I'm not ready to give up in resignation; this should not be a 'feature' of the NetCDF API that punishes me because I have big files.

--Jennifer


--
Jennifer M. Adams
IGES/COLA
4041 Powder Mill Road, Suite 302
Calverton, MD 20705
jma@xxxxxxxxxxxxx


