Re: [netcdf-hdf] [netcdfgroup] NetCDF: HDF error, and now what?

  • To: "'Ed Hartnett'" <ed@xxxxxxxxxxxxxxxx>
  • Subject: Re: [netcdf-hdf] [netcdfgroup] NetCDF: HDF error, and now what?
  • From: "John Urbanic" <urbanic@xxxxxxx>
  • Date: Tue, 25 Oct 2011 03:46:38 -0400
Ed:

After building with --enable-logging (I cannot figure out the Fortran API for
nc_set_log_level), I do indeed get meaningful error logging.  I get the
sequence below every time the put_var fails.  This error is sporadic in both
the variable affected and the file; about 90% of the files during this
particular run were just fine, and specific failure points vary from
run to run.  All PEs reported this same error here (I have only included
one below), but often some PEs succeed in writing even when others fail.

As the trace terminates with the fairly discouraging "major: Internal error
(too specific to document in detail)", I am at a loss.  I did take Hernan's
implied advice and built a 4.1.1 version, but it throws similar errors
(involving "decrementing file ID failed").

At this point, our netcdf conversion is at the mercy of your expert insight.
Actually, the conversion is done - now it is a question of whether we have
wasted our time...

Hopeful,
John

___________________________________________________________
HDF5-DIAG: Error detected in HDF5 (1.8.7) MPI-process 0:
  #000: H5Dio.c line 266 in H5Dwrite(): can't write data
    major: Dataset
    minor: Write failed
  #001: H5Dio.c line 671 in H5D_write(): can't write data
    major: Dataset
    minor: Write failed
  #002: H5Dcontig.c line 597 in H5D_contig_write(): contiguous write failed
    major: Dataset
    minor: Write failed
  #003: H5Dselect.c line 306 in H5D_select_write(): write error
    major: Dataspace
    minor: Write failed
  #004: H5Dselect.c line 217 in H5D_select_io(): write error
    major: Dataspace
  #005: H5Dcontig.c line 1225 in H5D_contig_writevv(): can't perform vectorized read
    major: Dataset
    minor: Can't operate on object
  #006: H5V.c line 1454 in H5V_opvv(): can't perform operation
    major: Internal error (too specific to document in detail)
    minor: Can't operate on object
  #007: H5Dcontig.c line 1152 in H5D_contig_writevv_cb(): block write failed
    major: Dataset
    minor: Write failed
  #008: H5Fio.c line 158 in H5F_block_write(): write through metadata accumulator failed
    major: Low-level I/O
    minor: Write failed
  #009: H5Faccum.c line 808 in H5F_accum_write(): file write failed
    major: Low-level I/O
    minor: Write failed
  #010: H5FDint.c line 185 in H5FD_write(): driver write request failed
    major: Virtual File Layer
    minor: Write failed
  #011: H5FDmpio.c line 1822 in H5FD_mpio_write(): MPI_File_write_at failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #012: H5FDmpio.c line 1822 in H5FD_mpio_write(): I/O error
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
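
A trace like the one above reads from the outermost call (#000) down to the
innermost (#012), so the last frames are usually the most telling; here they
point at MPI_File_write_at inside the MPI-IO driver rather than at netCDF
itself.  The following is only a minimal sketch for pulling those frames out
of a saved log.  The function name parse_hdf5_diag and the dictionary keys are
hypothetical, not part of HDF5 or netCDF, and the sample text is a subset of
the frames quoted above.  It assumes the layout shown in this message,
including messages that wrap onto an unindented continuation line.

```python
import re

# One stack frame, e.g.
#   "#011: H5FDmpio.c line 1822 in H5FD_mpio_write(): MPI_File_write_at failed"
FRAME_RE = re.compile(
    r"^\s*#(?P<index>\d+):\s+(?P<file>\S+)\s+line\s+(?P<line>\d+)\s+in\s+"
    r"(?P<func>\w+)\(\):\s*(?P<message>.*)$"
)
# The "major:" / "minor:" classification lines that follow a frame.
FIELD_RE = re.compile(r"^\s*(?P<key>major|minor):\s*(?P<value>.*)$")


def parse_hdf5_diag(text):
    """Return the frames of an HDF5-DIAG trace as a list of dicts.

    Hypothetical helper: assumes the text layout seen in this message.
    """
    frames = []
    for raw in text.splitlines():
        line = raw.rstrip()
        frame = FRAME_RE.match(line)
        if frame:
            entry = frame.groupdict()
            entry["index"] = int(entry["index"])
            entry["line"] = int(entry["line"])
            entry["major"] = None
            entry["minor"] = None
            frames.append(entry)
            continue
        if not frames:
            continue  # banner line before the first frame
        field = FIELD_RE.match(line)
        if field:
            frames[-1][field.group("key")] = field.group("value")
        elif line.strip() and frames[-1]["major"] is None:
            # Mail-wrapped continuation of the frame's message.
            joined = frames[-1]["message"] + " " + line.strip()
            frames[-1]["message"] = joined.strip()
    return frames


# Subset of the trace quoted above, kept verbatim (including the wrap).
SAMPLE = """\
HDF5-DIAG: Error detected in HDF5 (1.8.7) MPI-process 0:
  #005: H5Dcontig.c line 1225 in H5D_contig_writevv(): can't perform
vectorized read
    major: Dataset
    minor: Can't operate on object
  #011: H5FDmpio.c line 1822 in H5FD_mpio_write(): MPI_File_write_at failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #012: H5FDmpio.c line 1822 in H5FD_mpio_write(): I/O error
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
"""

frames = parse_hdf5_diag(SAMPLE)
innermost = frames[-1]
print(innermost["func"], "-", innermost["message"], "/", innermost["minor"])
```

Run over the per-PE logs, something like this could help show whether every
failure bottoms out in the same MPI call, which would point toward the MPI-IO
layer or the file system rather than the netCDF conversion.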