Re: argo netCDF data



>To: address@hidden
>From: "shen yingshuo" <address@hidden>
>Subject: Re: 20020815: argo netCDF data
>Organization:   School of Ocean and Earth Science and Technology, University 
>of Hawaii
>Keywords: malloc failure, header corruption

Hi,

> recently we downloaded some argo netCDF from ifremer.  we tried to
> open it with several netCDF-able program.. but it did not work..
> would you please help us in finding out what is wrong with this
> dataset?
> 
>  i put one dataset for ftp
> 
> ftp apapane.soest.hawaii.edu  with anonymous as user
> cd /users/poppy
> get Q19000222.prof.nc

I downloaded the above file in binary mode and could reproduce the
problem here with ncdump, which produced the error message:

  $ ncdump -c ~/tmp/Q1900022_prof.nc 
  ncdump: /home/russ/tmp/Q1900022_prof.nc: Memory allocation (malloc) failure

Examining the file in a binary editor, it appears that a byte was
deleted from the header of the file somewhere before the byte that
represents the length of the "N_PARAM" dimension.  The effect of this
is to make the "N" byte appear as part of the length of a dimension
named "_PARAM...", which throws everything off after that, since the
netCDF library interprets this as specifying that the dimension name
has 1870 characters.  1870 is the decimal value of the hexadecimal
integer 0x074e, which you can see in the hex dump of the file header
below:

 00000000: 4344 4601 0000 000a 0000 000a 0000 000e  CDF.............
 00000010: 0000 0009 4441 5445 5f54 494d 4500 0000  ....DATE_TIME...
 00000020: 0000 000e 0000 0009 5354 5249 4e47 3235  ........STRING25
 00000030: 3600 0000 0000 0100 0000 0008 5354 5249  6...........STRI
 00000040: 4e47 3634 0000 0040 0000 0008 5354 5249  NG64...@....STRI
 00000050: 4e47 3332 0000 0020 0000 0008 5354 5249  NG32... ....STRI
 00000060: 4e47 3136 0000 0010 0000 0007 5354 5249  NG16........STRI
 00000070: 4e47 3800 0000 0008 0000 0007 5354 5249  NG8.........STRI
 00000080: 4e47 3400 0000 0004 0000 0007 5354 5249  NG4.........STRI
 00000090: 4e47 3200 0000 0002 0000 0006 4e5f 5052  NG2.........N_PR
 000000a0: 4f46 0000 0000 0000 0000 074e 5f50 4152  OF.........N_PAR
 000000b0: 414d 0000 0000 0300 0000 084e 5f4c 4556  AM.........N_LEV
 000000c0: 454c 5300 0000 2b00 0000 0c4e 5f54 4543  ELS...+....N_TEC
 000000d0: 485f 5041 5241 4d00 0000 1900 0000 074e  H_PARAM........N
 000000e0: 5f43 414c 4942 0000 0000 0a00 0000 094e  _CALIB.........N
 000000f0: 5f48 4953 544f 5259 0000 0000 0000 0000  _HISTORY........
 00000100: 0000 0000 0000 0000 0000 0b00 0000 3500  ..............5.
 00000110: 0000 0944 4154 415f 5459 5045 0000 0000  ...DATA_TYPE....
 00000120: 0000 0100 0000 0400 0000 0c00 0000 0100  ................
 00000130: 0000 0763 6f6d 6d65 6e74 0000 0000 0200  ...comment......
 00000140: 0000 0944 6174 6120 7479 7065 0000 0000  ...Data type....
 00000150: 0000 0200 0000 1000 0028 1800 0000 0e46  .........(.....F
 00000160: 4f52 4d41 545f 5645 5253 494f 4e00 0000  ORMAT_VERSION...
  ...
 
If I insert a single 0 byte before the length of the "N_PARAM" name
(around the 168th byte), all the dimensions get read in OK, then the
global attributes are read in, then the first few variables.
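A one-byte patch like that can be scripted.  Here is a minimal Python
sketch; the byte values are copied from the dump above, and the helper
name is mine, not part of any netCDF tool:

```python
def insert_byte(data: bytes, offset: int, value: int = 0) -> bytes:
    """Re-insert a single dropped byte at the given offset."""
    return data[:offset] + bytes([value]) + data[offset:]

# The mangled stretch from the dump: N_PARAM's name length should be
# 00 00 00 07, but a leading zero byte is gone, so 'N' (0x4e) gets
# read as the low byte of the length.
corrupt = bytes.fromhex("0000074e5f504152414d00")   # "...N_PARAM."
fixed = insert_byte(corrupt, 0)                     # restore the zero
print(fixed[:4].hex(), fixed[4:11])                 # 00000007 b'N_PARAM'
```

To patch the actual file you would read it whole, apply the same
insertion at the offset just before the corrupted length (around byte
168, i.e. 0xa8, in this file), and write out a new copy.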

(You can parse binary data like this by following the "Appendix B File
Format Specification" in the netCDF User's Guide.)
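For illustration, here is a sketch of such a parser for just the
dimension list of a classic-format file, following that appendix.  The
function name and error handling are mine, not part of any netCDF API:

```python
import struct

NC_DIMENSION = 0x0A          # tag that introduces the dimension list

def read_dims(header: bytes):
    """Parse the dimension list from a classic netCDF header:
    magic "CDF\\x01", numrecs, then the tag/count and each dimension
    as (name length, name padded to 4 bytes, dimension length)."""
    if header[:3] != b"CDF":
        raise ValueError("not a netCDF classic file")
    pos = 8                                   # skip magic+version, numrecs
    tag, ndims = struct.unpack_from(">ii", header, pos)
    pos += 8
    if tag != NC_DIMENSION:
        raise ValueError("no dimension list")
    dims = []
    for _ in range(ndims):
        (namelen,) = struct.unpack_from(">i", header, pos)
        pos += 4
        name = header[pos:pos + namelen].decode("ascii")
        pos += (namelen + 3) & ~3             # names are padded to 4 bytes
        (size,) = struct.unpack_from(">i", header, pos)
        pos += 4
        dims.append((name, size))
    return dims

# A minimal well-formed header with one dimension, N_PROF of size 0:
hdr = (b"CDF\x01" + (0).to_bytes(4, "big")            # magic, numrecs
       + NC_DIMENSION.to_bytes(4, "big") + (1).to_bytes(4, "big")
       + (6).to_bytes(4, "big") + b"N_PROF\x00\x00"   # name, padded
       + (0).to_bytes(4, "big"))                      # dimension length
print(read_dims(hdr))        # [('N_PROF', 0)]
```

Run against the corrupt file, a parser like this reads the N_PARAM
name length as 0x0000074e, which matches the 1870-character name the
library tries to allocate before ncdump reports the malloc failure.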

A similar error occurs some time after reading in the header
information for the "DATA_CENTRE" variable.  Looking at the header, it
appears another byte has been deleted before the length of the
DATE_CREATION variable string, so that the first character "D" is
interpreted as part of the length of the "ATE_CREATION" variable,
which makes the length wrong.  Inserting a byte to restore the correct
length of the "DATE_CREATION" name lets the parse get further, until
the "WMO_INST_TYPE" variable, which again has a byte missing somewhere
before its name.

It appears that additional single bytes have been deleted at later
points in the header as well.

To diagnose the cause of this problem, it would be useful to know
something about how this file was created and what subsequent
processing occurred.  You could narrow down where the problem occurred
by using something like "ncdump -h" or "ncdump -c" on the file when it
is first created and subsequently at each stage of copying, moving, or
processing it to determine exactly what process deleted the bytes from
the header.
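As a concrete (hypothetical) way to instrument such a pipeline, a
byte-count and checksum log at each stage will pinpoint the step that
drops a byte, since a netCDF file should be byte-for-byte identical
after any copy or transfer.  The file names below are stand-ins:

```shell
# Stand-in files for illustration; in practice these would be the
# netCDF file as first written, after ftp, after each processing step.
printf 'CDF\1....' > stage1.nc        # freshly created file (stand-in)
cp stage1.nc stage2.nc                # e.g. a copy or transfer step
cksum stage1.nc stage2.nc             # byte counts and CRCs must match
# ncdump -h stage1.nc                 # and the header should still dump
```

The first stage whose byte count or CRC differs from the previous one
is the step that corrupted the file.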

If the file was created this way, the program that created the file is
suspect.  If you suspect the problem is in the netCDF library and can
provide us with an example of a program that creates a file with such
a corrupt header, we would be very interested in trying to reproduce
and fix the problem.  But I've never encountered a case of the netCDF
library creating corrupt headers with a few missing bytes.  It seems
to me more likely that this is a symptom of a processing problem or a
hardware error.  If the file is the result of sending bytes over a
communication channel with no error checking, for example, it may just
be caused by dropped bytes in the communications channel.

Please let us know if you find anything further about the cause of
this ...

--Russ

_____________________________________________________________________

Russ Rew                                         UCAR Unidata Program
address@hidden                     http://www.unidata.ucar.edu