
Re: NetCDF Java Read API



Hi Ai-Hoa:

Sanh, Ai-Hoa wrote:
> 1) I am not using the latest and greatest Java 4.0 jar. I'm using one that I 
> downloaded in September when I was adding the NetCDF 4 subsetter for Oliver. 
> This is the md5sum of the jar that I have: 0e0eb9766000258fc25162647ccf966e.

It is good to get the latest, since many bugs are being fixed, but I mostly 
wanted to make sure you aren't using the 2.2 version.

> 
> I can try out the code with the latest jar.
> 
> 2) The code we got to work does read the entire array. However, I was having 
> trouble using the Variable.slice and Variable.section(List<Range>) methods to 
> read just one of the forecasts. Each time, I still ran into an "Out of 
> Memory" error. And when I debugged the code, and inspected the last line 
> attempted, there were count and buffer variables with values in the 
> 311000000's. So either I'm not using the methods correctly, or the code is 
> still trying to work with the entire array.

As I mentioned previously, the file doesn't have any chunking, and since it's 
compressed, the entire variable has to be read at once, then uncompressed, then 
subsetted. 
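
For reference, a subset read of one forecast would normally look something like 
the sketch below (the file path, class name, and the "VIL" variable lookup are 
placeholders for however your reader is set up). With this particular file, 
though, the library still has to decompress the whole variable behind the 
scenes, so the subset call by itself won't reduce the memory needed:

  import java.io.IOException;
  import ucar.ma2.Array;
  import ucar.ma2.InvalidRangeException;
  import ucar.nc2.NetcdfFile;
  import ucar.nc2.Variable;

  public class ReadOneForecast {
    public static void main(String[] args) throws IOException, InvalidRangeException {
      NetcdfFile ncfile = NetcdfFile.open("vilForecast.nc");  // placeholder path
      try {
        Variable vil = ncfile.findVariable("VIL");  // shape (24, 1, 3520, 5120)

        // origin + shape select only forecast index 0
        int[] origin = new int[] {0, 0, 0, 0};
        int[] shape  = new int[] {1, 1, 3520, 5120};
        Array oneForecast = vil.read(origin, shape);

        System.out.println("elements read: " + oneForecast.getSize());
      } finally {
        ncfile.close();
      }
    }
  }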

The entire array is 24 * 3520 * 5120 = 433M shorts = 865 Mbytes. When 
compressed, it's on the order of the file size, 45 Mbytes. I'm guessing that 
after decompressing, we then have to copy the values into an array of shorts, 
which doubles the amount of memory needed. I will investigate to see if we can 
eliminate that 2X memory, assuming that's the problem.

In any case, you will want to get the chunking corrected.

> 
> I can try the code on a 64-bit platform to see if the subsetting will work if 
> I give the code 5G of memory.

Just to make sure, what options are you giving to the JVM on the command line 
(i.e. -Xmx)?
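
For example, something like this (the class name here is only a placeholder) 
would give the JVM the 5096 Mbytes that worked in Greg's test:

  java -Xmx5096m YourReaderClass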


> 
> Ai-Hoa
> 
> -----Original Message-----
> From: Rappa, Greg [mailto:address@hidden] 
> Sent: Friday, November 14, 2008 10:14 PM
> To: John Caron
> Cc: 'Rob Weingruber'; Ethan Davis; 'Dennis Heimbigner'; Sanh, Ai-Hoa; 'Russ 
> Rew'; Moser, William; Newell, Oliver; Unidata netCDF Java Support
> Subject: Re: NetCDF Java Read API
> 
> John Caron wrote:
>> Hi Greg:
>>
>> 1) What version of netcdf-java are you using?
>>   
> Hi John,
> 
> Ai-Hoa was actually running the Java code for me, but I think she
> told me she's running the latest Java 4.0.  Ai-Hoa can tell you for sure.
> 
> If it helps, our 'java -version' returns:
> java version "1.5.0_09"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_09-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_09-b01, mixed mode)
> 
>> 2) How much memory do you give the JVM (-Xmx option)? The default can be as 
>> low as 32 Megs.
>>   
> Running our Java reader on that sample file with a max memory of
> 4096M failed whereas running with 5096M succeeded.  That test
> was done on a Linux box with a 'uname -a' response as follows:
> Linux paris 2.6.9-78.0.1.ELsmp #1 SMP Tue Jul 22 18:01:05 EDT \
> 2008 x86_64 x86_64 x86_64 GNU/Linux
>> 3) Are you reading all of the data into memory?
>>   
> I believe so.  I recall seeing that the Java code invoked a read()
> which returned an 'Array'.  Ai-Hoa could probably give you more
> details on Monday.
> 
>> It doesn't appear that you are chunking correctly. An h5dump -h shows: 
>>
>>   DATASET "VIL" {
>>       DATATYPE  H5T_STD_I16LE
>>       DATASPACE  SIMPLE { ( 24, 1, 3520, 5120 ) / ( 24, 1, 3520, 5120 ) }
>>       ATTRIBUTE "DIMENSION_LIST" {
>>          DATATYPE  H5T_VLEN { H5T_REFERENCE}
>>          DATASPACE  SIMPLE { ( 4 ) / ( 4 ) }
>>       }
>>
>> Chunking on ( 1, 1, 3520, 5120 ) would probably be a reasonable thing to do, 
>> depending on how you plan to access the data.
> I'm not really familiar enough with the HDF5 file content to know what
> to look for in the h5dump output regarding chunking size.  What part of
> the h5dump indicates the chunking size?
> 
> Regarding access to the files, as I mentioned earlier, I write the entire
> set of grids in one C++ call to NcVar::put.  This has always resulted in
> fast write times.  Likewise, I expected data to be acquired with one read
> but would be willing to read each of the 24 forecast grids if that is what
> it takes.
> 
> Greg.
>