
[THREDDS #AVS-567793]: String truncation when accessing file through THREDDS server



I had this ticket moved to support-netcdf since it is really an
issue with the netcdf-c library and not with the THREDDS server.

As you probably know by now, the maxStrlen dimension appears only
when you access data over the OPeNDAP (DAP2) protocol with
the ncdump from the netcdf-c library.

You can set the size of maxStrlen to a larger value,
but you cannot get rid of it altogether.
You can control it by either of two changes to the URL:
1. Prefix the URL with "[stringlength=256]" to get something like this:
[stringlength=256]http://host:port/thredds/dodsC/.../file.nc
or
2. Suffix the URL with "#stringlength=256" to get something like this:
http://host:port/thredds/dodsC/.../file.nc#stringlength=256

The 256 is just an example; you can set it to whatever you want.
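
If you build these URLs in a script, the two forms above can be sketched
roughly as follows. This is only an illustration: the helper name
with_stringlength is hypothetical (it is not part of netcdf-c), and the
URL is the same placeholder used above.

```python
def with_stringlength(url: str, length: int, style: str = "suffix") -> str:
    """Attach a DAP2 'stringlength' client parameter to a URL.

    Hypothetical helper for illustration only; it just formats a string
    and does not contact any server.
    """
    if length <= 0:
        raise ValueError("length must be a positive integer")
    param = f"stringlength={length}"
    if style == "prefix":
        # Form 1: bracketed parameter placed before the URL.
        return f"[{param}]{url}"
    if style == "suffix":
        # Form 2: parameter appended after a '#'.
        return f"{url}#{param}"
    raise ValueError("style must be 'prefix' or 'suffix'")


# Placeholder URL copied from the examples above.
base = "http://host:port/thredds/dodsC/.../file.nc"
print(with_stringlength(base, 256, "prefix"))
print(with_stringlength(base, 256, "suffix"))
```

Either resulting string is then what you would hand to ncdump in place
of the plain URL.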

You can also set it on a per-variable basis by using
"stringlength_<variablename>", for example "stringlength_air_temp".
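
For your file, a per-variable setting might be put together like this.
Again this is only a sketch: the helper name is hypothetical, and I am
assuming the per-variable parameter takes "=<length>" in the same way as
the global one and is being attached in the "#" suffix form.

```python
def with_variable_stringlength(url: str, variable: str, length: int) -> str:
    """Append a per-variable 'stringlength_<variablename>' parameter.

    Hypothetical helper for illustration only. Assumes the per-variable
    parameter is written as stringlength_<variablename>=<length>, by
    analogy with the global stringlength parameter.
    """
    if not variable:
        raise ValueError("variable name must not be empty")
    if length <= 0:
        raise ValueError("length must be a positive integer")
    return f"{url}#stringlength_{variable}={length}"


# Placeholder URL; 175 matches the max_nchar dimension in your file.
base = "http://host:port/thredds/dodsC/.../file.nc"
print(with_variable_stringlength(base, "initial_condition_file", 175))
```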

This is documented, but for historical reasons the parameter
is called "stringlength" when added to the URL, not
"maxStrlen". I should consider adding maxStrlen as an alias.

However, a bug in the handling of URL parameters was introduced
relatively recently, and it caused these parameters to be ignored.
The fix for that is now in master. It was GitHub pull request:
https://github.com/Unidata/netcdf-c/pull/570

> 
> I have a netcdf file containing (among other things) some long character
> variables with dimension (table_rows, max_nchar). The 'max_nchar' dimension
> of length 175 allows me to store strings of the appropriate length (in this
> example they are filenames).
> 
> My problem is that when accessing this file via a THREDDS server, the
> 'max_nchar' dimension is replaced with 'maxStrlen=64' in variables, leading
> to truncation of my strings. This does not happen if I access the same file
> locally.
> 
> Details are provided below (and you should be able to run the second
> example -- but not the first which requires local file access. However,
> they refer to the same file).
> 
> My question is whether I can prevent the strings being truncated to 64
> characters when accessing remotely?
> 
> I see that the following link mentions it is possible to control the
> default string length -- but doesn't explain how:
> http://www.unidata.ucar.edu/software/netcdf/docs/dap_accessing_data.html
> 
> Thanks,
> Gareth Davies.
> 
> 
> #
> # This shows the 'normal' output of ncdump when I am accessing the file
> # locally. All the output is correct -- this is what I would like the THREDDS
> # output to be like.
> # Notice how the 'initial_condition_file' variable is correctly printed (in
> # the next example, it will be truncated)
> #
> ncdump -v initial_condition_file /g/data/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/TSUNAMI_EVENTS/unit_source_statistics_sunda.nc | head -n35
> netcdf unit_source_statistics_sunda {
> dimensions:
> table_rows = UNLIMITED ; // (492 currently)
> max_nchar = 175 ;
> variables:
> int table_rows(table_rows) ;
> table_rows:long_name = "dimension for rows of table" ;
> double lon_c(table_rows) ;
> double lat_c(table_rows) ;
> double depth(table_rows) ;
> double strike(table_rows) ;
> double dip(table_rows) ;
> int rake(table_rows) ;
> int slip(table_rows) ;
> double length(table_rows) ;
> double width(table_rows) ;
> int downdip_number(table_rows) ;
> int alongstrike_number(table_rows) ;
> int subfault_number(table_rows) ;
> double max_depth(table_rows) ;
> int max_nchar(max_nchar) ;
> max_nchar:long_name = "dimension giving maximum number of
> string characters" ;
> char initial_condition_file(table_rows, max_nchar) ;
> char tide_gauge_file(table_rows, max_nchar) ;
> 
> // global attributes:
> :discretized_sources_file =
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/all_discretized_sources.RDS"
> ;
> :parent_script_name =
> "/short/w85/tsunami/MODELS/AustPTHA_c/SOURCE_ZONES/sunda/TSUNAMI_EVENTS/make_all_earthquake_events.R"
> ;
> :R_session_info = "R version 3.3.0 (2016-05-03) ; Platform:
> x86_64-pc-linux-gnu (64-bit) ; Running under: CentOS release 6.9 (Final) ;
> ; locale: ; [1] C ;  ; attached base packages: ; [1] methods   stats
> graphics  grDevices utils     datasets  base      ;  ; other attached
> packages: ;  [1] rptha_0.0.79     ncdf4_1.15       geometry_0.3-6
> magic_1.5-6      ;  [5] abind_1.4-3      minpack.lm_1.2-0 FNN_1.1
> raster_2.5-8     ;  [9] geosphere_1.5-5  rgdal_1.1-10     sp_1.2-3
> rgeos_0.3-21     ;  ; loaded via a namespace (and not attached): ; [1]
> Rcpp_0.12.5     grid_3.3.0      lattice_0.20-33" ;
> data:
> 
> initial_condition_file =
> 
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/Unit_source_data/sunda/sunda_1_1.tif",
> 
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/Unit_source_data/sunda/sunda_2_1.tif",
> 
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/Unit_source_data/sunda/sunda_3_1.tif",
> 
> 
> #
> # Here is the result of a remote access. We see that the max_nchar dimension
> # of 'initial_condition_file' has been replaced with maxStrlen=64. As a result,
> # the output variable is truncated. I would like to know how to prevent this.
> #
> ncdump -v initial_condition_file http://dapds00.nci.org.au/thredds/dodsC/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/TSUNAMI_EVENTS/unit_source_statistics_sunda.nc | head -n37
> netcdf unit_source_statistics_sunda {
> dimensions:
> table_rows = UNLIMITED ; // (492 currently)
> maxStrlen64 = 64 ;
> max_nchar = 175 ;
> variables:
> int table_rows(table_rows) ;
> table_rows:long_name = "dimension for rows of table" ;
> double lon_c(table_rows) ;
> double lat_c(table_rows) ;
> double depth(table_rows) ;
> double strike(table_rows) ;
> double dip(table_rows) ;
> int rake(table_rows) ;
> int slip(table_rows) ;
> double length(table_rows) ;
> double width(table_rows) ;
> int downdip_number(table_rows) ;
> int alongstrike_number(table_rows) ;
> int subfault_number(table_rows) ;
> double max_depth(table_rows) ;
> int max_nchar(max_nchar) ;
> max_nchar:long_name = "dimension giving maximum number of
> string characters" ;
> char initial_condition_file(table_rows, maxStrlen64) ;
> char tide_gauge_file(table_rows, maxStrlen64) ;
> 
> // global attributes:
> :discretized_sources_file =
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/all_discretized_sources.RDS"
> ;
> :parent_script_name =
> "/short/w85/tsunami/MODELS/AustPTHA_c/SOURCE_ZONES/sunda/TSUNAMI_EVENTS/make_all_earthquake_events.R"
> ;
> :R_session_info = "R version 3.3.0 (2016-05-03) ; Platform:
> x86_64-pc-linux-gnu (64-bit) ; Running under: CentOS release 6.9 (Final) ;
> ; locale: ; [1] C ;  ; attached base packages: ; [1] methods   stats
> graphics  grDevices utils     datasets  base      ;  ; other attached
> packages: ;  [1] rptha_0.0.79     ncdf4_1.15       geometry_0.3-6
> magic_1.5-6      ;  [5] abind_1.4-3      minpack.lm_1.2-0 FNN_1.1
> raster_2.5-8     ;  [9] geosphere_1.5-5  rgdal_1.1-10     sp_1.2-3
> rgeos_0.3-21     ;  ; loaded via a namespace (and not attached): ; [1]
> Rcpp_0.12.5     grid_3.3.0      lattice_0.20-33" ;
> :_DODS_Unlimited_Dimension = "table_rows" ;
> data:
> 
> initial_condition_file =
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/Unit_",
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/Unit_",
> "/g/data1a/fj6/PTHA/AustPTHA_1/SOURCE_ZONES/sunda/EQ_SOURCE/Unit_",
> 
> 

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: AVS-567793
Department: Support netCDF
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.