
[THREDDS #ZIZ-818863]: Thredds inflation of data



We use the Apache HttpClient library
(http://hc.apache.org/httpcomponents-client-4.5.x/),
so any fix will need to be made with respect to that.
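
For reference, a minimal sketch of where that sits on the client side: with a
default HttpClient 4.5 client, the request advertises gzip support and a
gzip-encoded response entity is transparently decompressed before getContent()
hands the bytes back, so any change to the decompression behaviour has to hook
in around this layer. The URL below is a placeholder, and this is plain
HttpClient rather than the actual netCDF-Java HTTP wrapper.

import java.io.InputStream;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class FetchSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; brackets in the DAP constraint are percent-encoded
        // because java.net.URI rejects them unencoded in the query.
        String url = "http://server:8080/thredds/dodsC/some/dataset.nc.ascii"
                   + "?var%5B0%5D%5B0%5D%5B0:1:10%5D";

        // The default client sends Accept-Encoding: gzip,deflate and, when the
        // server answers with Content-Encoding: gzip, wraps the entity so that
        // getContent() yields already-inflated bytes.
        try (CloseableHttpClient client = HttpClients.createDefault();
             CloseableHttpResponse response = client.execute(new HttpGet(url))) {
            try (InputStream in = response.getEntity().getContent()) {
                byte[] buf = new byte[64 * 1024];
                long total = 0;
                int n;
                while ((n = in.read(buf)) != -1) {
                    total += n;
                }
                System.out.println("read " + total + " decompressed bytes");
            }
        }
    }
}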

My speculation is that there are two related issues that need investigation.
1. Chunking -- one large response is chunked into multiple smaller chunks on the
   server side that are then reassembled on the client side.
2. A specific compressor -- GzipCompressingEntity, I think -- is used to do the
   actual compression on the server side.

I do not know the order in which these are used by the server side. It is
possible that the compressor operates first and then the chunker divides that
compressed output. It is also possible that the chunker runs first and the
compressor operates on each separate chunk.
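
For illustration, here is a minimal sketch of the first ordering (compress
first, then chunk) as it would look with GzipCompressingEntity. As far as I can
tell, that wrapper reports an unknown content length and marks itself as
chunked, so an HTTP/1.1 transport would write the compressed bytes with chunked
transfer encoding. This is only a sketch of that scenario with a placeholder
payload, not the actual TDS code path.

import java.io.File;

import org.apache.http.HttpEntity;
import org.apache.http.client.entity.GzipCompressingEntity;
import org.apache.http.entity.ContentType;
import org.apache.http.entity.FileEntity;

public class CompressThenChunkSketch {
    public static void main(String[] args) {
        // Placeholder payload; the real response body would come from the DAP servlet.
        HttpEntity raw = new FileEntity(new File("response-body.bin"),
                                        ContentType.APPLICATION_OCTET_STREAM);

        // GzipCompressingEntity compresses the wrapped entity as it is written
        // out. Because the compressed length is not known up front, the entity
        // is sent chunked, i.e. compression happens before chunking here.
        GzipCompressingEntity gzipped = new GzipCompressingEntity(raw);

        System.out.println("content length: " + gzipped.getContentLength()); // expected -1 (unknown)
        System.out.println("chunked:        " + gzipped.isChunked());        // expected true
        System.out.println("encoding:       " + gzipped.getContentEncoding().getValue()); // gzip
    }
}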

We will need to investigate to see which is the case (if either) and then
figure out how to change the chunking and/or the compression parameters. I
suspect that sending very large (1.6 MB) chunks is a bad idea, so I would hope
we can set things up so that the compression happens first and the compressed
data is then chunked.
Note that this will also require a corresponding change on the client side.
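
To make the client-side change concrete, here is a hedged sketch of one
possible knob. It assumes the compressed response is ultimately inflated
through java.util.zip and that the stream currently uses the 512-byte default
buffer of InflaterInputStream/GZIPInputStream (which would at least be
consistent with the 512-byte inflate calls in the log below); neither
assumption is confirmed, and the helper is hypothetical rather than existing
Thredds code.

import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class LargeBufferInflate {
    /**
     * Wraps a gzip-compressed stream so that the underlying Inflater is fed
     * input in 64 KB blocks instead of the 512-byte default. The 64 KB figure
     * is an arbitrary example, not a recommendation.
     */
    public static InputStream inflate(InputStream compressed) throws IOException {
        return new GZIPInputStream(compressed, 64 * 1024);
    }
}

Whether this is the right place to intervene depends on where the 512-byte
reads actually originate, which is part of what needs investigating.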

In any case, this is going to take a while for me to figure out.


=======================
> I am currently trying to get a compression accelerator to work with Thredds, 
> with the aim of reducing the CPU time that Thredds spends decompressing chunks 
> of data.  The compression card plugs straight into IBM Java 8 and 
> java.util.zip with no changes needed to the application that uses it.  
> However, the replacement code will always revert to software inflation when 
> the size of the data passed to java.util.zip is less than 16384 bytes.
> 
> Our data is chunked and compressed, with the data we want to retrieve ending 
> up as 70  ~1.6MB chunks (compressed), which should inflate to ~7MB each (see 
> the hdls.txt file for more detail).
> 
> When requesting the data, I use the following URL to request a single chunk 
> (our applications run through the 70 chunks sequentially, selecting one at a 
> time; in this example I'm just picking a single chunk):
> 
> http://dvtds02-zvopaph2:8080/thredds/dodsC/decoupler/mhtest/original.nc.ascii?air_temperature[0][43][0:1:1151][0:1:1535]
> 
> While I expect there may be a few smaller inflate operations at the start/end 
> of the request, I'd expect a single 1.6MB --> 7MB inflate request in there. 
> Instead, in the compression software logs I see thousands of 512-byte inflate 
> requests, which, being smaller than the 16384-byte minimum the compression 
> card supports, never get passed to the compression card.
> 
> e.g.
> 
> 2017-10-23T12:55:31.643894+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
> inflate:   flush=1 next_in=0x9b917b198 avail_in=512 next_out=0x9b917b3b8 
> avail_out=9200 total_in=2048 total_out=31827 crc/adler=1b38b342
> 2017-10-23T12:55:31.644229+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
>            flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917c422 
> avail_out=4998 total_in=2560 total_out=36029 crc/adler=e835c747 rc=0
> 2017-10-23T12:55:31.644541+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
> inflate:   flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917b3b8 
> avail_out=9200 total_in=2560 total_out=36029 crc/adler=e835c747
> 2017-10-23T12:55:31.644909+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
>            flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917b3b8 
> avail_out=9200 total_in=2560 total_out=36029 crc/adler=e835c747 rc=-5
> 2017-10-23T12:55:31.645234+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
> inflate:   flush=1 next_in=0x9b917b198 avail_in=512 next_out=0x9b917b3b8 
> avail_out=9200 total_in=2560 total_out=36029 crc/adler=e835c747
> 2017-10-23T12:55:31.645568+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
>            flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917c47a 
> avail_out=4910 total_in=3072 total_out=40319 crc/adler=f1a70cdc rc=0
> 2017-10-23T12:55:31.645879+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
> inflate:   flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917b3b8 
> avail_out=9200 total_in=3072 total_out=40319 crc/adler=f1a70cdc
> 2017-10-23T12:55:31.646199+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
>            flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917b3b8 
> avail_out=9200 total_in=3072 total_out=40319 crc/adler=f1a70cdc rc=-5
> 2017-10-23T12:55:31.646511+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
> inflate:   flush=1 next_in=0x9b917b198 avail_in=512 next_out=0x9b917b3b8 
> avail_out=9200 total_in=3072 total_out=40319 crc/adler=f1a70cdc
> 2017-10-23T12:55:31.646847+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
>            flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917c272 
> avail_out=5430 total_in=3584 total_out=44089 crc/adler=8dba79f4 rc=0
> 2017-10-23T12:55:31.647166+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
> inflate:   flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917b3b8 
> avail_out=9200 total_in=3584 total_out=44089 crc/adler=8dba79f4
> 2017-10-23T12:55:31.647490+00:00 dvtds02-zvopaph2 server: ### [0x3ff183456a0] 
>            flush=1 next_in=0x9b917b398 avail_in=0 next_out=0x9b917b3b8 
> avail_out=9200 total_in=3584 total_out=44089 crc/adler=8dba79f4 rc=-5
> 
> Happy to send across the data file I'm using as an example; please let me 
> know if you need any other info.
> 
> Thanks
> 
> Martyn
> 
> Martyn Hunt
> Technical Lead, Mainframe
> Met Office  FitzRoy Road  Exeter  Devon  EX1 3PB  United Kingdom
> Tel: +44 (0)1392 884897
> Email: address@hidden  Website: www.metoffice.gov.uk
> 
> 
> 

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: ZIZ-818863
Department: Support THREDDS
Priority: Normal
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.