Update on 13.2.1 Grib Decoder Threads

Since the last update, which involved testing only the ingest and decoding of CONDUIT 0.5/2.5 degree GFS, I've opened up the NGRID and NEXRAD3 feeds, as well as text and satellite products from the WMO and NIMAGE feeds, respectively.

The goal is to compare the speed of the grid decoder on high-resolution CONDUIT GFS runs alone versus running in parallel with the full nationwide NEXRAD3 feed and other products.

So far so good.  I was using 8 grid decoder threads up until roughly 0000 UTC April 26, after which I increased the count to 12.  While there is a noticeable decrease in decoding latency (maxing out in the 1000-2000 second range rather than 2000-3000 seconds), this could just as well be caused by a reduction in the file size of the GFS, given the relative lack of atmospheric activity since April 26 compared to the previous week.

Regardless of the reason, we have shown that the payoff from increasing the thread count from 8 to 12 is diminishing: we are now limited more by Raw Data Store write times than by the time it takes the grid decoder to process files.

Even so, the total decoding time is good: if 1500 seconds (25 minutes) is the longest latency for the high-resolution GFS, we're in pretty good shape.  For comparison, using GEMPAK grib2 decoders with the LDM (the typical setup for GEMPAK users these days), the entire 0.5 degree GFS run takes about 70 minutes to be fully decoded and available on disk; for AWIPS II EDEX, this time is roughly 90 minutes.  That is encouraging.

Testing 13.2.1 Unified Grib Decoder on CONDUIT GFS

Last month we received the first version of AWIPS II to include the new unified grib decoder (13.1.2). The install procedure for 13.1.2 was more complicated than usual - we needed the full 13.1.1 installation plus a 13.1.2 "update" - so there were around 8 GB of RPMs to manage.

If you're unfamiliar with the unified grib decoder, here's a quick rundown: before 13.1.2, the D2D perspective (for WFOs) and the National Centers Perspective (for NCEP centers) required separate data decoders and database tables for grib messages. D2D used a decoder called grib, while NCP used one called ncgrib. Unless you wanted to bog down your system, you could only run one at a time, meaning that, depending on your server configuration, gridded model data would be visible in one perspective but not both.

This first version of the unified grib decoder had a number of problems, likely stemming from the complicated installation procedure (we do it a little differently here compared to an operational forecast office).

Now with the full AWIPS II 13.2.1 release from the NWS, I can finally test our method for increasing the number of decoder threads on CONDUIT data. This is the same method used to handle ingest of the entire NEXRAD3 feed (~190 radar sites), and is described in detail on the page linked at the bottom of this post.

As delivered, AWIPS II takes roughly 4 hours to decode the high-resolution CONDUIT 0.5 degree GFS with 4 threads, and the message broker bottleneck only gets worse when products are simultaneously being decoded by other plugins (obs, satellite, radar, etc.).

Unidata's solution is to increase the number of threads allocated to the grid (formerly called grib) decoder, a task which involves either rebuilding the EDEX core RPMs or editing the already-installed plugin-specific jar archives.  I managed to incorporate the new decoder thread settings into an add-on RPM, which I have added to the Unidata AWIPS II release (not available to the public at this time, sorry).
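
For the curious, the jar-editing route looks roughly like the sketch below. This is a minimal illustration only: it assumes the thread count lives as a threadCount attribute in a Spring XML file inside the plugin jar, and the jar path, entry name, and attribute name here are guesses for illustration, not the exact names AWIPS II uses. EDEX would need a restart afterwards to pick up the change.

    # Minimal sketch of patching a thread-count setting inside an installed
    # EDEX plugin jar. The jar path, Spring XML entry, and threadCount
    # attribute below are assumptions, not the exact AWIPS II names.
    import shutil
    import zipfile

    JAR = "/awips2/edex/lib/plugins/com.raytheon.edex.plugin.grib.jar"  # assumed
    ENTRY = "res/spring/grib-decode.xml"                                # assumed
    OLD, NEW = b'threadCount="4"', b'threadCount="12"'                  # assumed

    shutil.copy(JAR, JAR + ".bak")  # keep the original jar as a backup

    # Rewrite the archive entry by entry, patching the Spring config in passing.
    with zipfile.ZipFile(JAR + ".bak") as src, \
         zipfile.ZipFile(JAR, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == ENTRY:
                data = data.replace(OLD, NEW)
            dst.writestr(item, data)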

Initial results are promising: the total time to ingest and decode the 0.5 degree GFS on CONDUIT (25k files, ~3.6 GB) was just over 1 hour (roughly 6-7 files per second, or about 1 MB/s sustained).


I let this run for a few days to make sure the Qpid message queue remained active, since in the past I've noticed that high-volume grib message decoding can sometimes bottleneck the system even when dataflow throughput is low.  Here, it seems, the system can more than handle the 0.5/2.5 degree CONDUIT GFS.
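
As a side note, watching for that kind of bottleneck can be as simple as polling the broker's queue depths while a model run decodes. Here is a rough sketch, assuming the qpid-tools package (which provides the qpid-stat command) is installed on the EDEX host; it is a convenience hack, not part of AWIPS II itself.

    # Rough monitoring sketch: print the Qpid queue listing once a minute so a
    # growing queue depth (messages piling up faster than the decoders drain
    # them) is easy to spot. Assumes qpid-stat from qpid-tools is on the PATH.
    import subprocess
    import time

    def watch_queues(interval_s=60):
        while True:
            # 'qpid-stat -q' lists each queue on the broker with its depth.
            result = subprocess.run(["qpid-stat", "-q"],
                                    capture_output=True, text=True)
            print(time.strftime("%Y-%m-%d %H:%M:%S"))
            print(result.stdout)
            time.sleep(interval_s)

    if __name__ == "__main__":
        watch_queues()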

Of course, ingesting CONDUIT grids alone is one thing; ingesting them alongside point, satellite, and radar data is another. That's the next step.

For more info on modifying the number of decoder threads, see: http://www.unidata.ucar.edu/staff/mjames/awips2/docs/threads.html

-Michael

Introducing IDV 5.0 - Lynx!

Introducing IDV 5.0 - code name Lynx!

Chunking Data: Choosing Shapes

In part 1, we explained what data chunking is about in the context of scientific data access libraries such as netCDF-4 and HDF5, presented a 38 GB 3-dimensional dataset as a motivating example, discussed benefits of chunking, and showed with some benchmarks what a huge difference chunk shapes can make in balancing read times for data that will be accessed in multiple ways.

In this post, I'll continue looking at that example dataset to see how we can derive good chunk shapes, generalize to other datasets, look at how long it can take to rechunk a multidimensional dataset, and look at the use of Solid State Disk (SSD) for both accessing and rechunking data.


Chunking Data: Why it Matters

What is data chunking? How can chunking help to organize large multidimensional datasets for both fast and flexible data access?  How should chunk shapes and sizes be chosen?  Can software such as netCDF-4 or HDF5 provide better defaults for chunking? If you're interested in those questions and some of the issues they raise, read on ...

