
[LDM #MKN-277053]: New LDM server - space to allocate?



Hi Chris,

re:
> We are creating a new VM to exclusively run the LDM and store data.  What
> is a reasonable storage allocation for the VM, assuming we keep the data a
> reasonable amount of time (~2 weeks) and that we request what are
> considered "normal" model, satellite, and other data feeds?
> 
> Is there a way I can ballpark this?

"normal" covers a LOT of ground, so it is hard to be specific.

One way that you could get a ballpark estimate for the disk space
you will need is to look at the volumes of the various LDM/IDD feeds
that you would be REQUESTing.

Here is one way of doing that:

- get a listing of feed volumes from one of the real-server backend
  machines that comprise the idd.unidata.ucar.edu cluster.  Start at
  the rtstats site index:

    https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/siteindex

  select node5.unidata.ucar.edu:

    https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/siteindex?node5.unidata.ucar.edu

  then open its Cumulative volume summary:

    https://rtstats.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?node5.unidata.ucar.edu

      Data Volume Summary for node5.unidata.ucar.edu

Maximum hourly volume  87849.748 M bytes/hour
Average hourly volume  54352.219 M bytes/hour

Average products per hour     449703 prods/hour

Feed                           Average             Maximum     Products
                     (M byte/hour)            (M byte/hour)   number/hour
SATELLITE             14228.577    [ 26.178%]    19361.034     6309.826
CONDUIT               11166.015    [ 20.544%]    34298.614   105307.152
NGRID                 10188.881    [ 18.746%]    14965.667    68238.717
NOTHER                 6079.393    [ 11.185%]     9394.044    12007.565
NIMAGE                 5733.486    [ 10.549%]     8584.315     5996.087
NEXRAD2                3778.526    [  6.952%]     4321.491    70816.870
HDS                    1259.123    [  2.317%]     1695.630    38361.870
NEXRAD3                1257.424    [  2.313%]     1435.883    93302.739
GEM                     351.352    [  0.646%]     2459.380     2163.587
IDS|DDPLUS              109.889    [  0.202%]      124.249    46573.130
UNIWISC                  97.565    [  0.180%]      140.039       50.370
FNEXRAD                  90.463    [  0.166%]      105.055      104.652
EXP                      10.522    [  0.019%]       16.230      113.087
LIGHTNING                 0.901    [  0.002%]        1.509      356.609
GPS                       0.104    [  0.000%]        1.052        1.022

Depending on how you intend to use the data (i.e., which display and
visualization package you want to use), you will REQUEST different sets
of the feeds listed above.

If you are a GEMPAK user, and you want to get GOES-16/17 satellite data
that is usable in GEMPAK, you will want to REQUEST the NIMAGE feed.  You
will, for sure, want all of the IDS|DDPLUS feed, and I assume that you
will want all of the NEXRAD level 3 national composite
imagery in the FNEXRAD feed.  You may well also want all of the imagery
in the UNIWISC feed.

The biggest question then becomes: which of the feeds
that contain model output will you want/need?  The feeds in question
are HDS, NGRID, GEM and CONDUIT.  As you can see from the hourly
volumes, CONDUIT is a LARGE feed with up to 34 GB in some hours.
NGRID is smaller, but it does not have the 0.25 degree GFS model
output.

As a guesstimate, let's assume that you will want to get the
following set of feeds:

CONDUIT               11166.015    [ 20.544%]    34298.614   105307.152
NGRID                 10188.881    [ 18.746%]    14965.667    68238.717
NIMAGE                 5733.486    [ 10.549%]     8584.315     5996.087
HDS                    1259.123    [  2.317%]     1695.630    38361.870
GEM                     351.352    [  0.646%]     2459.380     2163.587
IDS|DDPLUS              109.889    [  0.202%]      124.249    46573.130
UNIWISC                  97.565    [  0.180%]      140.039       50.370
FNEXRAD                  90.463    [  0.166%]      105.055      104.652
LIGHTNING                 0.901    [  0.002%]        1.509      356.609
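As a sketch, the corresponding REQUEST entries in your LDM's ldmd.conf
might look like the following.  The feed names mirror the rtstats
summary above, and <upstream-host> is a placeholder; use the upstream
LDM server(s) your site has been authorized to feed from:

```
# Hypothetical ldmd.conf REQUEST entries for the guesstimated feed set.
# Replace <upstream-host> with your allowed upstream LDM server.
REQUEST CONDUIT    ".*" <upstream-host>
REQUEST NGRID      ".*" <upstream-host>
REQUEST NIMAGE     ".*" <upstream-host>
REQUEST HDS        ".*" <upstream-host>
REQUEST GEM        ".*" <upstream-host>
REQUEST IDS|DDPLUS ".*" <upstream-host>
REQUEST UNIWISC    ".*" <upstream-host>
REQUEST FNEXRAD    ".*" <upstream-host>
REQUEST LIGHTNING  ".*" <upstream-host>
```

The ".*" pattern requests everything in each feed; you can narrow any
of these with a more specific extended regular expression later.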

The total volume for this set is on the order of 29-30 GB per hour.
So, if you wanted to keep two weeks of all of the data, you would need:

14 * 24 * 30 GB, or about 10 TB of disk, to store the raw data.  Now, if
you are only going to be using GEMPAK, you would not be keeping the raw
data around after decoding it into GEMPAK format files, so the amount of
disk used would be less.
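The arithmetic above can be sketched as a quick back-of-envelope
calculation, using the average hourly volumes copied from the node5
summary:

```python
# Back-of-envelope disk estimate for the guesstimated feed set.
# Average hourly volumes (M bytes/hour) from the rtstats summary above.
feeds = {
    "CONDUIT":    11166.015,
    "NGRID":      10188.881,
    "NIMAGE":      5733.486,
    "HDS":         1259.123,
    "GEM":          351.352,
    "IDS|DDPLUS":   109.889,
    "UNIWISC":       97.565,
    "FNEXRAD":       90.463,
    "LIGHTNING":      0.901,
}

hourly_mb = sum(feeds.values())       # ~29,000 M bytes/hour
hourly_gb = hourly_mb / 1000.0        # ~29 GB/hour
two_weeks_gb = 14 * 24 * hourly_gb    # days * hours/day * GB/hour

print(f"{hourly_gb:.1f} GB/hour -> {two_weeks_gb / 1000.0:.1f} TB for 2 weeks")
# -> 29.0 GB/hour -> 9.7 TB for 2 weeks
```

Rounding the hourly average up to 30 GB, as done above, gives the
~10 TB figure; remember these are averages, so busy hours will burst
well above that rate.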

As a quick comparison, we keep about 2 weeks of all data REQUESTed
and processed on our servers (lead.unidata.ucar.edu and atm.ucar.edu)
and we are using 24 TB of disk.  But, we are also keeping raw data and
data decoded into other formats (e.g., McIDAS), so our use is way more
than what you would likely need/want.

I hope that the above was somewhat useful, but it is very "hand wavy"
at best.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: MKN-277053
Department: Support LDM
Priority: Normal
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.