[IDD #TTR-495389]: Slow access to local datasets through ADDE
- To: zehel@xxxxxxx
- Subject: [IDD #TTR-495389]: Slow access to local datasets through ADDE
- From: "Unidata McIDAS Support" <support-mcidas@xxxxxxxxxxxxxxxx>
- Date: Tue, 05 Feb 2008 16:24:21 -0700
Hi Samuel,
re:
> O.K. How should I start looking into how I can divide the datasets? Is there a tutorial
> on this kind of thing?
No, there is no tutorial on organizing datasets to improve ADDE access, mainly because the
guiding principle is straightforward: the time needed to access a dataset of TYPE=IMAGE
is directly proportional to the number of images in the dataset.
The typical Unidata site running McIDAS keeps about one day or less of any type of
image online. For the Visible images in the IDD NIMAGE datastream, this would mean
keeping 96 images in the Vis datasets GINIEAST/GE1KVIS and/or GINIWEST/GW1KVIS.
Sites wanting to keep more than this have typically done so by FILEing the images into
a directory hierarchy where each day's images are in a single directory. This is the
reason I added the "replaceable" \CURDAY to the expressions that can be used in the
DSSERVE DIRFILE= keyword. \CURDAY gets expanded to the current date expressed as CCYYMMDD
where:
CC -> century
YY -> year
MM -> month
DD -> day
Examples:
20080205
20071231
etc.
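The expansion is simple enough to sketch in a few lines of Python. This is only an illustration of the CCYYMMDD format, not the McIDAS implementation: the function name expand_curday, the example path, and the use of the UTC date when no date is supplied are all assumptions made for the sketch.

```python
from datetime import date, datetime, timezone


def expand_curday(template, today=None):
    """Replace the literal token \\CURDAY in template with a CCYYMMDD date.

    Illustration only: McIDAS performs this expansion internally. Falling
    back to the current UTC date is an assumption of this sketch.
    """
    if today is None:
        today = datetime.now(timezone.utc).date()
    return template.replace("\\CURDAY", today.strftime("%Y%m%d"))


# Hypothetical directory layout, used here only to show the expansion.
print(expand_curday("/data/gini/VIS/\\CURDAY/*", date(2008, 2, 5)))
print(expand_curday("/data/gini/VIS/\\CURDAY/*", date(2007, 12, 31)))
```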
So, my recommendation is to create/modify your LDM pqact.conf action(s) to FILE or decode the
images into a directory structure where images are organized at some level by their
date. It is also convenient to organize by image type: VIS, IR, WV, etc.
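As a rough sketch of such a layout, the Python below maps an image type and image time to a per-type, per-day directory. The base directory /data/gini, the type names, and the function name image_directory are hypothetical; the actual paths are whatever you choose in your pqact.conf actions.

```python
from datetime import datetime
from pathlib import PurePosixPath


def image_directory(base, image_type, image_time):
    """Return base/<image type>/<CCYYMMDD> as the directory for one image."""
    return PurePosixPath(base) / image_type / image_time.strftime("%Y%m%d")


# Hypothetical base directory; each type and day gets its own directory,
# so no single directory grows beyond about one day of images.
for image_type, image_time in [
    ("VIS", datetime(2008, 2, 5, 19, 32, 24)),
    ("IR", datetime(2008, 2, 5, 19, 32, 24)),
    ("VIS", datetime(2008, 2, 6, 0, 2, 0)),
]:
    print(image_directory("/data/gini", image_type, image_time))
```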
The absolute worst thing to do as far as performance is concerned is to FILE/decode
all images into a single output directory. This is bad for ADDE access to the data
AND for operating system management of the data.
> P.S. I've noticed something interesting about our GINIEAST/GINI 1km VIS East CONUS data. The
> most recent images are sometimes 1 hour behind, and sometimes 15 minutes behind. Is this due
> to shu.cup.edu being close to memory overload?
Since the latencies that shu.cup.edu is reporting in its real-time stats for NIMAGE images are very low:
http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?NIMAGE+shu.cup.edu
it is likely that it is taking a long time to process data out of your LDM queue. The
solution for situations like this is to divide LDM pqact processing into several
pqact invocations. If the LDM setup on shu is more or less the same as it was
when I helped set up McIDAS, you shouldn't be experiencing slow processing of received
products out of the LDM queue. However, a system that is bogged down by I/O waits, etc.
could cause slow processing of newly received products out of the LDM queue. I have
seen installations have this kind of problem when a LOT of files are being continually
written to disk AND lots of scouring of older data is occurring. Given your various notes
about trying to keep lots of data online, I suspect that your system I/O is the
cause of the slow processing of newly received products.
> Oh, I've also noticed that occasionally we're missing GINI East VIS 1km image (e.g. add.ucar.edu
> has 2008-02-05 19:32:24Z, but shu.cup.edu does not). Again, might this be due to memory overload?
If you are routinely seeing gaps in image filing, it is possible that you are ingesting
data at a much faster rate than you are able to process it. In that case, products that have
been received might get overwritten while in the LDM queue before a pqact action can
do something with them (e.g., FILE them). Your options then are to tune the processing
of data out of your LDM queue (by splitting pqact tasks among multiple pqact invocations); increase
your LDM queue size (this option is limited by the amount of physical RAM you have and
your machine's architecture (32-bit vs 64-bit)); or request less data from upstream sites.
The option you choose will depend on what your objectives are.
I see that you are receiving 1.3 GB of data per hour on average with maximums of up
to 3.5 GB per hour:
http://www.unidata.ucar.edu/cgi-bin/rtstats/rtstats_summary_volume?shu.cup.edu
I seem to remember that you have 1.5 GB of RAM on shu (true?), and you have an LDM
queue that is 1 or 1.5 GB in size (if not, what size is your LDM queue currently?).
The periods when you are receiving data at a rate of 3.5 GB/hour will result in
a queue residency time of between 17 and 25 minutes. If your system is bogged down
by I/O waits, it would seem likely that you would not be able to process products already
received before they get overwritten by newly received products.
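The residency estimate above is just queue size divided by ingest rate. A minimal Python sketch of that arithmetic follows; it assumes the whole queue is available for product data and that data arrives at a steady rate, which is a simplification, and the function name residency_minutes is made up for the illustration.

```python
def residency_minutes(queue_gb, ingest_gb_per_hour):
    """Approximate minutes a product stays in the queue before being overwritten.

    Simplified model: the entire queue holds product data and the ingest
    rate is constant over the period.
    """
    return queue_gb / ingest_gb_per_hour * 60.0


# Queue sizes and ingest rates taken from the figures quoted above.
for queue_gb in (1.0, 1.5):
    for rate in (1.3, 3.5):
        minutes = residency_minutes(queue_gb, rate)
        print(f"{queue_gb:.1f} GB queue at {rate:.1f} GB/hour: {minutes:.1f} minutes")
```

At the peak rate of 3.5 GB/hour this gives a little over 17 minutes for a 1 GB queue and a little under 26 minutes for a 1.5 GB queue, in line with the range quoted above.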
Cheers,
Tom
****************************************************************************
Unidata User Support UCAR Unidata Program
(303) 497-8642 P.O. Box 3000
support@xxxxxxxxxxxxxxxx Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage http://www.unidata.ucar.edu
****************************************************************************
Ticket Details
===================
Ticket ID: TTR-495389
Department: Support McIDAS
Priority: Normal
Status: Closed