[Support #NKX-533675]: Performance issues with large LDM queues



Baudouin,

> Most of the
> time is spent between mmap'ing the file containing the product to send
> (mmap file descriptor 4) and the next system call (rt_sigprocmask). The
> system calls themselves are fast.  My guess is that we are seeing poor
> disk performance. As the file is mmap'ed instead of (f)read, we don't
> see the I/Os in the trace. But computing the md5 on the file will
> force the mapped pages to be read from disk into memory (thus the
> hidden I/Os). The time between the two system calls is 1/3 of a
> second, which is consistent with the rate of 3 products a second that
> is observed.

The MD5 algorithm is computationally expensive -- so it's entirely
reasonable for it to take about one-third of a second on a 430 KB file
(that rate is comparable to what we've seen for GINI imagery).
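
A rough way to separate the cost of the MD5 computation from the cost
of the hidden page-fault I/O is to time the checksum over a
memory-mapped file twice in a row. This is only a sketch in Python,
not code from the LDM; the function name is hypothetical. The first
(cold-cache) run includes any disk reads, while an immediate second
run should mostly reflect the hashing itself.

```python
import hashlib
import mmap
import time


def time_md5_of_mapped_file(path):
    """Return (hex_digest, seconds) for an MD5 computed over a memory-mapped file.

    Hypothetical helper for illustration only; it is not part of the LDM.
    The file must be non-empty, because an empty file cannot be mapped.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            start = time.perf_counter()
            # Hashing touches every mapped page, so any page not yet in
            # memory is read from disk inside this call.
            digest = hashlib.md5(mapped).hexdigest()
            elapsed = time.perf_counter() - start
    return digest, elapsed
```

If the second run is much faster than the first, the time is going
into disk reads; if both runs take about the same time, the checksum
computation dominates.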

One solution would be to compute the MD5 checksum on something much 
smaller than the file, but that is still unique to that data-product.
I notice that the name of the file

    z_tigge_c_ecmf_20070119120000_glob_prod_cf_pl_0000_000_1000_u.grib

is quite complicated.  Is it unique to that data-product?  If so, then
I can add an option to pqinsert(1) so that it only uses the product-
identifier (i.e., filename) in computing the MD5 checksum.  That would
be much faster and should greatly increase the rate at which data-
products can be inserted.
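
To illustrate the difference, here is a minimal sketch in Python (not
the pqinsert(1) implementation, which is written in C; both function
names are hypothetical). The first function hashes the whole file, so
its cost grows with the product size. The second hashes only the
product-identifier, so its cost is the same tiny amount for every
product, but the result is unique only if the identifier is.

```python
import hashlib


def signature_from_content(path):
    """MD5 over the entire file: cost is proportional to the file size."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large products need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    return md5.digest()


def signature_from_identifier(product_id):
    """MD5 over the product-identifier only: cost is independent of file size.

    Assumes the identifier (e.g., the filename) is unique to the product.
    """
    return hashlib.md5(product_id.encode("utf-8")).digest()
```

Both functions return a 16-byte digest, so either one could fill the
same signature slot. With the identifier-based variant, two different
products that share an identifier would get the same signature.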

> Now, the reason why a large queue would impact the I/O
> speed needs to be found. My guess is that having a large queue will
> somehow have an effect on how the operating system caches the I/Os,
> and therefore we do a lot more real I/Os.

If the problem lies with the computation of the MD5 checksum, then
the size of the product-queue shouldn't matter.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: NKX-533675
Department: Support IDD TIGGE
Priority: Normal
Status: Closed