
[TIGGE #EGL-584516]: LDM regexp matching



Baudouin,

> I assume the regex engine is the same as perl's.

The LDM system uses extended regular expressions (ERE) as defined in Version 2 
of the Single UNIX Specification:

    http://www.opengroup.org/onlinepubs/007908799/xbd/re.html#tag_007_004

Note that this is not the same as the regular expressions used by the perl(1) 
utility.

> \d{8} is the same as
> \d\d\d\d\d\d\d\d means "8 digits" (in this case this matches the date).

The ERE "[0-9]{8}" corresponds to the perl(1) RE "\d{8}".
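
As a rough illustration of that equivalence, here is a minimal Python sketch. 
Python's re module is Perl-like rather than a POSIX ERE engine, so it accepts 
both spellings; it is used here only because the two patterns behave alike in 
it. The product identifier is hypothetical and is not taken from real data.

```python
import re

# Python's re module is Perl-like, not POSIX ERE; it accepts both spellings.
# An ERE only guarantees the bracket form, so an LDM pattern should use it.
ERE_STYLE = re.compile(r"[0-9]{8}")
PERL_STYLE = re.compile(r"\d{8}")

# Hypothetical product identifier, used only for illustration.
product_id = "tigge_example_20060331_1200.grib:40"

# Both patterns find the first run of eight digits (the date).
ere_match = ERE_STYLE.search(product_id)
perl_match = PERL_STYLE.search(product_id)
print(ere_match.group(0))
print(perl_match.group(0))
```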

> It seems that the patterns such as
> "\(00|20|40|60|80\)$" dont behave as I expect. I expect this to match
> anything that finishes by 00 or 20 or 40 or 60 or 80. It seems that this
> matches anything with 00 or 20 or 40 ... anywhere, not just at the end.

The ERE "(00|20|40|60|80)$" is what you want in place of 
"\(00|20|40|60|80\)$", if I understand the intent correctly: in an ERE, 
grouping parentheses are written without backslashes.

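The grouped, end-anchored form can be tried out with a short Python sketch. 
This assumes that Python's re module treats this particular pattern the same 
way an ERE engine would; it is not the LDM's own matcher. The product 
identifiers are hypothetical.

```python
import re

# Unescaped parentheses group the alternation, so "$" applies to every branch.
PATTERN = re.compile(r"^tigge.*\.grib:(00|20|40|60|80)$")

# Hypothetical product identifiers, used only for illustration.
candidates = [
    "tigge_example_20060331_1200.grib:40",
    "tigge_example_20060331_1200.grib:51",
    "tigge_example_20060331_1200.grib:80",
]

# Keep only the identifiers whose final field is one of the listed values.
accepted = [pid for pid in candidates if PATTERN.search(pid)]
for pid in accepted:
    print(pid)
```

Only the identifiers ending in ":40" and ":80" are accepted; the "20" and "60" 
inside the date do not count because the group is tied to the end of the 
string.
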
> For example
> tigge_ecmf_pf_20060331_1200_192_12_potential_temperature_0.grib:51
> matches "^tigge.*\.grib:\(00|20|40|60|80\)$" (the match is on the 60 in
> the date).

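The string "tigge_ecmf_pf_20060331_1200_192_12_potential_temperature_0.grib:51" 
does match the ERE "^tigge.*\.grib:\(00|20|40|60|80\)$", but not in the way you 
intend.  In an ERE, "\(" and "\)" match literal parenthesis characters and "|" 
has the lowest precedence, so the pattern is an alternation of five branches: 
"^tigge.*\.grib:\(00", "20", "40", "60", and "80\)$".  The middle three are 
unanchored, so they match anywhere in the string, including the "20" and "60" 
in the date.  Matching on the last digit of the year and the first digit of the 
month is not robust.  The string does not match the ERE 
"^tigge.*\.grib:(00|20|40|60|80)$", which anchors every alternative at the end.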

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: EGL-584516
Department: Support IDD TIGGE
Priority: Normal
Status: Open