An Introduction to Regular Expressions Matching Rules 41 of 50 : [PREV] [NEXT]

Rule 1

The Engine tries to match as far left in the string as it can, so that the entire regular expression (not necessarily the entire string) matches according to Rule 2.

In order to do this, its first choice is to start just before the first character, and then try to match the entire regular expression. The regular expression matches if and only if the Engine reaches the end of the regular expression before it runs off the end of the string. If it matches, it quits immediately. Otherwise, the process continues.

The match doesn't have to reach the end of the string, unless there's an assertion in the regular expression that instructs it to do so. Before advancing to the second position (between the first and second characters), the Engine tries all of the possibilities, from right to left. If it exhausts all possibilities at the first position, it realizes that its very first choice was wrong, and proceeds to its second choice. It goes to the second position in the string and tries all the possibilities again. The pattern match as a whole doesn't fail until it has tried to match the entire regular expression at every position in the string.

The positions the Engine tries to match are between the characters. Consider the pattern /t*/, which can match zero or more t's. If you try this pattern on the string "Bart" it matches the null string before the first character of the string, rather than matching the "t" later in the string. Any regular expression that can match the null string (zero-width) is guarenteed to match at the leftmost position in the string.

The following sequence of strings shows how the engine matches the pattern /ar/ to the string "Bart". The red code highlights the substring that the Engine is testing against the pattern:

Bart // Does "Bart" match? ... NO
Bart // Does "t" match? ... NO
Bart // Does "rt" match? ... NO
Bart // Does "art" match? ... NO
Bart // Does "" (the null string) match? ... NO
Bart // Does "art" match? ... NO
Bart // Does "ar" match? ... YES

© 2003 Barbie barbie@missbarbell.co.uk Home http://birmingham.pm.org/