Consider the following toy example. In Go, I want to match a name with a regexp, where a name is one or more runs of the letter a separated by single # characters, so a#a#aaa is valid, but a# and a##a are not. I can write the regexp in the following two ways:
r1 := regexp.MustCompile(`^a+(#a+)*$`)
r2 := regexp.MustCompile(`^(a+#)*a+$`)
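A quick way to confirm that the two forms agree on the examples above is a small check like the following. This is only a sketch: the helper name check is made up for illustration, and the two patterns are declared at package level rather than with := so the helper can use them.

```go
package main

import (
	"fmt"
	"regexp"
)

// The two equivalent forms of the name regexp from above.
var (
	r1 = regexp.MustCompile(`^a+(#a+)*$`)
	r2 = regexp.MustCompile(`^(a+#)*a+$`)
)

// check returns the verdict of r1 and r2 for the same input.
// (check is a hypothetical helper, not part of the regexp package.)
func check(s string) (m1, m2 bool) {
	return r1.MatchString(s), r2.MatchString(s)
}

func main() {
	// One valid name followed by the two invalid ones from the text.
	for _, s := range []string{"a#a#aaa", "a#", "a##a"} {
		m1, m2 := check(s)
		fmt.Printf("%-10q r1=%v r2=%v\n", s, m1, m2)
	}
}
```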
Both of these work. Now consider the more complex task of matching a sequence of names separated by single slashes. As above, I can code that in two ways:
^N(/N)*$
^(N/)*N$
where N is a regexp for the name with ^ and $ stripped. Since I have two forms for N, I can now write 4 regexps:
^a+(#a+)*(/a+(#a+)*)*$
^(a+#)*a+(/a+(#a+)*)*$
^((a+#)*a+/)*a+(#a+)*$
^((a+#)*a+/)*(a+#)*a+$
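To make it explicit how these four patterns are assembled, here is a minimal sketch that builds them by plain string concatenation. It assumes each slot in the outer pattern receives the name regexp exactly once, which is what the four expansions above correspond to. The names n1, n2, headFirst, tailLast and patterns are made up for illustration.

```go
package main

import "fmt"

const (
	n1 = `a+(#a+)*` // first form of the name regexp, anchors stripped
	n2 = `(a+#)*a+` // second form of the name regexp, anchors stripped
)

// headFirst builds ^H(/T)*$: one name, then zero or more "/name" groups.
func headFirst(h, t string) string { return "^" + h + "(/" + t + ")*$" }

// tailLast builds ^(H/)*T$: zero or more "name/" groups, then one name.
func tailLast(h, t string) string { return "^(" + h + "/)*" + t + "$" }

// patterns returns the four combinations in the order listed above.
func patterns() []string {
	return []string{
		headFirst(n1, n1),
		headFirst(n2, n1),
		tailLast(n2, n1),
		tailLast(n2, n2),
	}
}

func main() {
	for _, p := range patterns() {
		fmt.Println(p)
	}
}
```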
The question is: why, when matching against the string "aa#a#a/a#a/a", does the first one fail while the other three work as expected? That is, what causes the first regexp to mismatch? The full code is:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := "aa#a#a/a#a/a"
	regs := []string{
		`^a+(#a+)*(/a+(#a+)*)*$`,
		`^(a+#)*a+(/a+(#a+)*)*$`,
		`^((a+#)*a+/)*a+(#a+)*$`,
		`^((a+#)*a+/)*(a+#)*a+$`,
	}
	for _, r := range regs {
		fmt.Println(regexp.MustCompile(r).MatchString(str))
	}
}
Surprisingly, it prints false true true true (one value per line).
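As a cross-check that does not depend on the regexp package, here is a minimal sketch of a hand-written validator for the same grammar. The helper names allA, isName and isPath are made up for illustration. It accepts the test string, which is why I expect every one of the four regexps to match it.

```go
package main

import (
	"fmt"
	"strings"
)

// allA reports whether s is a non-empty run of the letter a.
func allA(s string) bool {
	return s != "" && strings.Trim(s, "a") == ""
}

// isName reports whether s is one or more runs of a separated by single #.
// An empty piece (from a leading, trailing or doubled #) makes it invalid.
func isName(s string) bool {
	for _, part := range strings.Split(s, "#") {
		if !allA(part) {
			return false
		}
	}
	return true
}

// isPath reports whether s is one or more names separated by single /.
func isPath(s string) bool {
	for _, name := range strings.Split(s, "/") {
		if !isName(name) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isPath("aa#a#a/a#a/a")) // prints true
}
```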