I'm curious about the algorithm that decides which characters are included in a regex when using a -
...
Example: [a-zA-Z0-9]
This matches any letter a through z, in either case, and any digit 0 through 9.
I had originally thought that ranges were used sort of like macros, so that, for example, a-z translates to a,b,c,d,e, etc. But after I saw the following in an open source project,
text.tr('A-Za-z1-90', 'Ⓐ-Ⓩⓐ-ⓩ①-⑨⓪')
my understanding of regexes changed entirely. These are not your typical characters, so how the heck did this work correctly, I thought to myself.
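For context, here is a minimal sketch of that call in isolation. The input string is hypothetical (it is not from the project), and the sketch assumes a UTF-8 source file and Ruby's built-in String#tr:

```ruby
# Hypothetical input; the tr arguments are copied from the line quoted above.
text = 'Hello 2019'

# Each character in the first set is replaced by the character at the
# same position in the second set; anything else is left untouched.
circled = text.tr('A-Za-z1-90', 'Ⓐ-Ⓩⓐ-ⓩ①-⑨⓪')

puts circled
puts circled.length == text.length  # one output character per input character
```

Both sets expand to 62 characters (26 + 26 + 9 + 1), so the mapping lines up position by position.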
My theory is that the - literally means: any ASCII value between the left character and the right character (e.g. a-z is [97-122]).
Could anybody confirm whether my theory is correct? Does the regex engine in fact compute the range from the character codes of its two endpoints, whatever those characters are?
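One way to test the theory is to compare what a range accepts with what the character codes predict. The following is only a sketch in Ruby (the language of the tr call above), not a statement about every regex engine; the variable names are my own:

```ruby
# Endpoints of the a-z range as character codes.
puts 'a'.ord   # 97
puts 'z'.ord   # 122

# Every ASCII character that the class [a-z] accepts...
ascii = (0..127).map { |code| code.chr(Encoding::UTF_8) }
accepted = ascii.select { |ch| ch =~ /[a-z]/ }

# ...compared with the characters whose codes lie in 97..122.
by_code = (97..122).map { |code| code.chr(Encoding::UTF_8) }
puts accepted == by_code

# The same idea with non-ASCII endpoints: U+2464 (circled digit five)
# lies between the circled one (U+2460) and the circled nine (U+2468).
circled_five_matches = !!("\u2464" =~ /[①-⑨]/)
puts circled_five_matches
```

If the theory holds, both comparisons print true, which would also explain why the circled characters work as range endpoints even though they are outside ASCII.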
Furthermore, if it IS correct, could you perform a regex match like A-z? Because A is 65 and z is 122, it should theoretically also match all characters between those values.
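A quick experiment for the A-z case, again sketched in Ruby under the assumption that ranges are defined by character codes. If that is right, the class should also pick up the six punctuation characters that sit between Z (90) and a (97):

```ruby
# All characters with codes 65 ('A') through 122 ('z').
between = (65..122).map { |code| code.chr(Encoding::UTF_8) }

# Characters accepted by [A-z] that are NOT letters.
extras = between.select { |ch| ch =~ /[A-z]/ }
                .reject { |ch| ch =~ /[A-Za-z]/ }

puts extras.join(' ')   # [ \ ] ^ _ `
puts extras.map(&:ord).inspect  # [91, 92, 93, 94, 95, 96]
```

So A-z would be a valid range, but not a shorthand for "any letter": it is wider than A-Za-z by those six characters.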