regex - How does this pattern match hyphen without escape? -




After playing around in regex101 for a few minutes, I realized that ] does not need to be escaped if it directly follows [.

In regex101, the pattern []-a-z] is described as follows:

    /[]-a-z]/
    []-a-z]  match a single character present in the list below
    ]-a      a single character in the range between ] and a (case sensitive)
    -z       a single character in the list -z literally (case sensitive)

But I thought that if - is to be matched literally without being escaped, it has to go either at the beginning or at the end of the class.

So why is this pattern not reported as an error? And why does -z match a single character in the list -z literally?

Let's break it down:

    []-a-z]
     ^^ ^
     || +---- 3
     |+------ 2
     +------- 1

1 is a literal ], since it appears right at the start of the character class and [] is an invalid character class in PCRE.

2 is a hyphen. It is therefore the second character in the class, and it introduces a range between ] and a.

The next hyphen, 3, is treated literally, because the previous token, a, is the end of the previous range. A range cannot be introduced at this point. In PCRE, - is treated literally if it is in a place where a range cannot be introduced, or if it is escaped. Placing literal hyphens at the start or end of the class makes this obvious, but it is not required.

Then, z is a simple literal.
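The breakdown above can be checked empirically. The sketch below uses Python's built-in re module as a stand-in, which is an assumption on my part: it is not PCRE, but it parses this particular class the same way (leading ] literal, then the range ]-a, then a literal hyphen, then z).

```python
import re

# Assumption: Python's `re` is used for illustration only; it is not PCRE,
# but it treats this character class the same way.
pattern = re.compile(r"[]-a-z]")

# Collect every printable ASCII character that the class accepts.
matched = [chr(c) for c in range(0x20, 0x7F) if pattern.fullmatch(chr(c))]
print(matched)  # ['-', ']', '^', '_', '`', 'a', 'z']
```

The characters ^, _ and ` show up because they sit between ] (0x5D) and a (0x61) in ASCII, which is what the range ]-a covers. Letters such as b or y are rejected, which confirms that a-z is not read as a range here.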

PCRE follows the Perl syntax here, which is documented as follows.

About ]:

A ] is normally either the end of a POSIX character class (see POSIX Character Classes below), or it signals the end of the bracketed character class. If you want to include a ] in the set of characters, you must generally escape it. However, if the ] is the first (or the second if the first character is a caret) character of a bracketed character class, it does not denote the end of the class (as you cannot have an empty class) and is considered part of the set of characters that can be matched without escaping.

About hyphens:

If a hyphen in a character class cannot syntactically be part of a range, for instance because it is the first or the last character of the character class, or if it immediately follows a range, the hyphen isn't special, and so is considered a character to be matched literally. If you want a hyphen in your set of characters to be matched and its position in the class is such that it could be considered part of a range, you must escape that hyphen with a backslash.

Note that this refers to the Perl syntax. Other flavors may behave differently. For instance, [] is a valid (empty) character class in JavaScript that cannot match anything.
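A quick way to see which side a given engine takes is to try compiling [] on its own. The sketch below does this with Python's re module (chosen purely for illustration; compiles is a hypothetical helper name), which sides with Perl: a leading ] is a literal, so [] is an unterminated class rather than an empty one.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if `pattern` is accepted by Python's re module."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles("[]"))    # False: the ] is taken as a literal, so the class never closes
print(compiles("[]]"))   # True: a class containing only ]
print(compiles("[^]]"))  # True: any character except ]
```

An engine that follows the JavaScript rule would accept [] instead and treat it as a class that can never match.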

The catch is that, depending on the options, PCRE can interpret this the JS way (there are a couple of JS compatibility flags). From the PCRE2 docs:

An opening square bracket introduces a character class, terminated by a closing square bracket. A closing square bracket on its own is not special by default. If a closing square bracket is required as a member of the class, it should be the first data character in the class (after an initial circumflex, if present) or escaped with a backslash. This means that, by default, an empty class cannot be defined. However, if the PCRE2_ALLOW_EMPTY_CLASS option is set, a closing square bracket at the start does end the (empty) class.

The documented PCRE behavior for the hyphen, unsurprisingly, matches the Perl behavior:

The minus (hyphen) character can be used to specify a range of characters in a character class. For example, [d-m] matches any letter between d and m, inclusive. If a minus character is required in a class, it must be escaped with a backslash or appear in a position where it cannot be interpreted as indicating a range, typically as the first or last character in the class, or immediately after a range. For example, [b-d-z] matches letters in the range b to d, a hyphen character, or z.
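The positions in which a hyphen is literal can be compared side by side. This is a minimal sketch using Python's re module as an assumed stand-in for PCRE (the two agree on these four classes); accepted is a hypothetical helper that reports which of a few sample characters a class matches.

```python
import re

def accepted(char_class: str, candidates: str = "-abcdefz") -> str:
    """Return the candidate characters that `char_class` matches."""
    return "".join(ch for ch in candidates if re.fullmatch(char_class, ch))

print(accepted(r"[b-d-z]"))  # -bcdz   hyphen right after a range is literal
print(accepted(r"[-bz]"))    # -bz     hyphen at the start is literal
print(accepted(r"[b\-z]"))   # -bz     escaped hyphen is literal
print(accepted(r"[b-z]"))    # bcdefz  unescaped hyphen between two literals is a range
```

Only the last class reads the hyphen as a range operator, which is why it accepts e and f but not the hyphen itself.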
