# announcements
w
what’s the proper way to escape a regular expression in intellij? I have my regex written out - can I just paste it when my cursor is in a quote block (
"<cursor>"
) and it escapes properly?
r
Yes, IntelliJ should escape it properly if you paste it into a string. Also, assuming it's Kotlin, you should use triple quotes, e.g.
"""<cursor>"""
- less escaping needed that way, so it will be more readable
🙏 2
w
ah yes - that’s what i was thinking of
interesting - it changed my last character
$
to
${'$'}
using the triple quote
kind of funny to look at
e
the tricky bit with rawstrings is handling
$
, but luckily in a regex that rarely comes up - it only makes sense at the end of a pattern, where it can't be confused for interpolation
r
It has to -
$
starts an interpolation in a triple-quoted string, and there’s no way to escape it, so instead you have to interpolate the character.
w
yeah i know - still funny to look at though 🙂
e
also, that constant interpolation is done by the compiler so it's not like it's building a string template at runtime just for
${'$'}
:)
w
so why doesn’t
Regex("""foo$""")
show an error in intellij?
e
ending a raw string with
$
is legal even if intellij would escape it during pasting:
"""foo.*bar$""" == "foo.*bar\$"
it can't possibly be an interpolation, so it remains a
$
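A minimal sketch of the point above, assuming plain Kotlin; the function names are made up for illustration. It compares the three spellings of a trailing `$` and shows the one case where `$` really does start a template.

```kotlin
// Three ways to write the same pattern text: "foo.*bar" followed by a dollar sign.
fun trailingDollarSpellings(): List<String> = listOf(
    """foo.*bar$""",        // bare `$` right before the closing quotes stays literal
    """foo.*bar${'$'}""",   // the form IntelliJ produces when pasting
    "foo.*bar\$"            // ordinary string with an escaped `$`
)

// Here `$` is followed by identifier characters, so `$bar` IS a template.
fun interpolatedInRawString(): String {
    val bar = "X"
    return """foo$bar"""
}

fun main() {
    // Number of distinct strings among the three spellings.
    println(trailingDollarSpellings().toSet().size)
    println(interpolatedInRawString())
}
```

If identifier characters are later appended after the bare `$`, only the first spelling changes meaning, which is the argument for always escaping on paste.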
w
ah ok - so the bug is with pasting automatically escaping it then
r
tbh I wouldn't call it a bug: IntelliJ can't possibly know if it will remain at the end of string, so I think it's better to escape it always - would cause unintended bugs otherwise, while here it's only a few characters more and no bugs
👍 2
e
I dunno if it's a bug - it is consistent with how
$
appears in the middle of a raw string, and if you add identifier chars to the string you'll have to change it
v
it only makes sense at the end of a pattern, where it can't be confused for interpolation
Not really. You can use it to match a literal dollar, or in some look-ahead really meaning the end of the string, or in an alternation like
($|\n)
and similar things..
e
@Vampire ok, I wasn't that clear. in all those contexts
$
can't be confused for interpolation either
👍 1
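A small sketch of those mid-pattern cases, assuming the JVM regex engine; the function names and sample patterns are hypothetical. In each raw string the `$` is followed by a character that cannot start an identifier, so Kotlin leaves it alone.

```kotlin
// `$` followed by `|` is not a template, so it reaches the regex engine as-is.
fun endOrNewline(): Regex = """foo($|\n)""".toRegex()

// In a raw string the backslash is an ordinary character; the `$` after it is
// followed by another backslash, so it stays literal and the regex sees \$\d+.
fun dollarAmount(): Regex = """\$\d+""".toRegex()

// `$` inside a look-ahead, followed by `)`.
fun fooAtEnd(): Regex = """foo(?=$)""".toRegex()

fun main() {
    println(endOrNewline().matches("foo\n"))
    println(dollarAmount().matches("\$42"))
    println(fooAtEnd().containsMatchIn("foo bar"))
}
```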
as an aside, I find
"...".toRegex()
to be nicer to read than
Regex("...")
w
thanks for the help all
e
sometimes I'll use
"""
    ...
""".toRegex(RegexOption.COMMENTS)
which will ignore spaces in the pattern (use
[ ]
or similar to encode space literals), if something is very complex and needs internal indentation or commenting to be readable
👍 1
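A rough sketch of that layout, assuming the JVM backend; the date pattern and the name `isoDateRegex` are invented for illustration. The pattern deliberately contains no literal spaces, so it does not depend on how a space has to be encoded in comments mode.

```kotlin
// With RegexOption.COMMENTS, whitespace in the pattern is ignored and
// `#` starts a comment that runs to the end of the line.
fun isoDateRegex(): Regex = """
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""".toRegex(RegexOption.COMMENTS)

fun main() {
    val match = isoDateRegex().matchEntire("2024-01-31")
    println(match?.groupValues)
}
```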
w
for regex documentation i’ve found it very helpful to link to a https://regex101.com/ page with examples; it automatically generates documentation for each part of the pattern
e
I don't find it that useful… you have to know each platform's supported Regex syntax (Kotlin just wraps Java/JS, both of which work differently from each other and even between versions and implementations) and it is too general
🆗 1
v
use 
[ ]
 or similar to encode space literals
That will not work though, that will just give you a pattern syntax exception for the unclosed character class. 🙂
(?-x: )
should work
e
??? It does work, I use exactly that
v
With JVM backend?
fun main() {
    println("""(?x)[ ]""".toRegex().matches(" "))
}
=>
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 6
(?x)[ ]
      ^
 at java.util.regex.Pattern.error (Pattern.java:1957) 
 at java.util.regex.Pattern.clazz (Pattern.java:2550) 
 at java.util.regex.Pattern.sequence (Pattern.java:2065)
Spaces are ignored, you need to escape the space for example by using
\  (a backslash followed by a space)
or disable comment mode with
(?-x: )
At least on JVM
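A quick sketch for checking this on the JVM backend; the helper names are made up, and the `\x20` variant is an extra option not mentioned in the thread. It compares the ways of writing a literal space in comments mode.

```kotlin
import java.util.regex.PatternSyntaxException

// True if the pattern compiles in comments mode, false on a syntax error (JVM only).
fun compilesInCommentsMode(pattern: String): Boolean =
    try {
        pattern.toRegex(RegexOption.COMMENTS)
        true
    } catch (e: PatternSyntaxException) {
        false
    }

// True if the whole input matches the pattern compiled in comments mode.
fun matchesInCommentsMode(pattern: String, input: String): Boolean =
    pattern.toRegex(RegexOption.COMMENTS).matches(input)

fun main() {
    println(compilesInCommentsMode("""[ ]"""))                // character class with only a space
    println(matchesInCommentsMode("""a \  b""", "a b"))       // escaped space
    println(matchesInCommentsMode("""a (?-x: ) b""", "a b"))  // comments mode switched off locally
    println(matchesInCommentsMode("""a \x20 b""", "a b"))     // hex escape for a space
    println(matchesInCommentsMode("""a b""", "a b"))          // unescaped space is dropped
}
```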
e
... I have to eat my hat. It works for /x patterns in Perl/PCRE but you're right about JVM
v
That's possible, every dialect has its specifics 😄
https://www.regular-expressions.info/ is an excellent source, also when it comes to differences between dialects
Ah, in newer Perl and PCRE2 you can use
(?xx)
to have the same as in Java where spaces in character classes are ignored too