Ach!

You've got an assignment to write some Java code that needs to match a regular expression with metacharacters!

And by metacharacters, I mean characters that the regex parser treats as instructions, like "end of input" or "beginning of input", instead of as literal text.

Usually, they're symbols like \, $, ^, and ?.

So what do you do when you have to match one or more of those symbols?

You let a little-known Java method help you out. That's what you do.

I'll show you that method here.

An Example

Let's look at a really simple example. Consider this String:

final String amounts = "$1 $2 $20 $40 $100";

Now suppose you need to put together a regular expression search that finds all occurrences of "$1" in that String.

As you can see, that search should match twice: once for the standalone "$1" and once at the start of "$100".

Under normal circumstances, that regex would be easy enough. But these ain't normal circumstances.

You see, that $ character is a metacharacter. That means the regex parser will interpret it as a command (in this case, it means end of input). So "$1" asks for a 1 that comes after the end of the input, and nothing ever does.

So if you do your usual thing:

// Compiled as a regex, so the $ here is an end-of-input anchor, not a dollar sign.
final Pattern pattern = Pattern.compile("$1");
final Matcher matcher = pattern.matcher(amounts);

int matches = 0;
while (matcher.find()) {
    matches++;
}

System.err.println(matches);

You'll see the number 0 output in a nice shade of red.

But it should be 2. What happened?

That $ sign is messing you up.
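If you want to watch the $ misbehave, here's a minimal sketch. The class name DollarAnchorDemo and the count() helper are made up for this illustration; the only real API in play is Pattern and Matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarAnchorDemo {
    // Counts how many times the given regex is found in the input.
    static int count(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        String amounts = "$1 $2 $20 $40 $100";

        // "$1" means "end of input, then a 1". Nothing follows the end, so no match.
        System.out.println(count("$1", amounts)); // 0

        // "0$" means "a 0 right before the end of input": the last character of "$100".
        System.out.println(count("0$", amounts)); // 1
    }
}
```

The second search only matches because the $ is doing its anchor job. That's the same behavior that sinks the "$1" search.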

Fortunately, there's a fix that's so easy even a caveman could do it.

The Caveman Fix

It's called Pattern.quote(). And it will save you lots of time.

What does Pattern.quote() do? It takes your search string and hands back a regex that matches it literally, metacharacters and all. So you don't have to worry about them.

Under the covers, it bookends your search string with a \Q and a \E.
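Curious what that bookending looks like? Here's a tiny sketch that prints what Pattern.quote() returns for a simple search string. The QuotePeek class and its quoted() wrapper are just hypothetical names for the demo.

```java
import java.util.regex.Pattern;

class QuotePeek {
    // Thin wrapper so the quoted form is easy to inspect.
    static String quoted(String literal) {
        return Pattern.quote(literal);
    }

    public static void main(String[] args) {
        // For a string with no \E inside it, this prints \Q$1\E
        System.out.println(quoted("$1"));
    }
}
```

Everything between the \Q and the \E is treated as plain text by the regex parser, which is why the $ stops causing trouble.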

So let's see it in action:

final String amounts = "$1 $2 $20 $40 $100";
// Pattern.quote() returns a regex that matches "$1" as literal text.
final String escaped = Pattern.quote("$1");
final Pattern pattern = Pattern.compile(escaped);
final Matcher matcher = pattern.matcher(amounts);

int matches = 0;
while (matcher.find()) {
    matches++;
}

System.err.println(matches);

Now run that and you'll get a beautiful red 2 in the output.
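Want to see where those two matches actually land? Here's a self-contained sketch, imports included, that collects the start index of each hit. The LiteralSearchDemo class and the literalStarts() helper are my own invention for illustration, not anything built into Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralSearchDemo {
    // Returns the start index of every literal occurrence of needle in haystack.
    static List<Integer> literalStarts(String needle, String haystack) {
        Matcher matcher = Pattern.compile(Pattern.quote(needle)).matcher(haystack);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String amounts = "$1 $2 $20 $40 $100";
        System.out.println(literalStarts("$1", amounts)); // [0, 14]
    }
}
```

Index 0 is the standalone "$1" and index 14 is the front of "$100".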

Of Course You Could...

Yes, if you really wanted to, you could just concatenate the \Q and \E to either end of your search string.

But do you really think that's the best solution to a problem like this?

It's much better, in my not-so-humble opinion, to let Java do the work when it comes to concatenating strings and escaping metacharacters.

The code looks cleaner. You don't have to worry about "\\Q" + string + "\\E" or using StringBuilder to do it the "right" way.

Or escaping the escape (note the double slashes above).

You just use Pattern.quote() and everything's peachy.
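There's one more reason to skip the do-it-yourself route: it trips if the search string itself contains a \E, because that closes the quoted section early. Here's a rough sketch comparing the two approaches. The search string is made up, and ManualQuoteDemo and matchesWhole() are hypothetical names for this demo.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ManualQuoteDemo {
    // True if the regex compiles and matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        try {
            return Pattern.compile(regex).matcher(input).matches();
        } catch (PatternSyntaxException e) {
            // The hand-built regex was not even valid.
            return false;
        }
    }

    public static void main(String[] args) {
        // A made-up search string that happens to contain \E.
        String search = "a\\Eb";

        String byHand = "\\Q" + search + "\\E";
        String byJava = Pattern.quote(search);

        System.out.println(matchesWhole(byHand, search)); // false
        System.out.println(matchesWhole(byJava, search)); // true
    }
}
```

The hand-rolled version can't match its own search string. Pattern.quote() spots the embedded \E and works around it for you.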

Wrapping It Up

Now you know how to use Pattern.quote() to make your regex searching a little bit easier. Especially when it comes to metacharacters.

Put this one in your back pocket and use it the next time you need to perform a regex search for strings with symbols.

Have fun!
