Java - Regular Expression

A regular expression is a sequence of characters that defines a search pattern. Regular expressions are used to search and manipulate text. Java supports them through the java.util.regex package.

There are two common ways to use regular expressions in Java:

  • Using the String class's matches() method.
  • Using the Pattern and Matcher classes.

We will discuss both approaches in this tutorial.

Using the String class's matches() method:

The matches() method of the String class tests whether a string matches a regular expression. It returns true only if the entire string matches the regular expression, and false otherwise.

String text = "Hello, World!";
if (text.matches("Hello,.*")) {
    System.out.println("The text starts with 'Hello,'");
}

In the above example, the matches() method is used to test if the text string starts with "Hello,". The regular expression "Hello,.*" matches any string that starts with "Hello," followed by zero or more characters.
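
The whole-string requirement is easy to overlook, so here is a minimal sketch that contrasts a full match with two partial ones. The class name MatchesDemo and the helper startsWithHello are illustrative names chosen for this example, not part of the Java API.

```java
class MatchesDemo {
    // Returns true only if the WHOLE input matches "Hello,.*".
    static boolean startsWithHello(String text) {
        return text.matches("Hello,.*");
    }

    public static void main(String[] args) {
        // The pattern covers the entire string.
        System.out.println(startsWithHello("Hello, World!"));     // true
        // "Hello," appears, but not at the start, so the whole string does not match.
        System.out.println(startsWithHello("Oh, Hello, World!")); // false
        // Without the trailing ".*" the pattern covers only a prefix of the string.
        System.out.println("Hello, World!".matches("Hello,"));    // false
    }
}
```

If only part of the string needs to match, the Pattern and Matcher classes with find() are usually the better fit.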

Using the Pattern and Matcher classes:

The Pattern class is used to define a regular expression. The Matcher class is used to match a pattern against a string.

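import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "The quick brown fox jumps over the lazy dog";
Pattern pattern = Pattern.compile("fox");   // compile the regular expression
Matcher matcher = pattern.matcher(text);    // apply it to the input text
// Each call to find() advances to the next match, if there is one.
while (matcher.find()) {
    System.out.println("Found the word '" + matcher.group() + "' at position " + matcher.start());
}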

In the above example, Pattern.compile() builds a Pattern for the regular expression "fox", and pattern.matcher(text) creates a Matcher that applies it to the text string. Each call to find() looks for the next occurrence of the pattern, so the while loop visits every match in turn and stops when find() returns false. The group() method returns the matched text, and the start() method returns the zero-based index at which the match begins.
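
To show that repeated find() calls walk through every occurrence, the following sketch collects the start index of each match. The class FindAllDemo and the helper findStarts are hypothetical names introduced for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Collects the zero-based start index of every match of regex in text.
    static List<Integer> findStarts(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<Integer> starts = new ArrayList<>();
        while (matcher.find()) {          // advances to the next match each time
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        System.out.println(findStarts("fox", text)); // [16]
        System.out.println(findStarts("o", text));   // [12, 17, 26, 41]
        System.out.println(findStarts("cat", text)); // []
    }
}
```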

Java Regular Expression methods:

Pattern.compile(String regex): This method is used to compile a regular expression into a Pattern object.

Pattern pattern = Pattern.compile("[a-z]+");

Matcher.matches(): This method is used to test if the entire string matches the pattern.

Matcher matcher = pattern.matcher("hello");
if (matcher.matches()) {
    System.out.println("The string matches the pattern");
}

Matcher.find(): This method finds the next match of the pattern in the string. It returns false when no further match exists.

// Reuses the [a-z]+ pattern compiled above, so each match is a run of lowercase letters.
Matcher matcher = pattern.matcher("The quick brown fox jumps over the lazy dog");
while (matcher.find()) {
    System.out.println("Found '" + matcher.group() + "' at position " + matcher.start());
}
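
The difference between matches() and find() is a frequent source of confusion, so the sketch below runs both against the same inputs. The class MatchesVsFindDemo and its two helpers are illustrative names, not library methods.

```java
import java.util.regex.Pattern;

class MatchesVsFindDemo {
    static final Pattern LOWERCASE = Pattern.compile("[a-z]+");

    // True only if the entire input is one run of lowercase letters.
    static boolean matchesWhole(String input) {
        return LOWERCASE.matcher(input).matches();
    }

    // True if the input contains at least one lowercase letter anywhere.
    static boolean containsMatch(String input) {
        return LOWERCASE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("hello"));        // true
        System.out.println(matchesWhole("hello world"));  // false: the space is not in [a-z]
        System.out.println(containsMatch("hello world")); // true: "hello" is found
        System.out.println(containsMatch("123"));         // false: no lowercase letters
    }
}
```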

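Matcher.group(): This method returns the text of the most recent match. It must be called after a successful match operation such as find() or matches(); otherwise it throws an IllegalStateException.

Matcher matcher = pattern.matcher("The quick brown fox jumps over the lazy dog");
// With the [a-z]+ pattern, the first match is "he": the uppercase "T" is skipped.
if (matcher.find()) {
    System.out.println("Found '" + matcher.group() + "'");
}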

Matcher.start(): This method returns the zero-based index of the first character of the matched text.

Matcher matcher = pattern.matcher("The quick brown fox jumps over the lazy dog");
if (matcher.find()) {
    System.out.println("The matched text starts at position " + matcher.start());
}
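
As a small combined sketch of group() and start(), the helper below reports the first match and its position, or a fallback string when nothing matches. The class FirstMatchDemo, the helper describeFirstMatch, and the "text@index" output format are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstMatchDemo {
    // Returns "<matched text>@<start index>" for the first match, or "no match".
    static String describeFirstMatch(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (!matcher.find()) {
            // group() and start() must not be called here: there is no current match.
            return "no match";
        }
        return matcher.group() + "@" + matcher.start();
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        // [a-z]+ skips the uppercase "T", so the first match is "he" at index 1.
        System.out.println(describeFirstMatch("[a-z]+", text)); // he@1
        System.out.println(describeFirstMatch("fox", text));    // fox@16
        System.out.println(describeFirstMatch("[0-9]+", text)); // no match
    }
}
```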

The following are some commonly used metacharacters and quantifiers in Java regular expressions:

Metacharacters:

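  • "." - Matches any character except a line terminator (unless the DOTALL flag is set)
  • "^" - Matches the start of the input (or the start of each line in MULTILINE mode)
  • "$" - Matches the end of the input (or the end of each line in MULTILINE mode)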
  • "[]" - Matches a single character that is within the specified range or set of characters
  • "[^]" - Matches a single character that is not within the specified range or set of characters
  • "|" - Matches either the expression preceding it or the expression following it
  • "\" - Escapes special characters, allowing them to be used as literals (written as "\\" inside a Java string literal)
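
The metacharacters above can be tried out with a short sketch like the following. The class MetacharDemo and the helpers fullMatch and contains are illustrative names. Note that the backslash is doubled because it appears inside a Java string literal.

```java
import java.util.regex.Pattern;

class MetacharDemo {
    // True if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // True if the regex matches somewhere in the input.
    static boolean contains(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a.c", "abc"));      // true: "." matches "b"
        System.out.println(fullMatch("a\\.c", "abc"));    // false: escaped "." is a literal dot
        System.out.println(fullMatch("a\\.c", "a.c"));    // true
        System.out.println(fullMatch("[abc]at", "bat"));  // true: "b" is in the set
        System.out.println(fullMatch("[^abc]at", "bat")); // false: "b" is excluded
        System.out.println(fullMatch("cat|dog", "dog"));  // true: second alternative
        System.out.println(contains("^The", "The end"));  // true: "The" is at the start
        System.out.println(contains("end$", "The end"));  // true: "end" is at the end
        System.out.println(contains("^end", "The end"));  // false: "end" is not at the start
    }
}
```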

Quantifiers:

  • "*" - Matches zero or more occurrences of the preceding character or group
  • "+" - Matches one or more occurrences of the preceding character or group
  • "?" - Matches zero or one occurrence of the preceding character or group
  • "{n}" - Matches exactly n occurrences of the preceding character or group
  • "{n,m}" - Matches between n and m occurrences of the preceding character or group
  • "{n,}" - Matches at least n occurrences of the preceding character or group
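
The quantifiers listed above can be checked with a sketch along these lines. The class QuantifierDemo and the helper fullMatch are hypothetical names used only for this example; every check tests the whole input against the pattern.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // True if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("ab*", "a"));            // true: zero "b"s allowed
        System.out.println(fullMatch("ab*", "abbb"));         // true
        System.out.println(fullMatch("ab+", "a"));            // false: at least one "b" required
        System.out.println(fullMatch("colou?r", "color"));    // true: "u" is optional
        System.out.println(fullMatch("colou?r", "colour"));   // true
        System.out.println(fullMatch("[0-9]{3}", "123"));     // true: exactly three digits
        System.out.println(fullMatch("[0-9]{3}", "12"));      // false
        System.out.println(fullMatch("[0-9]{2,4}", "1234"));  // true: between two and four
        System.out.println(fullMatch("[0-9]{2,4}", "12345")); // false: five is too many
        System.out.println(fullMatch("[0-9]{2,}", "12345"));  // true: two or more
        System.out.println(fullMatch("[0-9]{2,}", "1"));      // false
    }
}
```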