Regex how-to: Quantifiers, pattern collections, and word boundaries

September 16, 2022
Bob Reselman
Related topics:
Linux
Related products:
Red Hat Enterprise Linux


    Filtering and searching text with regular expressions is an important skill for every developer. Regular expressions can be tricky to master. To work with them effectively, you need a detailed understanding of their symbols and syntax.

    Fortunately, learning to work with regular expressions can be incremental. You don't need to learn everything all at once to do useful work. Rather, you can start with the basics and then move into more complex topics while developing your understanding and using what you know as you go along.

    This article is the second in a series. The first article introduced some basic elements of regular expressions: the basic metacharacters (. * ^ $ \s \d) as well as the escape metacharacter \.

    This article introduces some more advanced syntax: quantifiers, pattern collections, groups, and word boundaries. If you haven't read the first article, you might want to review it now before continuing with this content.

    These articles demonstrate regular expressions by piping string output from an echo command to the grep utility. The grep utility uses a regular expression to filter content. The benefit of demonstrating regular expressions using grep is that you don't need to set up any special programming environment. You can execute an example of a regular expression immediately by copying and pasting the code directly into your terminal window running under Linux.

    What's the difference between a regular character and a metacharacter?

    A regular character is a letter, digit, or punctuation used in everyday text. When you declare a regular character in a regular expression, the regular expression engine searches content for that declared character. For example, were you to declare the regular character h in a regular expression, the engine would look for occurrences of the character h.

    A metacharacter is a placeholder symbol. For example, the metacharacter . (dot) represents "any character," and means any character matches here. The metacharacter \d represents a numerical digit, and means any digit matches here. Thus, when you use a metacharacter, the regex engine searches for characters that comply with the particular metacharacter or set of metacharacters.
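The contrast can be tried directly in a terminal. The following is a minimal sketch, assuming GNU grep built with PCRE support (the -P option); the shortened sample string is chosen only for illustration:

```shell
# Shortened sample string, chosen for illustration only.
teststr="Chris has 1 bird"

# Regular character: h matches only the literal letter h.
literal=$(echo "$teststr" | grep -Po 'h')

# Metacharacter: \d matches any single digit.
digit=$(echo "$teststr" | grep -Po '\d')

echo "$literal"   # one h per line: from "Chris" and from "has"
echo "$digit"     # the single digit 1
```

The regular character finds only itself, while the metacharacter finds whatever digit happens to be in the text.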

    What are quantifiers?

    A quantifier is a syntactic structure in regular expressions that indicates the number of times a character occurs in sequence in the input text. There are two ways to declare a quantifier. One way is:

    x{n}

    In this syntax:

    • x is the character to match.
    • n indicates the number of times the character needs to occur.

    A related syntax declares a quantifier with a minimum and maximum range:

    x{n,m}

    In this syntax:

    • x is the character to match.
    • n indicates the minimum number of occurrences and m indicates the maximum number of occurrences.

    The following example uses a quantifier to create a matching pattern that identifies two occurrences of the regular character g in sequence:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po 'g{2}'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    Thus, the regular expression returns the following result:

    gg

    The following example uses a quantifier to create a matching pattern that identifies a minimum and a maximum for occurrences of the character g in a sequence. The minimum length is 1 and the maximum is 2. The regular expression is processed in a case-insensitive manner, as indicated by the -i option to grep:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Poi 'g{1,2}'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    Because each sequence is identified and returned on a one-by-one basis, the output is:

    G
    gg
    g
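A quantifier can also follow a metacharacter instead of a regular character. The following sketch assumes GNU grep with the -P option and uses a hypothetical sample string that is not part of the article's running example:

```shell
# Hypothetical sample string containing two four-digit runs.
teststr="Call 555-0142 or 555-0199"

# \d{4} matches exactly four digits in sequence. The three-digit
# runs (555) are too short, so they are not returned.
four=$(echo "$teststr" | grep -Po '\d{4}')

echo "$four"   # prints 0142 and 0199, one per line
```

Here the quantifier counts occurrences of "any digit" rather than occurrences of one fixed character.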

    What are pattern collections?

    A pattern collection is a syntactic structure that describes a character class. A character class is a set of metacharacters and regular characters that combine to create a matching pattern that, like a metacharacter, can match many different characters in text. A pattern collection is defined between square brackets ([ ]).

    The following example uses the [A-Z] character class, which denotes any uppercase character from A to Z inclusive, to create a pattern collection that matches only uppercase characters in the given text:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '[A-Z]'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    J
    L
    G
    F
    C
    T

    The following example uses the [0-9] character class, which denotes any digit between 0 and 9, to create a pattern collection that matches only numeric characters in the given text:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '[0-9]'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    1

    The following example uses a pattern collection that matches certain exact regular characters within a set of regular characters. The regular expression says: Match any f, G, or F:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '[fGF]'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    f
    f
    G
    F

    The following example uses a pattern collection with both metacharacters and regular characters. The logic behind the regular expression says: Match any g, r, or e followed by a space character and then the string Fido:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '[gre]\sFido'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    g Fido

    The following example uses two pattern collections along with metacharacters that are outside them. The regular expression says: Match a numeric character, then continue matching any character zero or many times that is followed by an uppercase character. The pattern collection [0-9] indicates any numeral from 0 to 9. The metacharacters .* indicate zero or more instances of any character, and the pattern collection [A-Z] indicates any uppercase character from A to Z:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '[0-9].*[A-Z]'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    1 bird named T
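Pattern collections can also be combined with the * metacharacter, which indicates zero or more occurrences of whatever precedes it. The following sketch, assuming GNU grep with the -P option, pulls each capitalized word out of the same sample text:

```shell
teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."

# [A-Z] matches one uppercase letter; [a-z]* then matches zero or
# more lowercase letters that follow it.
names=$(echo "$teststr" | grep -Po '[A-Z][a-z]*')

echo "$names"   # Jeff, Lucky, Gregg, Fido, Chris, Tweety (one per line)
```

Each match stops at the first character that is not a lowercase letter, such as a space or a period.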

    The following example uses the negation metacharacter ^ within a pattern collection. The negation metacharacter indicates that the succeeding characters are not to be matched when the regular expression is being executed.

    Note: As you might remember from the first article in this series, ^ is the same metacharacter that indicates a line start—but only when used outside square brackets. The ^ metacharacter indicates negation only when it appears within the square brackets ([ ]) that declare a pattern collection.

    The following collection pattern says: Match any character that is not a, e, i, o, or u:

    $ teststr="Jeff and the pet Lucky."
    $ echo $teststr | grep -Po '[^aeiou]'

    The regular expression matches the characters highlighted in bold in the following text. The text is underlined to make the space characters apparent:

    Jeff and the pet Lucky.

    Space characters in the following output are also underlined to make them apparent. Space characters are matched by this regular expression:

    J
    f
    f
    _
    n
    d
    _
    t
    h
    _
    p
    t
    _
    L
    c
    k
    y
    . 
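A negated pattern collection can hold ranges and a literal space as well as individual characters. The following sketch, assuming GNU grep with the -P option, excludes every lowercase letter and the space character, so only the remaining characters are returned:

```shell
teststr="Jeff and the pet Lucky."

# [^a-z ] matches any character that is neither a lowercase letter
# nor a space.
rest=$(echo "$teststr" | grep -Po '[^a-z ]')

echo "$rest"   # prints J, L, and the final period, one per line
```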

    Groups

    A group in a regular expression is, as the name implies, a group of characters declared according to a specific definition. A group declaration can include metacharacters and regular characters. A group is declared between open and closed parentheses like this: ( ).

    The following example uses a . (dot) metacharacter, which indicates "any character." The declared group says: Match any three characters as a group and return each group:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '(...)'

    The regular expression matches the characters highlighted in alternating bold and non-bold text as shown in the following text. Again, the text is underlined to make the space characters apparent:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    Because the group is identified and returned on a one-by-one basis, the output is:

    Jef
    f_a
    nd_
    the
    _pe
    t_L
    uck
    y._
    Gre
    gg_
    and
    _th
    e_d
    og_
    Fid
    o._
    Chr
    is_
    has
    _1_
    bir
    d_n
    ame
    d_T
    wee
    ty.
    

    The following example uses the . (dot) metacharacter along with the regular character y to define a group of three characters, of which the first two characters can be anything and the third character must be y.

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '(..y)'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    cky
    ety

    The following example demonstrates a regular expression group that uses the . (dot) metacharacter along with the \d metacharacter to define a group of five characters, of which the first two can be any character (including a space), the third must be a digit, and the last two can again be any character:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '(..\d..)'

    The regular expression matches the characters highlighted in bold in the following text. The text is underlined to make the space characters apparent.

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    s 1 b
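One practical effect of a group is that a quantifier placed after the closing parenthesis applies to the whole group rather than to a single character. The following sketch assumes GNU grep with the -P option and uses a hypothetical sample string:

```shell
# Hypothetical sample string, chosen for illustration only.
teststr="banana bandana"

# (an){2} repeats the whole group: "an" followed by "an".
grouped=$(echo "$teststr" | grep -Po '(an){2}')

# Without the group, {2} applies only to n, so the pattern means
# "a followed by nn". Nothing in the sample matches; "|| true"
# keeps a no-match exit status from stopping a script.
ungrouped=$(echo "$teststr" | grep -Po 'an{2}' || true)

echo "$grouped"     # prints anan
echo "$ungrouped"   # prints an empty line
```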

    Word boundaries

    A word character is declared using the metacharacters \w. A word character is any uppercase letter, lowercase letter, numeric character, or the underscore (_). A hyphen is not a word character.

    A word boundary is the position between a word character and a non-word character, such as a space or a punctuation mark ( .!? ), or between a word character and the start or end of the text. A word boundary is declared using the metacharacters \b.

    The following example demonstrates a regular expression that uses the metacharacters \w+ to find occurrences of words within text. The metacharacter + indicates one or more occurrences of a character. The logic in play is: Match one or more word characters:

    $ teststr="Jeff and the pet Lucky."
    $ echo $teststr | grep -Po '\w+'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky.

    Because each word is identified and returned on a one-by-one basis, the output is:

    Jeff
    and
    the
    pet
    Lucky
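To see which characters \w+ treats as part of a word, the following sketch runs the same pattern against a hypothetical string containing an underscore and a hyphen. It assumes GNU grep with the -P option:

```shell
# Hypothetical sample string, chosen for illustration only.
teststr="user_name first-name"

# The underscore is a word character, so user_name stays whole.
# The hyphen is not, so first-name splits into two matches.
words=$(echo "$teststr" | grep -Po '\w+')

echo "$words"   # prints user_name, first, and name, one per line
```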

    The following example uses a word boundary to find occurrences of the regular character a where it appears at the beginning of a word:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '\ba'

    The regular expression matches the characters highlighted in bold in the following text:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    a
    a

    The following example uses a word boundary to find occurrences of the regular character y that appear at the end of a word:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po 'y\b'

    The regular expression matches the characters highlighted in bold in the following text. Note that punctuation marks at the end of a word are not considered word characters and are excluded from the match:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    y
    y

    The following example uses a word boundary to find occurrences of the regular characters Tweety that appear at the end of a word:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po 'Tweety\b'

    The regular expression matches the characters highlighted in bold in the following text. Again, notice that punctuation marks at the end of a word are excluded:

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    Tweety

    The following example contains a regular expression that uses word boundaries to find occurrences of words that start with the regular character a and end with the regular character d. The regular expression uses the metacharacters \w* to match zero or more word characters between the a and the d:

    $ teststr="Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety."
    $ echo $teststr | grep -Po '\ba\w*d\b'

    The regular expression matches the characters highlighted in bold in the following text.

    Jeff and the pet Lucky. Gregg and the dog Fido. Chris has 1 bird named Tweety.

    The output is:

    and
    and
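Placing \b on both sides of a literal word is a common way to match that word only when it stands alone. The following sketch assumes GNU grep with the -P option and uses a hypothetical sample string in which the letters "the" also occur inside longer words:

```shell
# Hypothetical sample string, chosen for illustration only.
teststr="the theme of the anthem"

# Without boundaries, the pattern also matches inside "theme"
# and "anthem".
anywhere=$(echo "$teststr" | grep -Po 'the')

# With a boundary on each side, only the standalone word matches.
whole=$(echo "$teststr" | grep -Po '\bthe\b')

echo "$anywhere"   # four lines, each containing: the
echo "$whole"      # two lines, each containing: the
```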

    Grouping and specifying multiple characters simultaneously extend regular expressions

    This article gave you an introduction to working with quantifiers, pattern collections, groups, and word boundaries. You learned to use quantifiers to declare a range of character occurrences to match. Also, you learned that pattern collections enable you to declare character classes that match characters in a generic manner. Groups let you match a particular sequence of characters as a unit. Word boundaries let you anchor a match to the start or end of a word, where word characters meet spaces, punctuation marks, or the edges of the text.

    The intermediate concepts covered in this article bring additional power and versatility to working with regular expressions. But there's a lot more to learn. Fortunately, as mentioned at the beginning of this article, you can use the concepts and techniques discussed here immediately.

    The key is to start practicing what you've learned now. Mastery is the result of small, incremental accomplishments. As with any skill, the more you practice, the better you'll get.

    Last updated: November 6, 2023
