REGEXP_COUNT

Counts the number of occurences that a string matches a given regular expression pattern.

See also: REGEXP_INSTR, REGEXP_SUBSTR.

Syntax

REGEXP_COUNT( string_expr, string_test_expr [, start_index ] ) --> INT

Arguments

Parameter

Description

string_expr

String to test

string_test_expr

Test pattern

start_index

The character index offset to start counting from. Defaults to 1

Test patterns

Pattern

Description

^

Match the beginning of a string

$

Match the end of a string

.

Match any character (including whitespace such as carriage return and newline)

*

Match the preceding pattern zero or more times

+

Match the preceding pattern at least once

?

Match the preceding pattern once at most

de|abc

Match either de or abc

(abc)*

Match zero or more instances of the sequence abc

{2}

Match the preceding pattern exactly two times

{2,4}

Match the preceding pattern between two and four times

[a-dX], [^a-dX]

Matches any character that is (or is not when negated with ^) either a, b, c, d, or X. The - character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. To include a literal ] character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last. Any character that does not have a defined special meaning inside a [] pair matches only itself.

Returns

Integer result of the number of times that a pattern occurs in a string.

Notes

  • The test pattern must be literal string. Column references or complex expressions are currently unsupported.

  • If the value is NULL, the result is NULL.

Examples

For these examples, assume a table named nba, with the following structure:

CREATE TABLE nba
(
   "Name" text(40),
   "Team" text(40),
   "Number" tinyint,
   "Position" text(2),
   "Age" tinyint,
   "Height" text(4),
   "Weight" real,
   "College" text(40),
   "Salary" float
 );

Here’s a peek at the table contents (Download nba.csv):

nba.csv

Name

Team

Number

Position

Age

Height

Weight

College

Salary

Avery Bradley

Boston Celtics

0.0

PG

25.0

6-2

180.0

Texas

7730337.0

Jae Crowder

Boston Celtics

99.0

SF

25.0

6-6

235.0

Marquette

6796117.0

John Holland

Boston Celtics

30.0

SG

27.0

6-5

205.0

Boston University

R.J. Hunter

Boston Celtics

28.0

SG

22.0

6-5

185.0

Georgia State

1148640.0

Jonas Jerebko

Boston Celtics

8.0

PF

29.0

6-10

231.0

5000000.0

Amir Johnson

Boston Celtics

90.0

PF

29.0

6-9

240.0

12000000.0

Jordan Mickey

Boston Celtics

55.0

PF

21.0

6-8

235.0

LSU

1170960.0

Kelly Olynyk

Boston Celtics

41.0

C

25.0

7-0

238.0

Gonzaga

2165160.0

Terry Rozier

Boston Celtics

12.0

PG

22.0

6-2

190.0

Louisville

1824360.0

Find players with at least 3 names (2 spaces)

nba=> SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+')>1;
Name
------------------------
James Michael McAdoo
Luc Richard Mbah a Moute
Larry Nance Jr.
Metta World Peace
Glenn Robinson III
Johnny O'Bryant III
Tim Hardaway Jr.
Frank Kaminsky III
Kelly Oubre Jr.
Otto Porter Jr.

Using the offset index

Start finding spaces that appear 8 characters in

nba=> SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1;
Name
------------------------
Luc Richard Mbah a Moute