REGEXP_COUNT

Counts the number of occurences that a string matches a given regular expression pattern.

See also: REGEXP_INSTR, REGEXP_SUBSTR.

Syntax

REGEXP_COUNT( string_expr, string_test_expr [, start_index ] ) --> INT

Arguments

Parameter Description
string_expr String to test
string_test_expr Test pattern
start_index The character index offset to start counting from. Defaults to 1

Test patterns

Pattern Description
^ Match the beginning of a string
$ Match the end of a string
. Match any character (including whitespace such as carriage return and newline)
* Match the preceding pattern zero or more times
+ Match the preceding pattern at least once
? Match the preceding pattern once at most
de|abc Match either de or abc
(abc)* Match zero or more instances of the sequence abc
{2} Match the preceding pattern exactly two times
{2,4} Match the preceding pattern between two and four times
[a-dX], [^a-dX] Matches any character that is (or is not when negated with ^) either a, b, c, d, or X. The - character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. To include a literal ] character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last. Any character that does not have a defined special meaning inside a [] pair matches only itself.

Returns

Integer result of the number of times that a pattern occurs in a string.

Notes

  • The test pattern must be literal string. Column references or complex expressions are currently unsupported.
  • If the value is NULL, the result is NULL.

Examples

For these examples, assume a table named nba, with the following structure:

CREATE TABLE nba
(
   "Name" varchar(40),
   "Team" varchar(40),
   "Number" tinyint,
   "Position" varchar(2),
   "Age" tinyint,
   "Height" varchar(4),
   "Weight" real,
   "College" varchar(40),
   "Salary" float
 );

Here’s a peek at the table contents (Download nba.csv):

nba.csv
Name Team Number Position Age Height Weight College Salary
Avery Bradley Boston Celtics 0.0 PG 25.0 6-2 180.0 Texas 7730337.0
Jae Crowder Boston Celtics 99.0 SF 25.0 6-6 235.0 Marquette 6796117.0
John Holland Boston Celtics 30.0 SG 27.0 6-5 205.0 Boston University  
R.J. Hunter Boston Celtics 28.0 SG 22.0 6-5 185.0 Georgia State 1148640.0
Jonas Jerebko Boston Celtics 8.0 PF 29.0 6-10 231.0   5000000.0
Amir Johnson Boston Celtics 90.0 PF 29.0 6-9 240.0   12000000.0
Jordan Mickey Boston Celtics 55.0 PF 21.0 6-8 235.0 LSU 1170960.0
Kelly Olynyk Boston Celtics 41.0 C 25.0 7-0 238.0 Gonzaga 2165160.0
Terry Rozier Boston Celtics 12.0 PG 22.0 6-2 190.0 Louisville 1824360.0

Find players with at least 3 names (2 spaces)

nba=> SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+')>1;
Name
------------------------
James Michael McAdoo
Luc Richard Mbah a Moute
Larry Nance Jr.
Metta World Peace
Glenn Robinson III
Johnny O'Bryant III
Tim Hardaway Jr.
Frank Kaminsky III
Kelly Oubre Jr.
Otto Porter Jr.

Using the offset index

Start finding spaces that appear 8 characters in

nba=> SELECT "Name" FROM nba WHERE REGEXP_COUNT("Name", '( )+', 8)>1;
Name
------------------------
Luc Richard Mbah a Moute