REGEXP_INSTR¶

The REGEXP_INSTR function returns the start position of a regex match and searches a string for a POSIX-style regular expression. This function returns the position within the string where the match was located.

Syntax¶

REGEXP_INSTR( string_expr, string_test_expr [, start_index [, occurence [, return_position ]] ) --> INT

Arguments¶

Parameter	Description
`string_expr`	String to test
`string_test_expr`	Test pattern
`start_index`	The character index offset to start counting from. Defaults to 1
`occurence`	Which occurence to search for. Defaults to 1
`return_position`	Specifies the location within the string to return. Using 0, the function returns the string position of the first character of the substring that matches the pattern. A value greater than 0 returns will return the position of the first character following the end of the pattern. Defaults to 0

Test patterns¶

Pattern	Description
`^`	Match the beginning of a string
`$`	Match the end of a string
`.`	Match any character (including whitespace such as carriage return and newline)
`*`	Match the preceding pattern zero or more times
`+`	Match the preceding pattern at least once
`?`	Match the preceding pattern once at most
`de\|abc`	Match either `de` or `abc`
`(abc)*`	Match zero or more instances of the sequence `abc`
`{2}`	Match the preceding pattern exactly two times
`{2,4}`	Match the preceding pattern between two and four times
`[a-dX]`, `[^a-dX]`	Matches any character that is (or is not when negated with `^`) either `a`, `b`, `c`, `d`, or `X`. The `-` character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. To include a literal `]` character, it must immediately follow the opening bracket [. To include a literal - character, it must be written first or last. Any character that does not have a defined special meaning inside a [] pair matches only itself.

Returns¶

Integer start position of a regex match, or 0 if no match was found.

Notes¶

The test pattern must be literal string. Column references or complex expressions are currently unsupported.
If the value is NULL, the result is NULL.

Examples¶

For these examples, assume a table named nba, with the following structure:

CREATE TABLE nba
(
   "Name" TEXT,
   "Team" TEXT,
   "Number" TINYINT,
   "Position" TEXT,
   "Age" TINYINT,
   "Height" TEXT,
   "Weight" REAL,
   "College" TEXT,
   "Salary" FLOAT
 );

Here’s a peek at the table contents (Download nba.csv):

nba.csv¶
Name	Team	Number	Position	Age	Height	Weight	College	Salary
Avery Bradley	Boston Celtics	0.0	PG	25.0	6-2	180.0	Texas	7730337.0
Jae Crowder	Boston Celtics	99.0	SF	25.0	6-6	235.0	Marquette	6796117.0
John Holland	Boston Celtics	30.0	SG	27.0	6-5	205.0	Boston University
R.J. Hunter	Boston Celtics	28.0	SG	22.0	6-5	185.0	Georgia State	1148640.0
Jonas Jerebko	Boston Celtics	8.0	PF	29.0	6-10	231.0		5000000.0
Amir Johnson	Boston Celtics	90.0	PF	29.0	6-9	240.0		12000000.0
Jordan Mickey	Boston Celtics	55.0	PF	21.0	6-8	235.0	LSU	1170960.0
Kelly Olynyk	Boston Celtics	41.0	C	25.0	7-0	238.0	Gonzaga	2165160.0
Terry Rozier	Boston Celtics	12.0	PG	22.0	6-2	190.0	Louisville	1824360.0

Find players with ‘ow’ in their name¶

nba=> SELECT "Name", REGEXP_INSTR("Name", 'ow') FROM nba WHERE REGEXP_COUNT("Name", 'ow')>0;
Name               | regexp_instr
-------------------+-------------
Jae Crowder        |            7
Markel Brown       |           10
Langston Galloway  |           14
Kyle Lowry         |            7
Norman Powell      |            9
Anthony Brown      |           11
Cameron Bairstow   |           15
Lorenzo Brown      |           11
Dirk Nowitzki      |            7
Dwight Powell      |            9
Dwight Howard      |            9
Justise Winslow    |           14
Karl-Anthony Towns |           15
Anthony Morrow     |           13

Using the `return_position` argument¶

Get the second occurence of the letter ‘k’ in a player’s name. We set start_index to 1 (the default)

nba=> SELECT "Name", REGEXP_INSTR("Name", 'k', 1, 2)  FROM nba WHERE REGEXP_INSTR("Name", 'k', 1, 2)>0;
Name               | regexp_instr
-------------------+-------------
Nik Stauskas       |           10
Tarik Black        |           11
Dirk Nowitzki      |           12
Sam Dekker         |            8
Kendrick Perkins   |           13
Frank Kaminsky III |           13
Nikola Jokic       |           10
Nikola Pekovic     |           10