Search This Blog

Total Pageviews

Sunday 22 January 2012

Oracle function REGEXP_INSTR

 


 




1 2 3 4 5 6 7
REGEXP_INSTR(<source_string>, <pattern>[[, <start_position>][, <occurrence>][, <return_option>][, <match_parameter>][, <sub_expression>]])


5. return_option - The default value of the return_option is 0, which returns the starting position of the pattern.
A value of 1 returns the starting position of the next character following the match




SELECT REGEXP_INSTR('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 4) "REGEXP_INSTR" FROM DUAL;

first character in the fourth subexpression, which is '78':



REGEXP_INSTR( '1234567890' , '(123)(4(56)(78))' , 1 , 1 , 0 , 'i' , 4 )
----------- ------------------ -- -- -- -- --
1 2 3 4 5 6 7



1. source_string is a character expression that serves as the search value.Pattern is the regular expression.
It is usually a text literal .It can contain up to 512 bytes.

2. Pattern is the regular expression. It is usually a text literal .It can contain up to 512 bytes.
For more about pattern:

3. position is a positive integer indicating the character of source_string where Oracle should begin the search. The default is 1,
meaning that Oracle begins the search at the first character of source_string.


4. occurrence is a positive integer indicating which occurrence of pattern in source_string Oracle should search for.
The default is 1, meaning that Oracle searches for the first occurrence of pattern.


5. return_option lets you specify what Oracle should return in relation to the occurrence:
If you specify 0, then Oracle returns the position of the first character of the occurrence. This is the default.
If you specify 1, then Oracle returns the position of the character following the occurrence.


6. match_parameter is a text literal that lets you change the default matching behavior of the function.
You can specify one or more of the following values for match_parameter:

i: to match case insensitively
c: to match case sensitively
n: to make the dot (.) match new lines as well
m: to make ^ and $ match beginning and end of a line in a multiline string
x: to ignore white spaces.


7. subexpr: REGEXP_SUBSTR functions include a new SUBEXPR parameter that limits the pattern match to a specific subexpression in the search pattern.



Anchoring Characters

^ Anchor the expression to the start of a line
$ Anchor the expression to the end of a line


[[:digit:]]{5} ---- any five digit




Posix Characters




Character Class Description


[:digit:] Only the digits 0 to 9

[:alnum:] Any alphanumeric character 0 to 9 OR A to Z or a to z.

[:alpha:] Any alpha character A to Z or a to z.

[:blank:] Space and TAB characters only.

[:xdigit:] Hexadecimal notation 0-9, A-F, a-f.

[:punct:] Punctuation symbols
------------------------------------------------------------
% . , " ' ? ! : # $ & ( ) * ; + - / = @ [ ] \ ^ _ { } | ~
------------------------------------------------------------

[:print:] Any printable character.

[:space:] Any whitespace characters (space, tab, NL, FF, VT, CR).
Many system abbreviate as \s.

[:graph:] Exclude whitespace (SPACE, TAB). Many system abbreviate as \W.

[:upper:] Any alpha character A to Z.

[:lower:] Any alpha character a to z.

[:cntrl:] Control Characters NL CR LF TAB VT FF NUL SOH STX
EXT EOT ENQ ACK SO SI DLE DC1 DC2 DC3 DC4 NAK SYN
ETB CAN EM SUB ESC IS1 IS2 IS3 IS4 DEL.






Quantifier Characters control the number of times a character
or string is found in a search.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Character Class Description
* Match 0 or more times
? Match 0 or 1 time
+ Match 1 or more times
{m} Match exactly m times
{m,} Match at least m times
{m, n} Match at least m times but no more than n times
\n Cause the previous expression to be repeated n times



Character Class Description

| Separates alternates, often used with grouping operator ()
( ) Groups subexpression into a unit for alternations, for quantifiers, or for backreferencing (see "Backreferences" section)
[char] Indicates a character list; most metacharacters inside a character list are understood as literals, with the exception of character classes,
and the ^ and - metacharacters



More Special Characters:
\t tab (HT, TAB)
\n newline (LF, NL)
\r return (CR)
\f form feed (FF)
\a alarm (bell) (BEL)
\e escape (think troff) (ESC)
\033 octal char (think of a PDP-11)
\x1B hex char
\c[ control char
\l lowercase next char (think vi)
\u uppercase next char (think vi)
\L lowercase till \E (think vi)
\U uppercase till \E (think vi)
\E end case modification (think vi)
\Q quote (disable) pattern metacharacters till \E

Even More Special Characters:
\w Match a "word" character (alphanumeric plus "_")
\W Match a non-word character
\s Match a whitespace character
\S Match a non-whitespace character
\d Match a digit character
\D Match a non-digit character
\b Match a word boundary
\B Match a non-(word boundary)
\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end
\z Match only at end of string
\G Match only where previous m//g left off (works only with /g)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~






3 ---- refer to 3rd No above
--
SELECT REGEXP_COUNT('123123123123', '123', 3, 'i') COUNT FROM DUAL;

COUNT
----------
3
3 ---- refer to 3rd No above
--
SQL> SELECT REGEXP_COUNT('123123123123', '123', 2, 'i') COUNT FROM DUAL;


COUNT
----------
3




*
SQL> SELECT REGEXP_INSTR('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 2) "REGEXP_INSTR" FROM DUAL; --- * this will search pattern No 2 i.e. 4

REGEXP_INSTR
------------
4
**
SQL> SELECT REGEXP_INSTR('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 3) "REGEXP_INSTR" FROM DUAL; ---** this will search pattern No 3 i.e. 56


REGEXP_INSTR
------------
5


**
SELECT REGEXP_INSTR('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 4) "REGEXP_INSTR" FROM DUAL; ---** this will search pattern No 3 i.e. 78


SQL> SELECT REGEXP_INSTR('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 4) "REGEXP_INSTR" FROM DUAL;

REGEXP_INSTR
------------
7



SELECT REGEXP_INSTR('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 5) "REGEXP_INSTR" FROM DUAL; --- will search pattern No 5 in this '(123)(4(56)(78))' so no pattern in this case


REGEXP_INSTR
------------
0




it will search the o which is having 3 letter after o




SQL> SELECT REGEXP_INSTR('The slippery rabbit was pursued by a ravenous wolf', '[o][[:alpha:]]{3}', 1, 1, 0, 'i') RESULT FROM dual;

RESULT
----------
0

SQL> SELECT REGEXP_INSTR('The slippery rabbit was pursued by a ravenous wolf olf', '[o][[:alpha:]]{3}', 1, 1, 0, 'i') RESULT FROM dual;

RESULT
----------
0

SQL> SELECT REGEXP_INSTR('The slippery rabbit was pursued by a ravenous wolf olfw', '[o][[:alpha:]]{3}', 1, 1, 0, 'i') RESULT FROM dual;


RESULT
----------
52




search for blank sapce for 4th occurance

SELECT REGEXP_INSTR('500 Oracle Parkway, Redwood Shores, CA','[ ]+', 1, 4) "REGEXP_INSTR" FROM DUAL;

REGEXP_INSTR
------------
28



find letter after 3rd occurance of |

select REGEXP_SUBSTR('One|Two|Three|Four|Five','[^|]+', 1, 3) from dual ;

REGEX
-----
Three





SELECT REGEXP_INSTR('Joe Smith, 10045 Berry Lane, San Joseph, CA 91234','[[:digit:]]{5}$') AS REGEXP_INSTR FROM dual


REGEXP_INSTR
------------
45






Oracle DBA

anuj blog Archive