if at all I just want the substring after _, what should be best to use?
It isn't quite clear whether you only want to remove the exact string 'PREPROCESSINGLIST_' (and if so, whether it should be matched only at the start of the string or anywhere in it), or whether you want to remove everything up to the first underscore, or everything up to the last underscore.
Depending on your actual data and the result you want to get, you can use regexp_replace() as @FrankScmitt showed (with or without an anchor), or a plain replace(), or a combination of instr() and substr().
With some made-up data with various patterns provided in a CTE:
with t (str) as (
  select 'PREPROCESSINGLIST_AOD' from dual
  union all select 'PREPROCESSINGLIST_BOD' from dual
  union all select 'PREPROCESSINGLIST_COD' from dual
  union all select 'PREPROCESSINGLIST_DOD' from dual
  union all select 'XYZ_PREPROCESSINGLIST_EOD' from dual
  union all select 'XYZ_FOD' from dual
  union all select 'ABC_XYZ_GOD' from dual
  union all select 'HOD' from dual
)
select str,
  regexp_replace(str, '^PREPROCESSINGLIST_', null) as anchor_regex,
  regexp_replace(str, 'PREPROCESSINGLIST_', null) as free_regex,
  replace(str, 'PREPROCESSINGLIST_', null) as free_replace,
  case when instr(str, '_') > 0 then substr(str, instr(str, '_') + 1) else str end
    as first_underscore,
  case when instr(str, '_') > 0 then substr(str, instr(str, '_', -1) + 1) else str end
    as last_underscore
from t;
STR                       ANCHOR_REGEX              FREE_REGEX  FREE_REPLAC FIRST_UNDERSCORE      LAST_UNDERS
------------------------- ------------------------- ----------- ----------- --------------------- -----------
PREPROCESSINGLIST_AOD     AOD                       AOD         AOD         AOD                   AOD
PREPROCESSINGLIST_BOD     BOD                       BOD         BOD         BOD                   BOD
PREPROCESSINGLIST_COD     COD                       COD         COD         COD                   COD
PREPROCESSINGLIST_DOD     DOD                       DOD         DOD         DOD                   DOD
XYZ_PREPROCESSINGLIST_EOD XYZ_PREPROCESSINGLIST_EOD XYZ_EOD     XYZ_EOD     PREPROCESSINGLIST_EOD EOD
XYZ_FOD                   XYZ_FOD                   XYZ_FOD     XYZ_FOD     FOD                   FOD
ABC_XYZ_GOD               ABC_XYZ_GOD               ABC_XYZ_GOD ABC_XYZ_GOD XYZ_GOD               GOD
HOD                       HOD                       HOD         HOD         HOD                   HOD
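If all you really need is the substring after an underscore, the instr()/substr() pair can be used on its own. The following is only a minimal sketch, reusing three of the made-up values from the query above. It drops the CASE guard on the assumption that you are happy to get the whole string back when there is no underscore: instr() returns 0 in that case, so substr() simply starts at position 1.

```sql
-- Sketch only: the sample values are the made-up ones used earlier.
with t (str) as (
  select 'PREPROCESSINGLIST_AOD' from dual
  union all select 'ABC_XYZ_GOD' from dual
  union all select 'HOD' from dual
)
select str,
  -- instr() returns 0 when no underscore is found, so substr() starts at 1
  substr(str, instr(str, '_') + 1) as after_first,
  -- a negative third argument makes instr() search backwards from the end
  substr(str, instr(str, '_', -1) + 1) as after_last
from t;
```

For these three rows the two columns should match the first_underscore and last_underscore results shown above. Keep the CASE expression if you want different handling for values without an underscore, for example returning null instead of the original string.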
If you can get the result you need in more than one way then it is generally more efficient to avoid regular expressions, but sometimes they are the only (sane) choice. As always it's best to test the options yourself against your actual data to see what is most efficient - or at least efficient enough.