I am a bit hesitant to even post my answer. You really must upgrade: version 8.2 has reached end of life. Follow the link that @a_horse_with_no_name posted.
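To give an idea of what an upgrade buys you, here is a minimal sketch for a current version, assuming the goal is to collect every "XXX" followed by digits into a comma-separated list. It relies on regexp_matches() and string_agg(), which need PostgreSQL 9.0 or later, so it will not run on 8.2. The aliases d and m and the two sample rows are only illustrative.

```sql
-- Sketch for PostgreSQL 9.0+ (not 8.2): one row per match, then aggregate.
SELECT d.content
     , (SELECT string_agg(m[1], ',')                          -- NULL when nothing matches
        FROM   regexp_matches(d.content, E'XXX\\d+', 'g') AS m -- m is a text[] per match
       ) AS result
FROM  (VALUES
   ('no match')
  ,('match middle XXX010105 and end XXX010106')
  ) d(content);
```

Without an ORDER BY inside string_agg() the order of the matches is not formally guaranteed, so treat this as a starting point rather than a finished solution.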
However, the question caught my attention. The following demo should work with PostgreSQL 8.2:
    SELECT -- content,
           rtrim(
           regexp_replace(
           regexp_replace(
           replace(
           substring(content
          ,E'(XXX\\d+|XXX\\d+.*XXX\\d+)')  -- kill leading / trailing noise
          ,',',' ')                        -- remove all ","
          ,E'(XXX\\d+)', E'\\1,', 'g')     -- terminate X-strings with ","
          -- now we can use non-greedy terminated with ","
          ,E'(XXX\\d+?,)*?.*?(XXX\\d+?,)', E'\\1\\2', 'g')
          ,',') AS result
    FROM  (VALUES
       ('no match')
      ,('XXX010101')
      ,('XXX010102 beginn')
      ,('end XXX010103')
      ,('middle XXX010104 match')
      ,('match middle XXX010105 and end XXX010106')
      ,('XXX010107, match beginn XXX010108 & middle')
      ,('XXX010109 begin and end XXX010110')
      ,('XXX01011 begin XXX010112 middle and end XXX010113')
      ,('XXX01014 begin XXX010115 close matches XX010113 XXXy010113 23624 ,XXX010116')
      ,('f XXX01017 B XXX010118 457XXX010119 XXXy XXX010120 overkill XXX010121end')
      ) data(content);
Result:
                          result
    --------------------------------------------------
                                                       -- first line is NULL
     XXX010101
     XXX010102
     XXX010103
     XXX010104
     XXX010105,XXX010106
     XXX010107,XXX010108
     XXX010109,XXX010110
     XXX01011,XXX010112,XXX010113
     XXX01014,XXX010115,XXX010116
     XXX01017,XXX010118,XXX010119,XXX010120,XXX010121
Some explanation:
This makes heavy use of the features of PostgreSQL's POSIX regular expressions (link to the manual for version 8.2!).
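To make the nesting easier to follow, the same five calls can be unrolled so that each intermediate string shows up as its own column. This is only a sketch for inspection: the column names s1 to s4, the subquery aliases a to d, and the choice of a single sample row are mine, not part of the original query. Step 1 depends on PostgreSQL treating a regular expression with "|" branches as greedy, so the longer branch from the first to the last X-string wins.

```sql
-- Unrolled version of the demo query, one column per step (illustrative names).
SELECT s1, s2, s3, s4
     , rtrim(s4, ',') AS result                    -- step 5: drop the trailing ","
FROM  (
   SELECT s1, s2, s3
        , regexp_replace(s3, E'(XXX\\d+?,)*?.*?(XXX\\d+?,)', E'\\1\\2', 'g') AS s4
                                                   -- step 4: keep only the terminated X-strings
   FROM  (
      SELECT s1, s2
           , regexp_replace(s2, E'(XXX\\d+)', E'\\1,', 'g') AS s3
                                                   -- step 3: terminate X-strings with ","
      FROM  (
         SELECT s1
              , replace(s1, ',', ' ') AS s2        -- step 2: remove all ","
         FROM  (
            SELECT substring(content, E'(XXX\\d+|XXX\\d+.*XXX\\d+)') AS s1
                                                   -- step 1: kill leading / trailing noise
            FROM  (VALUES ('XXX010107, match beginn XXX010108 & middle')) data(content)
            ) a
         ) b
      ) c
   ) d;
```

For this sample row the final result column should match the corresponding line of the result listing, XXX010107,XXX010108. Swapping in other rows from the demo is a quick way to see where a given input gets trimmed.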