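Here's a version which pulls all the digits out of a string; i.e. given `I'm 35 years old; I was born in 1982. The average family has 2.4 children.` it returns `35198224`. That works well where you have numeric data which may have been formatted as a code (e.g. `#123,456,789` / `123-00005`), but it isn't appropriate if you want to extract specific numbers (as opposed to just the digit characters) from the text. It also only handles digits, so it won't return negative signs (`-`) or periods (`.`).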
    declare @table table (id bigint not null identity (1,1), data nvarchar(max))
    insert @table (data)
    values ('hello 123 its 45613 then') --outputs: 12345613
    ,('1 some other string 98 example 4') --outputs: 1984
    ,('AB ABCDE # 123') --outputs: 123
    ,('ABCDE# 123') --outputs: 123
    ,('AB: ABC# 123') --outputs: 123

    ; with NonNumerics as (
        select id
        , data original
        --the below line strips all digits, leaving only the non-numeric characters
        , replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(data,'0',''),'1',''),'2',''),'3',''),'4',''),'5',''),'6',''),'7',''),'8',''),'9','') nonNumeric
        from @table
    )
    --each iteration of the below CTE removes another non-numeric character from the original string, putting the result into the numerics column
    --datalength is used rather than len because len ignores trailing spaces, so a remainder made up only of spaces would be counted as empty and the spaces would be left in the result
    , Numerics as (
        select id
        , replace(original, substring(nonNumeric,1,1), '') numerics
        , replace(nonNumeric, substring(nonNumeric,1,1), '') charsToreplace
        , datalength(replace(nonNumeric, substring(nonNumeric,1,1), '')) charsRemaining
        from NonNumerics

        union all

        select id
        , replace(numerics, substring(charsToreplace,1,1), '') numerics
        , replace(charsToreplace, substring(charsToreplace,1,1), '') charsToreplace
        , datalength(replace(charsToreplace, substring(charsToreplace,1,1), '')) charsRemaining
        from Numerics
        where charsRemaining > 0
    )
    --we select only those strings with `charsRemaining=0`; i.e. the rows for which all non-numeric characters have been removed; there should be 1 row returned for every 1 row in the original data set.
    select * from Numerics where charsRemaining = 0

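This code works by first removing all the digits (i.e. the characters we want) from the given string, replacing each with an empty string. What remains is a list of the non-numeric characters the string contains. It then goes through the original string (which still includes the digits) and removes each of those remaining characters in turn, leaving only the digits.

The reason we do this in two steps, rather than removing all the non-numeric characters in the first place, is that there are only 10 digits, whilst there is a huge number of possible non-numeric characters. Replacing that small list of digits is relatively fast, and it gives us a list of the non-numeric characters which actually occur in the string, so we then only have to replace that small set.

The method makes use of recursive SQL, using common table expressions (CTEs).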
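
To make the recursion easier to follow, here is a minimal sketch that traces a single string and returns every intermediate row instead of only the final one. The `@data` variable and the `step` column are additions for illustration and are not part of the query above. It tests `datalength` rather than `len` so that a remainder consisting only of spaces is still processed.

```sql
-- Hypothetical single-value input, used only for this trace
declare @data nvarchar(max) = N'AB: ABC# 123';

with NonNumerics as (
    select @data original
    -- strip the ten digits to find which non-numeric characters are present
    , replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@data,'0',''),'1',''),'2',''),'3',''),'4',''),'5',''),'6',''),'7',''),'8',''),'9','') nonNumeric
)
, Numerics as (
    -- step 0: nothing removed yet
    select 0 step
    , original numerics
    , nonNumeric charsToReplace
    from NonNumerics

    union all

    -- each step removes every occurrence of the first remaining non-numeric character
    select step + 1
    , replace(numerics, substring(charsToReplace,1,1), '')
    , replace(charsToReplace, substring(charsToReplace,1,1), '')
    from Numerics
    where datalength(charsToReplace) > 0
)
select step, numerics, charsToReplace
from Numerics
order by step;
```

Each row shows the string after one more distinct character has been removed. The last row, where `charsToReplace` is empty, holds only the digits. The number of steps equals the number of distinct non-numeric characters in the string, so an input with more than 100 of them would exceed SQL Server's default recursion limit unless an `option (maxrecursion ...)` hint is added.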