How to extract all valid email addresses from a string as a comma/semicolon-separated list

SELECT dbo.getEmailAddresses('this is misc andrew@g.com')

--output andrew@g.com

SELECT dbo.getEmailAddresses('this is misc andrew@g.com and a medium text returning %John@acme.com')
--output andrew@g.com; John@acme.com

3 Answers

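Here is a UDF. If you need to allow other characters (for example, non-English characters in email addresses), adjust the line SET @MailTempl='[A-Za-z0-9_.-]'. It is the complete list of characters the function accepts in an address.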

CREATE FUNCTION [dbo].[getEmailAddresses] (@Str varchar(8000))  
RETURNS varchar(8000) AS  
BEGIN 

declare @i int, @StartPos int,@AtPos int,@EndPos int;
declare @MailList varchar(8000);
declare @MailTempl varchar(100);


SET @MailList=NULL;
SET @MailTempl='[A-Za-z0-9_.-]'; --allowing symbols in e-mail not including @

SET @AtPos=PATINDEX('%'+@MailTempl+'@'+@MailTempl+'%',@Str)+1;
While @AtPos>1
begin
  --go left
  SET @i=@AtPos-1;
  while (substring(@Str,@i,1) like @MailTempl) SET @i=@i-1;
  SET @StartPos=@i+1;

  --go right
  SET @i=@AtPos+1;
  while (substring(@Str,@i,1) like @MailTempl) SET @i=@i+1;
  SET @EndPos=@i-1;

  SET @MailList=isnull(@MailList+';','')+Substring(@Str,@StartPos,@EndPos-@StartPos+1);

  --prepare for the next round
  SET @Str=substring(@Str,@EndPos+1,LEn(@Str));
  SET @AtPos=PATINDEX('%'+@MailTempl+'@'+@MailTempl+'%',@Str)+1;

end;

RETURN @MailList;

END
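
As a quick check, the calls below run the function against the two inputs from the question plus two made-up inputs (the third and fourth strings are illustrative, not from the original post). The results in the comments assume the function was created exactly as posted and that CONCAT_NULL_YIELDS_NULL is at its default of ON.

```sql
-- Assumes dbo.getEmailAddresses from this answer already exists.
SELECT dbo.getEmailAddresses('this is misc andrew@g.com');
-- andrew@g.com

SELECT dbo.getEmailAddresses('this is misc andrew@g.com and a medium text returning %John@acme.com');
-- andrew@g.com;John@acme.com
-- '%' is not in @MailTempl, so the leftward scan stops right after it.

-- Illustrative input: a sentence-ending period directly after the address.
SELECT dbo.getEmailAddresses('write to bob@x.org.');
-- bob@x.org.
-- '.' is in @MailTempl, so the rightward scan keeps the trailing period.

-- Illustrative input: no '@' at all, so the loop never runs.
SELECT dbo.getEmailAddresses('no address here');
-- NULL
```

The list is joined with ';' only. To get the '; ' separator shown in the question, the ';' literal inside isnull(@MailList+';','') would have to be changed to '; '.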
answered 2012-08-23T07:30:54.197

Try this:

create function getEmailAddresses
(
@test varchar(max)
)
returns varchar(max)
As
BEGIN
declare @emaillist varchar(max)
--SET @test=' this is it by it a@b.com dsdkjl dsaldkj a@b.com dasdlk c@bn.com dsafhjkf anand@p.com d fdajf s@s.com .'

;WITH CTE as(
select reverse(left(reverse(left(@test,CHARINDEX('.com',@test)+3)),charindex(' ',reverse(left(@test,CHARINDEX('.com',@test)+3))))) as emailids,
right(@test,len(@test)-(CHARINDEX('.com',@test)+3)) rem
union all
select CASE WHEN len(rem)>2 then reverse(left(reverse(left(rem,CHARINDEX('.com',rem)+3)),charindex(' ',reverse(left(rem,CHARINDEX('.com',rem)+3))))) else 'a' end as emailids ,
CASE WHEN len(rem) > 2 then right(rem,len(rem)-(CHARINDEX('.com',rem)+3)) else 'a' end rem
from CTE where LEN(rem)>2
)
select @emaillist =STUFF((select ','+emailids  from CTE for XML PATH('')),1,1,'')
return @emaillist
END

select dbo.getEmailAddresses('this is it by it a@b.com dsdkjl dsaldkj a@b.com dasdlk c@bn.com dsafhjkf anand@p.com d fdajf s@s.com .')
answered 2012-08-23T08:22:27.920

Try this:

Declare @str varchar(max) = 'this is misc andrew@g.com and a medium text returning John@acme.com'

    ;With Cte AS(
    SELECT
        Items = Split.a.value('.', 'VARCHAR(100)')
    FROM
            (
                SELECT 
                    CAST('<X>' + REPLACE(@str,  ' '  , '</X><X>') + '</X>' AS XML) AS Splitdata  

            ) X  

        CROSS APPLY Splitdata.nodes('/X') Split(a) )

SELECT Email = STUFF((
                SELECT ';'+ Items
                FROM Cte
                Where Items
                LIKE '[A-Z0-9]%[@][A-Z]%[.][A-Z]%'
                FOR XML PATH('')),1,1,'')

Result

Email

andrew@g.com;John@acme.com

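N.B. You may need to do the following:

a) Depending on your requirements, you may need to wrap this in a TVF (table-valued function). You can refer to the article on a split function in SQL Server using a set-based approach.

b) The email-validation LIKE clause can be used as it is, but for more complex requirements you may need to enhance it.

c) If necessary, clean the data before applying the filter clause. For example, %John@acme.com is not treated as a valid email by the filter above, so remove the '%' sign first and then apply the filter clause.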
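
Point c) can be sketched as follows. This is only an illustration, not a complete cleaning routine: the variable name @clean is made up here, only '%' is stripped, and the LIKE filter is assumed to run under a case-insensitive collation so that lowercase letters match [A-Z].

```sql
DECLARE @str varchar(max) = 'this is misc andrew@g.com and a medium text returning %John@acme.com';

-- @clean is a hypothetical helper variable: strip the stray '%' before splitting.
-- Other stray punctuation, and XML-special characters such as '&' or '<'
-- (which would make the CAST to XML fail), would need similar handling.
DECLARE @clean varchar(max) = REPLACE(@str, '%', '');

;WITH Cte AS (
    -- Split on spaces by turning the string into <X>token</X> elements.
    SELECT Items = Split.a.value('.', 'VARCHAR(100)')
    FROM (
        SELECT CAST('<X>' + REPLACE(@clean, ' ', '</X><X>') + '</X>' AS XML) AS Splitdata
    ) X
    CROSS APPLY Splitdata.nodes('/X') Split(a)
)
SELECT Email = STUFF((
            SELECT ';' + Items
            FROM Cte
            WHERE Items LIKE '[A-Z0-9]%[@][A-Z]%[.][A-Z]%'
            FOR XML PATH('')), 1, 1, '');
```

Without the REPLACE step, the token %John@acme.com starts with '%' and is dropped by the filter. With it, both addresses should come through. The subquery has no ORDER BY, so the order of the joined items is not formally guaranteed.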

But as someone mentioned, it is better not to do too much string splitting/manipulation on the SQL Server side. I agree, so here is C# code that does the same:

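using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class Program
{
    // Built once and reused for every token; ^ and $ force the whole token to match.
    private static readonly Regex EmailRegex =
        new Regex(@"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$", RegexOptions.IgnoreCase);

    static void Main(string[] args)
    {
        string str = "this is misc andrew@g.com and a medium text returning John@acme.com";

        // string.Join yields an empty string when no address is found.
        var result = string.Join(";", GetValidEmails(str));
        Console.WriteLine(result);
        Console.ReadKey();
    }

    // Splits the input on spaces and keeps the tokens that look like email addresses.
    private static List<string> GetValidEmails(string input)
    {
        List<string> lstValidEmails = new List<string>();
        foreach (string email in input.Split(' '))
        {
            if (EmailRegex.IsMatch(email))
            {
                lstValidEmails.Add(email);
            }
        }
        return lstValidEmails;
    }
}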

Hope this helps.

answered 2012-08-23T09:06:15.623