java - Identifying an e-mail field without using regular expressions

Question

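We have a tokenizer that tokenizes a text file. The logic it follows is odd, but it is necessary in our context.

An e-mail address such as xyz.zyx@gmail.com

will produce the following tokens: xyz . zyx @ gmail

I would like to know how we can recognize the field as an e-mail address if we are only allowed to use these tokens. Regular expressions are not allowed. We may only use a token and its surrounding tokens to decide whether the field is an e-mail field.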

score 0 · Accepted Answer

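Check whether the token list is an e-mail address:

The list contains exactly one @ token
The index of the @ token is != 0
There are at least 3 tokens after the @
There is at least 1 . token after the @, but not directly after it
The list starts and ends with a character token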
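
The basic checks above could be sketched in Java roughly as follows. This is only an illustration under assumptions: the tokens are taken to arrive as a `List<String>` that also includes the top-level domain, a "character token" is taken to mean any non-empty token other than `@` and `.`, and the names `EmailTokenCheck`, `isWord` and `looksLikeEmail` are made up for the example.

```java
import java.util.List;

public class EmailTokenCheck {

    // Assumption: a "character token" is any non-empty token that is neither "@" nor ".".
    static boolean isWord(String token) {
        return !token.isEmpty() && !token.equals("@") && !token.equals(".");
    }

    static boolean looksLikeEmail(List<String> tokens) {
        int at = tokens.indexOf("@");
        // Exactly one "@" token, and it is not the first token.
        if (at <= 0 || at != tokens.lastIndexOf("@")) {
            return false;
        }
        // At least 3 tokens after the "@".
        if (tokens.size() - at - 1 < 3) {
            return false;
        }
        // The token directly after "@" must not be a "." ...
        if (tokens.get(at + 1).equals(".")) {
            return false;
        }
        // ... but a "." must appear somewhere later.
        if (!tokens.subList(at + 2, tokens.size()).contains(".")) {
            return false;
        }
        // The list starts and ends with a character token.
        return isWord(tokens.get(0)) && isWord(tokens.get(tokens.size() - 1));
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail(List.of("xyz", ".", "zyx", "@", "gmail", ".", "com"))); // true
        System.out.println(looksLikeEmail(List.of("xyz", "@", "gmail")));                         // false
    }
}
```

`List.of` needs Java 9 or later; on older versions `Arrays.asList` would do the same job here.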

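Additional checks:

No two consecutive . tokens
No special characters
Character tokens after the @ have a length of at least 2
The total length of all character tokens before the @ is at least 3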
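
A possible Java sketch of the additional checks, meant to run on a token list that already has the expected overall shape. The interpretation is an assumption in two places: "special character" is read as anything that is not a letter or digit, and the length rule is applied to every character token after the `@`. The names `EmailTokenExtraChecks`, `isPlainWord` and `passesExtraChecks` are hypothetical.

```java
import java.util.List;

public class EmailTokenExtraChecks {

    // Assumption: "no special characters" means letters and digits only.
    static boolean isPlainWord(String token) {
        return !token.isEmpty() && token.chars().allMatch(Character::isLetterOrDigit);
    }

    static boolean passesExtraChecks(List<String> tokens) {
        int at = tokens.indexOf("@");
        if (at < 0) {
            return false;
        }
        int lengthBeforeAt = 0;
        for (int i = 0; i < tokens.size(); i++) {
            String t = tokens.get(i);
            if (t.equals(".")) {
                // No two consecutive "." tokens.
                if (i + 1 < tokens.size() && tokens.get(i + 1).equals(".")) {
                    return false;
                }
            } else if (!t.equals("@")) {
                // Character token: no special characters allowed.
                if (!isPlainWord(t)) {
                    return false;
                }
                if (i < at) {
                    lengthBeforeAt += t.length();
                } else if (t.length() < 2) {
                    // Character tokens after "@" need at least 2 characters.
                    return false;
                }
            }
        }
        // Character tokens before "@" need at least 3 characters in total.
        return lengthBeforeAt >= 3;
    }

    public static void main(String[] args) {
        System.out.println(passesExtraChecks(List.of("xyz", ".", "zyx", "@", "gmail", ".", "com"))); // true
        System.out.println(passesExtraChecks(List.of("ab", "@", "gmail", ".", "com")));              // false
    }
}
```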

score 0 · Accepted Answer

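Logically divide the e-mail address into 3 parts:

A user name (or resource name); for the sake of this explanation, let's call it the user name.
The @ character.
A host name, consisting of any number of "word dot" sequences plus a final top-level domain string.

Walk through the tokens like this: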

    while token can be part of a user name
        fetch next token;
        if there are no more -> no e-mail;

    check if the next token is @
    if not -> no e-mail

    while there are tokens
        while token can be part of a host name subpart (the "word" above)
            fetch next token;
            if there are no more -> might be a valid e-mail address

        check if the next token is a dot
        if not -> might be a valid e-mail address
        set a flag that you found at least one dot

        check if the next token can be part of a host name subpart
        if not -> no valid e-mail address (or maybe you ignore a trailing dot and take what was found so far)

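Add further checks if you need more tokens. You may also have to post-process the tokens you found to make sure they form a valid e-mail address, and you may have to rewind your tokenizer (or cache the fetched tokens) in case you did not find a valid e-mail address and need to feed the same input to other recognition routines.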
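
One way the walk could look in Java, as a rough sketch rather than a finished recognizer. It assumes the tokens are already cached in a `List<String>`, so "rewinding" just means starting again from the same index. It also simplifies in ways the answer does not prescribe: a word is letters and digits only, the user name may contain `.` tokens, each host name subpart is a single word token, and a trailing dot is ignored. `EmailTokenWalker`, `isWord` and `matchEmail` are hypothetical names.

```java
import java.util.List;

public class EmailTokenWalker {

    // Assumption: a word token consists of letters and digits only.
    static boolean isWord(String token) {
        return !token.isEmpty() && token.chars().allMatch(Character::isLetterOrDigit);
    }

    /**
     * Tries to match an e-mail address starting at index {@code start}.
     * Returns the index just past the last matched token, or -1 if there is no match.
     */
    static int matchEmail(List<String> tokens, int start) {
        int i = start;
        // User name: words and dots (assumption), at least one token.
        while (i < tokens.size() && (isWord(tokens.get(i)) || tokens.get(i).equals("."))) {
            i++;
        }
        // The next token must be "@"; running out of tokens means no e-mail.
        if (i == start || i >= tokens.size() || !tokens.get(i).equals("@")) {
            return -1;
        }
        i++;
        // Host name: first subpart.
        if (i >= tokens.size() || !isWord(tokens.get(i))) {
            return -1;
        }
        i++;
        // Further ". word" pairs; a dot without a following word is left unconsumed.
        boolean sawDot = false;
        while (i + 1 < tokens.size() && tokens.get(i).equals(".") && isWord(tokens.get(i + 1))) {
            sawDot = true;
            i += 2;
        }
        // Require at least one dot in the host name.
        return sawDot ? i : -1;
    }

    public static void main(String[] args) {
        System.out.println(matchEmail(List.of("xyz", ".", "zyx", "@", "gmail", ".", "com"), 0)); // 7
        System.out.println(matchEmail(List.of("xyz", "@", "gmail"), 0));                         // -1
    }
}
```

Returning an end index instead of a boolean lets the caller continue tokenizing after the address, or retry the same start index with a different recognizer when the result is -1.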

score 0 · Accepted Answer

OK, try some (bad) logic like this:

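    // Deliberately crude: works on the raw string and only accepts
    // input that has a "." somewhere before the "@".
    static boolean looksLikeEmail(String str) {
        int i = 0, j = 0;
        if (str.contains(".") && str.contains("@")) {
            // i = index of the first ".", j = index of the first "@"
            if ((i = str.indexOf(".")) < (j = str.indexOf("@"))) {
                if (i != 0 && i + 1 != j) {   // ignore strings like .@ , abc.@
                    return true;
                }
            }
        }
        return false;
    }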
