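Suppose I have a String s that represents a move on a Scrabble board: "AARDV RK"

I have a HashSet<String> called dict that contains the entire Scrabble Dictionary (about 180,000 words!).

How can I use a regex to search dict for s, where the whitespace character stands for any uppercase letter?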

3 Answers

Description

Rather than looking for a random letter, you could look for all the words that fit the player's current tray of tiles.

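Consider the following PowerShell example of regex and logic. Here I use logic to build a regex based on the tiles the player currently holds. The resulting regex has two distinct parts: the first matches between 1 and the total number of each letter in the player's tray, and the second matches between 0 and that total +1 for one letter at a time. So if the player has 2 A's in their tray, the regex will try to match between 0 and 3 A's, while every other letter must still appear between 1 and its total count. This process is repeated for each letter.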
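Since the question is about Java, here is a minimal Java sketch of the same construction. The class and method names (TileRegexBuilder, countTiles, block, build) are hypothetical, and the sketch assumes the tiles are plain letters A-Z and the dictionary is uppercase. It is not a port of every detail of the PowerShell source below.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class TileRegexBuilder {

    // Count the tiles, e.g. "aardvrk" -> A=2, R=2, D=1, V=1, K=1 (assumes letters only).
    static Map<Character, Integer> countTiles(String tiles) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : tiles.toUpperCase().toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    // One lookahead block: the word holds between min and max copies of the letter.
    static String block(char letter, int min, int max) {
        return "(?=^([^" + letter + "]*?" + letter + "){" + min + "," + max + "}"
                + "(?![^" + letter + "]*?" + letter + "))";
    }

    // First alternative: 1..n of every letter. Then one alternative per letter
    // that relaxes only that letter to 0..n+1.
    static Pattern build(String tiles) {
        Map<Character, Integer> counts = countTiles(tiles);
        List<String> alternatives = new ArrayList<>();

        StringBuilder exact = new StringBuilder();
        counts.forEach((letter, n) -> exact.append(block(letter, 1, n)));
        alternatives.add(exact.toString());

        for (char relaxed : counts.keySet()) {
            StringBuilder alt = new StringBuilder();
            counts.forEach((letter, n) -> alt.append(
                    letter == relaxed ? block(letter, 0, n + 1) : block(letter, 1, n)));
            alternatives.add(alt.toString());
        }
        return Pattern.compile(String.join("|", alternatives));
    }

    public static void main(String[] args) {
        Pattern p = build("aardvrk");
        for (String word : List.of("AARDVARK", "AARDVRKS", "ANTHILL", "KITTENS")) {
            System.out.println(word + " -> " + p.matcher(word).find());
        }
    }
}
```

The pattern consists only of lookaheads, so find() is used rather than matches(); letters that are not in the tray are not checked at this stage.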

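So, given your example: if the player has aardvrk in their tray, the regex will find all the words that contain all of those letters, like aardvark. aardvarks will also match, but we eliminate it later by filtering the matches on word length with where {$_.Length -le $($PlayerTiles.Length + 1)}. So a word cannot be more than one tile longer than the player's tray.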
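The length filter on its own could be sketched in Java as follows. LengthFilter and withinOneExtraTile are hypothetical names, and the candidate list stands in for whatever the tile regex matched.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; the names are illustrative only.
final class LengthFilter {

    // Keep only words that need at most one tile more than the player holds.
    static List<String> withinOneExtraTile(List<String> candidates, String playerTiles) {
        int maxLength = playerTiles.length() + 1;
        return candidates.stream()
                .filter(word -> word.length() <= maxLength)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("AARDVARK", "AARDVRKS", "AARDVARKS");
        // AARDVARKS has 9 letters, two more than the 7 tiles, so it is dropped.
        System.out.println(withinOneExtraTile(candidates, "aardvrk"));
    }
}
```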

Next, I build a regex that finds the letters in each matched word which the player does not currently have in their tray.

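Of course, there are edge cases where this particular logic can fail, such as when the player is missing two letters to spell a word. Knowing about those words could be useful in the rare case where the board layout contains the letters you are looking for. You could address this by evaluating the board layout and including every single letter on the board as if it were part of the player's tray. That evaluation would need to be smart enough to recognise letters that cannot be used because of the board layout. It could also be made smart enough to identify multi-character strings on the board that can legally be used. But all of that is beyond the scope of your original question.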

Note

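Depending on your language of choice, you may need to replace *? with something like {0,100}. This is because some languages [like Java] implement lookbehind in a way that does not allow the search string to be of undetermined size.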
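A minimal Java sketch of that substitution, for a single letter. BoundedLookbehindDemo and firstSurplusIndex are hypothetical names, and the bound of 100 is simply the figure suggested above. The sketch also writes the repetition out piece by piece instead of putting {n} on a group inside the lookbehind. That is a conservative choice on my part, not something the note above requires.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo; the names are illustrative only.
final class BoundedLookbehindDemo {

    // Index of the first copy of 'letter' that follows 'owned' earlier copies, or -1.
    // Assumes owned >= 1 and that 'letter' is a plain letter.
    static int firstSurplusIndex(String word, char letter, int owned) {
        StringBuilder behind = new StringBuilder();
        for (int i = 0; i < owned; i++) {
            // {0,100} stands in for *? so the lookbehind has a bounded length.
            behind.append(letter).append("[^").append(letter).append("]{0,100}");
        }
        Matcher m = Pattern.compile("(?<=" + behind + ")" + letter).matcher(word);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstSurplusIndex("AARDVARK", 'A', 2)); // third A, index 5
        System.out.println(firstSurplusIndex("AARDVRKS", 'A', 2)); // no third A, -1
    }
}
```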

Source code

    $Matches = @()
    [array]$Dictionary = @()

    $Dictionary += 'AARDVARK'
    $Dictionary += 'AARDVRKS'
    $Dictionary += 'AARDVARKS'
    $Dictionary += 'ANTHILL'
    $Dictionary += 'JUMPING'
    $Dictionary += 'HILLSIDE'
    $Dictionary += 'KITTENS'
    $Dictionary += 'LOVER'
    $Dictionary += 'LOVE'
    $Dictionary += 'LOVES'
    $Dictionary += 'LOVELY'
    $Dictionary += 'OLIVE'
    $Dictionary += 'VOTE'


    $PlayerTiles = "aardvrk"

Function funBuildRegexForPlayerTiles ([string]$GivenTiles) {

    # split the GivenTiles so each letter is separate, and store these in a hashtable so the letter is the key name and the number of times it's seen is the value. This deduplicates each letter
    [hashtable]$SearchForTiles = @{}
    foreach ($Letter in $GivenTiles[0..$($GivenTiles.Length - 1)] ) {
        $SearchForTiles[$Letter] += 1
        } # next letter

    # build regex for tiles to match just the tiles we have 
    [string]$SameNumberRegex = ""
    foreach ($Letter in $SearchForTiles.Keys) {
        $SameNumberRegex += "(?=^([^$Letter]*?$Letter){1,$($SearchForTiles[$Letter])}(?![^$Letter]*?$Letter))"
        } # next letter


    # add to the regex to include one extra letter of each type. 
    [array]$ExtraLetterRegex = @()
    foreach ($MasterLetter in $SearchForTiles.Keys) {
        [string]$TempRegex = ""
        foreach ($Letter in $SearchForTiles.Keys) {
            if ($MasterLetter -ieq $Letter) {
                # this forces each letter to allow zero to one extra of itself in the dictionary string. This allows us to match words which would have all the other letters and none of this letter
                $TempRegex += "(?=^([^$Letter]*?$Letter){0,$($SearchForTiles[$Letter] + 1)}(?![^$Letter]*?$Letter))"

                } else {
                # All the rest of these tiles on this regex section will need to have just the number of tiles the player has
                $TempRegex += "(?=^([^$Letter]*?$Letter){1,$($SearchForTiles[$Letter])}(?![^$Letter]*?$Letter))"
                } # end if
            } # next letter
        $ExtraLetterRegex += $TempRegex

        Write-Host "To match an extra '$MasterLetter': " $TempRegex
        } # next MasterLetter

    # put it all together
    [array]$AllRegexs = @()
    $AllRegexs += $SameNumberRegex
    $AllRegexs += $ExtraLetterRegex


    # stitch all the regexs together to make a massive regex 
    [string]$Output = $AllRegexs -join "|"

    return $Output
    } # end function funBuildRegexForPlayerTiles        


Function funBuildMissingLetterRegex ([string]$GivenTiles) {
    # split the GivenTiles so each letter is separate, and store these in a hashtable so the letter is the key name and the number of times it's seen is the value. This deduplicates each letter
    [hashtable]$SearchForTiles = @{}
    foreach ($Letter in $GivenTiles[0..$($GivenTiles.Length - 1)] ) {
        $SearchForTiles[$Letter] += 1
        } # next letter

    [array]$MissingLetterRegex = @()
    # include any letters which do not match the current tiles
    $MissingLetterRegex += "(?i)([^$($SearchForTiles.Keys -join '')])"

    # build the regex to find the missing tiles
    foreach ($Letter in $SearchForTiles.Keys) {
        $MissingLetterRegex += "(?i)(?<=($Letter[^$Letter]*?){$($SearchForTiles[$Letter])})($Letter)"
        } # next letter

    [string]$Output = $MissingLetterRegex -join "|"
    return $Output
    } # end function


    [string]$Regex = funBuildRegexForPlayerTiles -GivenTiles $PlayerTiles
    Write-Host "Player tiles '$PlayerTiles'"
    Write-Host "Regex = '$Regex'"
    Write-Host "Matching words = "  
    $MatchedWords = $Dictionary -imatch $Regex | where {$_.Length -le $($PlayerTiles.Length + 1)}

    [string]$MissingLetterRegex = funBuildMissingLetterRegex $PlayerTiles
    foreach ($Word in $MatchedWords) {
        Write-Host $Word -NoNewline
        # find all the letters for which the player doesn't have a matching tile
        [array]$MissingTiles = ([regex]"$MissingLetterRegex").matches($Word) | foreach {
            Write-Output $_.Groups[0].Value
            } # next match
        Write-Host "`tLetters you are missing to spell this work '$($MissingTiles -join '')'"
        } # next word

    Write-Host -------------------------------

    $PlayerTiles = "OLLVE"
    [hashtable]$SearchForTiles = @{}

    # build regex for tiles
    [string]$Regex = funBuildRegexForPlayerTiles -GivenTiles $PlayerTiles


    Write-Host "Player tiles '$PlayerTiles'"
    Write-Host "Regex = '$Regex'"
    Write-Host
    Write-Host "Matching words = "  
    $MatchedWords = $Dictionary -imatch $Regex | where {$_.Length -le $($PlayerTiles.Length + 1)}

    [string]$MissingLetterRegex = funBuildMissingLetterRegex $PlayerTiles
    foreach ($Word in $MatchedWords) {
        Write-Host $Word -NoNewline
        # find all the letters for which the player doesn't have a matching tile
        [array]$MissingTiles = ([regex]"$MissingLetterRegex").matches($Word) | foreach {
            Write-Output $_.Groups[0].Value
            } # next match
        Write-Host "`tLetters you are missing to spell this work '$($MissingTiles -join '')'"
        } # next word

Output

To match an extra 'r':  (?=^([^r]*?r){0,3}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))
To match an extra 'v':  (?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){0,2}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))
To match an extra 'a':  (?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){0,3}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))
To match an extra 'k':  (?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){0,2}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))
To match an extra 'd':  (?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){0,2}(?![^d]*?d))
Player tiles 'aardvrk'
Regex = '(?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))|(?=^([^r]*?r){0,3}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))|(?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){0,2}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))|(?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){0,3}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))|(?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){0,2}(?![^k]*?k))(?=^([^d]*?d){1,1}(?![^d]*?d))|(?=^([^r]*?r){1,2}(?![^r]*?r))(?=^([^v]*?v){1,1}(?![^v]*?v))(?=^([^a]*?a){1,2}(?![^a]*?a))(?=^([^k]*?k){1,1}(?![^k]*?k))(?=^([^d]*?d){0,2}(?![^d]*?d))'
Matching words = 
AARDVARK    Letters you are missing to spell this work 'A'
AARDVRKS    Letters you are missing to spell this work 'S'
-------------------------------
To match an extra 'O':  (?=^([^O]*?O){0,2}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))
To match an extra 'E':  (?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){0,2}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))
To match an extra 'L':  (?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){0,3}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))
To match an extra 'V':  (?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){0,2}(?![^V]*?V))
Player tiles 'OLLVE'
Regex = '(?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))|(?=^([^O]*?O){0,2}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))|(?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){0,2}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))|(?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){0,3}(?![^L]*?L))(?=^([^V]*?V){1,1}(?![^V]*?V))|(?=^([^O]*?O){1,1}(?![^O]*?O))(?=^([^E]*?E){1,1}(?![^E]*?E))(?=^([^L]*?L){1,2}(?![^L]*?L))(?=^([^V]*?V){0,2}(?![^V]*?V))'

Matching words = 
LOVER   Letters you are missing to spell this work 'R'
LOVE    Letters you are missing to spell this work ''
LOVES   Letters you are missing to spell this work 'S'
LOVELY  Letters you are missing to spell this work 'Y'
OLIVE   Letters you are missing to spell this work 'I'
VOTE    Letters you are missing to spell this work 'T'

Summary

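The first part looks for matching words. I build it from blocks of this regex, one per letter: (?=^([^$Letter]*?$Letter){1,$($SearchForTiles[$Letter])}(?![^$Letter]*?$Letter)). The blocks for the individual letters are concatenated, and the resulting alternatives are separated by | (or) statements.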

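  • (?=  starts a zero-width assertion (a lookahead)
    • ^  matches the start of the string
    • (  opens a group for the required sequence of characters
    • [^$Letter]*?  matches any character other than the letter we are looking for, zero or more times, non-greedy
    • $Letter  matches the letter
    • )  closes the group
  • {  forces the group to occur
    • 1  at least once
    • ,
    • $($SearchForTiles[$Letter])  at most the total number of this tile that the player has
    • }  ends the quantity check
  • (?!  starts a negative lookahead to prevent any
    • [^$Letter]*?  any number of characters that are not this letter
    • $Letter  followed by this letter
    • )  closes the lookahead
  • )  ends this zero-width assertion, which in effect makes sure no further copies of this letter remain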
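The breakdown above can be tried on a single letter. This is a minimal Java sketch with hypothetical names (LookaheadBlockDemo, hasOneToOwned); it compiles the pattern on every call, which is fine for a demo but not for scanning a whole dictionary.

```java
import java.util.regex.Pattern;

// Hypothetical demo; the names are illustrative only.
final class LookaheadBlockDemo {

    // True when 'word' holds between 1 and 'owned' copies of 'letter'.
    static boolean hasOneToOwned(String word, char letter, int owned) {
        String block = "(?=^([^" + letter + "]*?" + letter + "){1," + owned + "}"
                + "(?![^" + letter + "]*?" + letter + "))";
        // The block is a lookahead only, so find() is used instead of matches().
        return Pattern.compile(block).matcher(word).find();
    }

    public static void main(String[] args) {
        System.out.println(hasOneToOwned("AARDVRKS", 'A', 2)); // two A's
        System.out.println(hasOneToOwned("AARDVARK", 'A', 2)); // three A's
        System.out.println(hasOneToOwned("LOVE", 'A', 2));     // no A at all
    }
}
```

With two A tiles, only the first word passes: the second has one A too many and the third has none.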

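When searching for the letters that are missing from a word, I use (?i)([^$($SearchForTiles.Keys -join '')]) followed by one of these blocks for each letter: (?i)(?<=($Letter[^$Letter]*?){$($SearchForTiles[$Letter])})($Letter). All of these chunks are separated by | (or) statements.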

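  • (?i)  forces case-insensitive matching
    • (  starts the group check
    • [^  excludes these characters
    • $($SearchForTiles.Keys -join '')  takes each deduplicated letter in the player's tile set and joins them together
    • ]  ends the character set
    • )  in effect returns every letter that is not in the player's tray
  • |  or statement
  • next comes a set of groups like the following, one for each letter in the player's tray
  • (?i)  forces case-insensitive matching
  • (?<=  starts a lookbehind
  • (  starts the group of characters that must appear in this order
    • $Letter  looks for this letter
    • [^$Letter]  followed by any character that is not this letter
    • *?  zero or more times
    • )  closes the group of characters that must appear in this order
    • {$($SearchForTiles[$Letter])}  this group must occur once for each copy of the tile in the player's tray before we start matching the missing letter
    • )  closes the lookbehind
  • ($Letter)  matches the letter we are looking for; if it matches, the player's tray is missing that letter, so the letter is returned.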
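A possible Java version of the missing-letter regex is sketched below. MissingLetters, build and missing are hypothetical names. The sketch assumes uppercase words and plain-letter tiles, so it drops (?i). It uses {0,100} in place of *? and writes each lookbehind out piece by piece instead of quantifying a group, both to keep the lookbehind length bounded for Java.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class MissingLetters {

    static Pattern build(String tiles) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : tiles.toUpperCase().toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }

        // First alternative: any letter the player does not hold at all.
        StringBuilder owned = new StringBuilder();
        for (char c : counts.keySet()) {
            owned.append(c);
        }
        StringBuilder regex = new StringBuilder("[^" + owned + "]");

        // One alternative per held letter: a copy preceded by as many copies as the player holds.
        for (Map.Entry<Character, Integer> e : counts.entrySet()) {
            char letter = e.getKey();
            regex.append("|(?<=");
            for (int i = 0; i < e.getValue(); i++) {
                regex.append(letter).append("[^").append(letter).append("]{0,100}");
            }
            regex.append(")").append(letter);
        }
        return Pattern.compile(regex.toString());
    }

    // Concatenates every letter of 'word' that the tray cannot supply.
    static String missing(String word, String tiles) {
        Matcher m = build(tiles).matcher(word);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            out.append(m.group());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(missing("AARDVARK", "aardvrk"));
        System.out.println(missing("LOVELY", "OLLVE"));
    }
}
```

For the sample words this should report the same missing letters as the PowerShell output above, for example A for AARDVARK and Y for LOVELY.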
answered 2013-05-12T04:45:35.537
Since the dictionary contains only uppercase letters, the simplest option will be the best:

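import java.util.regex.Pattern;

// Compile once, outside the loop, then test every dictionary entry.
final Pattern p = Pattern.compile("AARDV.RK");
for (String entry : dict)
  if (p.matcher(entry).matches()) return entry;
return null;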

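The wildcard . will match any character at that position, which saves you the slight penalty of any redundant check on that character. Also note that it is very important to compile the regex up front rather than recompiling it for each entry.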
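A self-contained sketch of that approach, starting from the original string with its blank. BlankTileSearch and findMatches are hypothetical names, and the sketch assumes s holds only uppercase letters and spaces, so nothing needs regex escaping. It collects every match instead of returning the first one.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class BlankTileSearch {

    // Returns every dictionary word that fits 's', where each space in 's' is a blank square.
    static Set<String> findMatches(String s, Set<String> dict) {
        // Turn each blank into the wildcard and compile once, before the loop.
        Pattern p = Pattern.compile(s.replace(' ', '.'));
        Set<String> hits = new TreeSet<>();
        for (String entry : dict) {
            if (p.matcher(entry).matches()) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Set.of("AARDVARK", "AARDVARKS", "ANTHILL"));
        System.out.println(findMatches("AARDV RK", dict)); // [AARDVARK]
    }
}
```

matches() requires the whole entry to match, so no ^ or $ anchors are needed and AARDVARKS is rejected.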

answered 2013-05-11T20:39:31.157
Something like this:

^AARDV[A-Z]RK$
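In Java this pattern could be generated from the move string as sketched below. AnchoredClassDemo and toPattern are hypothetical names, and s is assumed to hold only uppercase letters and spaces. Because the pattern carries its own ^ and $ anchors, find() behaves like a whole-word match here.

```java
import java.util.regex.Pattern;

// Hypothetical demo; the names are illustrative only.
final class AnchoredClassDemo {

    // Replaces each blank with [A-Z] and anchors the pattern at both ends.
    static Pattern toPattern(String s) {
        return Pattern.compile("^" + s.replace(" ", "[A-Z]") + "$");
    }

    public static void main(String[] args) {
        Pattern p = toPattern("AARDV RK");
        System.out.println(p.matcher("AARDVARK").find());  // true
        System.out.println(p.matcher("AARDV1RK").find());  // false: 1 is not in A-Z
        System.out.println(p.matcher("AARDVARKS").find()); // false: extra letter before $
    }
}
```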
answered 2013-05-11T20:30:58.613