我认为您可能会花费大部分时间来尝试匹配字母网格不可能构建的单词。所以,我要做的第一件事就是尝试加快这一步,这应该能让你大部分时间到达那里。
为此,我会将网格重新表示为您正在查看的字母转换索引的可能“移动”表。
首先从整个字母表中为每个字母分配一个数字(A=0、B=1、C=2、...等等)。
让我们看这个例子:
h b c d
e e g h
l l k l
m o f p
现在,让我们使用我们拥有的字母的字母表(通常您可能希望每次都使用相同的整个字母表):
b | c | d | e | f | g | h | k | l | m | o | p
---+---+---+---+---+---+---+---+---+---+----+----
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11
然后你创建一个二维布尔数组,告诉你是否有某个字母转换可用:
| 0 1 2 3 4 5 6 7 8 9 10 11 <- from letter
| b c d e f g h k l m o p
-----+--------------------------------------
0 b | T T T T
1 c | T T T T T
2 d | T T T
3 e | T T T T T T T
4 f | T T T T
5 g | T T T T T T T
6 h | T T T T T T T
7 k | T T T T T T T
8 l | T T T T T T T T T
9 m | T T
10 o | T T T T
11 p | T T T
^
to letter
现在浏览您的单词列表并将单词转换为转换:
hello (6, 3, 8, 8, 10):
6 -> 3, 3 -> 8, 8 -> 8, 8 -> 10
然后通过在表中查找这些转换来检查是否允许这些转换:
[6][ 3] : T
[3][ 8] : T
[8][ 8] : T
[8][10] : T
如果它们都被允许,那么就有可能找到这个词。
例如,可以在第 4 次转换(m 到 e:helMEt)时排除单词“helmet”,因为表中的该条目是错误的。
并且可以排除仓鼠这个词,因为不允许第一个(h 到 a)转换(甚至不存在于您的表中)。
现在,对于您没有消除的可能很少的剩余单词,请尝试按照您现在的方式或按照此处其他一些答案中的建议在网格中实际找到它们。这是为了避免由于网格中相同字母之间的跳跃而导致的误报。例如,表格允许使用“帮助”一词,但网格不允许使用“帮助”一词。
关于这个想法的一些进一步的性能改进提示:
不要使用二维数组,而是使用一维数组并简单地自己计算第二个字母的索引。因此,不要像上面那样创建一个 12x12 数组,而是创建一个长度为 144 的一维数组。如果您总是使用相同的字母表(即标准英文字母表的 26x26 = 676x1 数组),即使不是所有字母都显示在您的网格中,您可以将索引预先计算到此一维数组中,您需要对其进行测试以匹配您的字典单词。例如,上面示例中“hello”的索引将是
hello (6, 3, 8, 8, 10):
42 (from 6 + 3x12), 99, 104, 128
-> "hello" will be stored as 42, 99, 104, 128 in the dictionary
将想法扩展到 3D 表(表示为 1D 数组),即所有允许的 3 字母组合。这样,您可以立即消除更多单词,并将每个单词的数组查找次数减少 1:对于“hello”,您只需要 3 个数组查找:hel、ell、llo。顺便说一句,构建这张表会非常快,因为您的网格中只有 400 个可能的 3 字母移动。
预先计算需要包含在表格中的网格中移动的索引。对于上面的示例,您需要将以下条目设置为“True”:
(0,0) (0,1) -> here: h, b : [6][0]
(0,0) (1,0) -> here: h, e : [6][3]
(0,0) (1,1) -> here: h, e : [6][3]
(0,1) (0,0) -> here: b, h : [0][6]
(0,1) (0,2) -> here: b, c : [0][1]
.
:
- 还可以在具有 16 个条目的一维数组中表示您的游戏网格,并在 3 中预先计算表格。包含该数组的索引。
我敢肯定,如果你使用这种方法,你可以让你的代码运行得非常快,前提是你已经预先计算了字典并且已经加载到内存中。
顺便说一句:如果您正在构建游戏,另一件好事是立即在后台运行这些东西。开始生成和解决第一个游戏,而用户仍在查看您应用程序上的标题屏幕并让他的手指就位以按“播放”。然后在用户玩上一个游戏时生成并解决下一个游戏。这应该给你很多时间来运行你的代码。
(我喜欢这个问题,所以我可能会很想在接下来的几天里用 Java 实现我的提议,看看它实际上会如何执行......一旦我这样做了,我会在这里发布代码。)
更新:
好的,我今天有时间用Java实现了这个想法:
class DictionaryEntry {
public int[] letters;
public int[] triplets;
}
class BoggleSolver {
// Constants
final int ALPHABET_SIZE = 5; // up to 2^5 = 32 letters
final int BOARD_SIZE = 4; // 4x4 board
final int[] moves = {-BOARD_SIZE-1, -BOARD_SIZE, -BOARD_SIZE+1,
-1, +1,
+BOARD_SIZE-1, +BOARD_SIZE, +BOARD_SIZE+1};
// Technically constant (calculated here for flexibility, but should be fixed)
DictionaryEntry[] dictionary; // Processed word list
int maxWordLength = 0;
int[] boardTripletIndices; // List of all 3-letter moves in board coordinates
DictionaryEntry[] buildDictionary(String fileName) throws IOException {
BufferedReader fileReader = new BufferedReader(new FileReader(fileName));
String word = fileReader.readLine();
ArrayList<DictionaryEntry> result = new ArrayList<DictionaryEntry>();
while (word!=null) {
if (word.length()>=3) {
word = word.toUpperCase();
if (word.length()>maxWordLength) maxWordLength = word.length();
DictionaryEntry entry = new DictionaryEntry();
entry.letters = new int[word.length() ];
entry.triplets = new int[word.length()-2];
int i=0;
for (char letter: word.toCharArray()) {
entry.letters[i] = (byte) letter - 65; // Convert ASCII to 0..25
if (i>=2)
entry.triplets[i-2] = (((entry.letters[i-2] << ALPHABET_SIZE) +
entry.letters[i-1]) << ALPHABET_SIZE) +
entry.letters[i];
i++;
}
result.add(entry);
}
word = fileReader.readLine();
}
return result.toArray(new DictionaryEntry[result.size()]);
}
boolean isWrap(int a, int b) { // Checks if move a->b wraps board edge (like 3->4)
return Math.abs(a%BOARD_SIZE-b%BOARD_SIZE)>1;
}
int[] buildTripletIndices() {
ArrayList<Integer> result = new ArrayList<Integer>();
for (int a=0; a<BOARD_SIZE*BOARD_SIZE; a++)
for (int bm: moves) {
int b=a+bm;
if ((b>=0) && (b<board.length) && !isWrap(a, b))
for (int cm: moves) {
int c=b+cm;
if ((c>=0) && (c<board.length) && (c!=a) && !isWrap(b, c)) {
result.add(a);
result.add(b);
result.add(c);
}
}
}
int[] result2 = new int[result.size()];
int i=0;
for (Integer r: result) result2[i++] = r;
return result2;
}
// Variables that depend on the actual game layout
int[] board = new int[BOARD_SIZE*BOARD_SIZE]; // Letters in board
boolean[] possibleTriplets = new boolean[1 << (ALPHABET_SIZE*3)];
DictionaryEntry[] candidateWords;
int candidateCount;
int[] usedBoardPositions;
DictionaryEntry[] foundWords;
int foundCount;
void initializeBoard(String[] letters) {
for (int row=0; row<BOARD_SIZE; row++)
for (int col=0; col<BOARD_SIZE; col++)
board[row*BOARD_SIZE + col] = (byte) letters[row].charAt(col) - 65;
}
void setPossibleTriplets() {
Arrays.fill(possibleTriplets, false); // Reset list
int i=0;
while (i<boardTripletIndices.length) {
int triplet = (((board[boardTripletIndices[i++]] << ALPHABET_SIZE) +
board[boardTripletIndices[i++]]) << ALPHABET_SIZE) +
board[boardTripletIndices[i++]];
possibleTriplets[triplet] = true;
}
}
void checkWordTriplets() {
candidateCount = 0;
for (DictionaryEntry entry: dictionary) {
boolean ok = true;
int len = entry.triplets.length;
for (int t=0; (t<len) && ok; t++)
ok = possibleTriplets[entry.triplets[t]];
if (ok) candidateWords[candidateCount++] = entry;
}
}
void checkWords() { // Can probably be optimized a lot
foundCount = 0;
for (int i=0; i<candidateCount; i++) {
DictionaryEntry candidate = candidateWords[i];
for (int j=0; j<board.length; j++)
if (board[j]==candidate.letters[0]) {
usedBoardPositions[0] = j;
if (checkNextLetters(candidate, 1, j)) {
foundWords[foundCount++] = candidate;
break;
}
}
}
}
boolean checkNextLetters(DictionaryEntry candidate, int letter, int pos) {
if (letter==candidate.letters.length) return true;
int match = candidate.letters[letter];
for (int move: moves) {
int next=pos+move;
if ((next>=0) && (next<board.length) && (board[next]==match) && !isWrap(pos, next)) {
boolean ok = true;
for (int i=0; (i<letter) && ok; i++)
ok = usedBoardPositions[i]!=next;
if (ok) {
usedBoardPositions[letter] = next;
if (checkNextLetters(candidate, letter+1, next)) return true;
}
}
}
return false;
}
// Just some helper functions
String formatTime(long start, long end, long repetitions) {
long time = (end-start)/repetitions;
return time/1000000 + "." + (time/100000) % 10 + "" + (time/10000) % 10 + "ms";
}
String getWord(DictionaryEntry entry) {
char[] result = new char[entry.letters.length];
int i=0;
for (int letter: entry.letters)
result[i++] = (char) (letter+97);
return new String(result);
}
void run() throws IOException {
long start = System.nanoTime();
// The following can be pre-computed and should be replaced by constants
dictionary = buildDictionary("C:/TWL06.txt");
boardTripletIndices = buildTripletIndices();
long precomputed = System.nanoTime();
// The following only needs to run once at the beginning of the program
candidateWords = new DictionaryEntry[dictionary.length]; // WAAAY too generous
foundWords = new DictionaryEntry[dictionary.length]; // WAAAY too generous
usedBoardPositions = new int[maxWordLength];
long initialized = System.nanoTime();
for (int n=1; n<=100; n++) {
// The following needs to run again for every new board
initializeBoard(new String[] {"DGHI",
"KLPS",
"YEUT",
"EORN"});
setPossibleTriplets();
checkWordTriplets();
checkWords();
}
long solved = System.nanoTime();
// Print out result and statistics
System.out.println("Precomputation finished in " + formatTime(start, precomputed, 1)+":");
System.out.println(" Words in the dictionary: "+dictionary.length);
System.out.println(" Longest word: "+maxWordLength+" letters");
System.out.println(" Number of triplet-moves: "+boardTripletIndices.length/3);
System.out.println();
System.out.println("Initialization finished in " + formatTime(precomputed, initialized, 1));
System.out.println();
System.out.println("Board solved in "+formatTime(initialized, solved, 100)+":");
System.out.println(" Number of candidates: "+candidateCount);
System.out.println(" Number of actual words: "+foundCount);
System.out.println();
System.out.println("Words found:");
int w=0;
System.out.print(" ");
for (int i=0; i<foundCount; i++) {
System.out.print(getWord(foundWords[i]));
w++;
if (w==10) {
w=0;
System.out.println(); System.out.print(" ");
} else
if (i<foundCount-1) System.out.print(", ");
}
System.out.println();
}
public static void main(String[] args) throws IOException {
new BoggleSolver().run();
}
}
以下是一些结果:
对于原始问题(DGHI ...)中发布的图片中的网格:
Precomputation finished in 239.59ms:
Words in the dictionary: 178590
Longest word: 15 letters
Number of triplet-moves: 408
Initialization finished in 0.22ms
Board solved in 3.70ms:
Number of candidates: 230
Number of actual words: 163
Words found:
eek, eel, eely, eld, elhi, elk, ern, erupt, erupts, euro
eye, eyer, ghi, ghis, glee, gley, glue, gluer, gluey, glut
gluts, hip, hiply, hips, his, hist, kelp, kelps, kep, kepi
kepis, keps, kept, kern, key, kye, lee, lek, lept, leu
ley, lunt, lunts, lure, lush, lust, lustre, lye, nus, nut
nuts, ore, ort, orts, ouph, ouphs, our, oust, out, outre
outs, oyer, pee, per, pert, phi, phis, pis, pish, plus
plush, ply, plyer, psi, pst, pul, pule, puler, pun, punt
punts, pur, pure, puree, purely, pus, push, put, puts, ree
rely, rep, reply, reps, roe, roue, roup, roups, roust, rout
routs, rue, rule, ruly, run, runt, runts, rupee, rush, rust
rut, ruts, ship, shlep, sip, sipe, spue, spun, spur, spurn
spurt, strep, stroy, stun, stupe, sue, suer, sulk, sulker, sulky
sun, sup, supe, super, sure, surely, tree, trek, trey, troupe
troy, true, truly, tule, tun, tup, tups, turn, tush, ups
urn, uts, yeld, yelk, yelp, yelps, yep, yeps, yore, you
your, yourn, yous
对于在原始问题(FXIE ...)中作为示例发布的字母
Precomputation finished in 239.68ms:
Words in the dictionary: 178590
Longest word: 15 letters
Number of triplet-moves: 408
Initialization finished in 0.21ms
Board solved in 3.69ms:
Number of candidates: 87
Number of actual words: 76
Words found:
amble, ambo, ami, amie, asea, awa, awe, awes, awl, axil
axile, axle, boil, bole, box, but, buts, east, elm, emboli
fame, fames, fax, lei, lie, lima, limb, limbo, limbs, lime
limes, lob, lobs, lox, mae, maes, maw, maws, max, maxi
mesa, mew, mewl, mews, mil, mile, milo, mix, oil, ole
sae, saw, sea, seam, semi, sew, stub, swam, swami, tub
tubs, tux, twa, twae, twaes, twas, uts, wae, waes, wamble
wame, wames, was, wast, wax, west
对于以下 5x5 网格:
R P R I T
A H H L N
I E T E P
Z R Y S G
O G W E Y
它给出了这个:
Precomputation finished in 240.39ms:
Words in the dictionary: 178590
Longest word: 15 letters
Number of triplet-moves: 768
Initialization finished in 0.23ms
Board solved in 3.85ms:
Number of candidates: 331
Number of actual words: 240
Words found:
aero, aery, ahi, air, airt, airth, airts, airy, ear, egest
elhi, elint, erg, ergo, ester, eth, ether, eye, eyen, eyer
eyes, eyre, eyrie, gel, gelt, gelts, gen, gent, gentil, gest
geste, get, gets, gey, gor, gore, gory, grey, greyest, greys
gyre, gyri, gyro, hae, haet, haets, hair, hairy, hap, harp
heap, hear, heh, heir, help, helps, hen, hent, hep, her
hero, hes, hest, het, hetero, heth, hets, hey, hie, hilt
hilts, hin, hint, hire, hit, inlet, inlets, ire, leg, leges
legs, lehr, lent, les, lest, let, lethe, lets, ley, leys
lin, line, lines, liney, lint, lit, neg, negs, nest, nester
net, nether, nets, nil, nit, ogre, ore, orgy, ort, orts
pah, pair, par, peg, pegs, peh, pelt, pelter, peltry, pelts
pen, pent, pes, pest, pester, pesty, pet, peter, pets, phi
philter, philtre, phiz, pht, print, pst, rah, rai, rap, raphe
raphes, reap, rear, rei, ret, rete, rets, rhaphe, rhaphes, rhea
ria, rile, riles, riley, rin, rye, ryes, seg, sel, sen
sent, senti, set, sew, spelt, spelter, spent, splent, spline, splint
split, stent, step, stey, stria, striae, sty, stye, tea, tear
teg, tegs, tel, ten, tent, thae, the, their, then, these
thesp, they, thin, thine, thir, thirl, til, tile, tiles, tilt
tilter, tilth, tilts, tin, tine, tines, tirl, trey, treys, trog
try, tye, tyer, tyes, tyre, tyro, west, wester, wry, wryest
wye, wyes, wyte, wytes, yea, yeah, year, yeh, yelp, yelps
yen, yep, yeps, yes, yester, yet, yew, yews, zero, zori
为此,我使用了TWL06 Tournament Scrabble Word List,因为原始问题中的链接不再有效。这个文件是 1.85MB,所以它有点短。该buildDictionary
函数会抛出所有少于 3 个字母的单词。
以下是关于此性能的一些观察:
它比 Victor Nicollet 的 OCaml 实现的报告性能慢了大约 10 倍。这是否是由不同的算法、他使用的较短的字典、他的代码是在 Java 虚拟机中编译和运行的事实,或者是我们计算机的性能(我的是运行 WinXP 的 Intel Q6600 @ 2.4MHz)引起的,我不知道。但它比原始问题末尾引用的其他实现的结果要快得多。所以,这个算法是否优于 trie 字典,我目前不知道。
中使用的表格方法可以checkWordTriplets()
很好地近似实际答案。它通过的 3-5 个单词中只有 1 个不会通过checkWords()
测试(请参阅上面的候选人数与实际单词数)。
上面看不到的东西:该checkWordTriplets()
函数大约需要 3.65 毫秒,因此在搜索过程中占主导地位。该checkWords()
函数几乎占用了剩余的 0.05-0.20 毫秒。
该checkWordTriplets()
函数的执行时间与字典大小成线性关系,几乎与棋盘大小无关!
的执行时间checkWords()
取决于板子大小和不排除的字数checkWordTriplets()
。
上面的checkWords()
实现是我想出的最愚蠢的第一个版本。它基本上根本没有优化。但相比之下checkWordTriplets()
它与应用程序的整体性能无关,所以我并不担心。但是,如果电路板尺寸变大,此功能将变得越来越慢,最终开始变得重要。然后,它也需要优化。
这段代码的一个好处是它的灵活性:
- 您可以轻松更改电路板大小:更新第 10 行并将字符串数组传递给
initializeBoard()
.
- 它可以支持更大/不同的字母,并且可以处理诸如将“Qu”视为一个字母之类的事情,而不会产生任何性能开销。为此,需要更新第 9 行以及将字符转换为数字的几个位置(目前只需从 ASCII 值中减去 65)
好的,但我认为现在这篇文章已经足够长了。我绝对可以回答您可能遇到的任何问题,但让我们将其移至评论中。