c - Finding words in a dictionary, C programming

Question

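I have a dictionary of words in a text file, and I need to find certain words in it: for example, words made up only of the letters {q, a, z, w, s, x, e, d, c, r, f, v, t, g, b}, or words that end in {d, o, u, s}. I'm looking for a way to do this. Is it easiest to put all the words into an array, or should I keep everything in the text file? I tried the text-file approach but got stuck. This is what I have. Thanks a lot!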

 int size, count;

 char *p;
 char *words[];

 FILE * dict_file;

 dict_file = fopen("MyDictionary.txt", "r");

fseek(dict_file, 0, SEEK_END); // seek to end of file
size = ftell(dict_file); // get current file pointer
fseek(dict_file, 0, SEEK_SET); // seek back to beginning of file
// proceed with allocating memory and reading the file


p = dictionary;
while (p = fgets(p, size, dict_file))
{
   p += strlen(p);

   words[count] = p;

   count++;
}

score 1 · Accepted Answer

Obviously, this is wrong:

FILE * dict_file;
fseek(dict_file, 0, SEEK_END); // seek to end of file
size = ftell(dict_file); // get current file pointer
fseek(dict_file, 0, SEEK_SET); // seek back to beginning of file
// proceed with allocating memory and reading the file
dict_file = fopen("MyDictionary.txt", "r");

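You can't (correctly) use a file before you have opened it, so the three calls in the middle operate on an uninitialized FILE pointer and will give unpredictable results. Most likely size ends up negative or zero, and either value may upset the following fgets call.

This isn't shown in your code, but I hope you are calling malloc() or something similar to set up dictionary?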

p = dictionary;
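
As a rough sketch of the corrected order (open first, check the result, then measure and allocate), something like the following might work. The helper name read_dictionary and its size_out parameter are made up for illustration, and treating the ftell value as a byte count assumes a POSIX-style system where text and binary streams behave the same.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a freshly
   malloc'd, '\0'-terminated buffer. Returns NULL on any failure.
   The caller is responsible for calling free() on the result. */
char *read_dictionary(const char *path, long *size_out)
{
    FILE *f = fopen(path, "r");        /* open BEFORE any fseek/ftell */
    if (f == NULL)
        return NULL;

    char *buf = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);          /* -1 on error */
        if (size >= 0 && fseek(f, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)size + 1);   /* +1 for the '\0' */
            if (buf != NULL) {
                size_t got = fread(buf, 1, (size_t)size, f);
                buf[got] = '\0';
                if (size_out != NULL)
                    *size_out = (long)got;
            }
        }
    }
    fclose(f);
    return buf;
}
```

Reading the file with a single fread avoids the pointer bookkeeping around repeated fgets calls; the buffer can then be split into words afterwards.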

While you are fixing the above, you may also want to replace this:

  while (*p != '\0')
  {
        p += 1;
  }

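with:

  p += strlen(p) + 1;

[The + 1 steps past the terminating '\0', so each string stays terminated inside the buffer. With plain p += strlen(p), the next fgets call overwrites the '\0' and the lines run together, separated only by their newlines.]

Now, having said that, I'd probably go with an array of pointers to each string rather than storing everything in one huge single string. That way you can simply step from one string to the next. You can still use your long buffer as above, but with an auxiliary variable holding pointers to the start of each string [and keep the '\0' terminators, so keep the + 1 above]. Note that each pointer has to be stored before p is advanced, otherwise it points at the next string rather than the one just read.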
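
A minimal sketch of the pointer-array idea, assuming the whole file is already in one writable, '\0'-terminated buffer with one word per line. split_words is a hypothetical name; it overwrites each line ending with '\0' and records where each word starts, growing the array with realloc as needed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `buf` in place on line endings and returns
   a malloc'd array of pointers to the start of each non-empty line.
   The pointers refer into `buf`, so `buf` must outlive the array. */
char **split_words(char *buf, size_t *count_out)
{
    size_t cap = 16, count = 0;
    char **words = malloc(cap * sizeof *words);
    if (words == NULL)
        return NULL;

    char *p = buf;
    while (*p != '\0') {
        size_t len = strcspn(p, "\r\n");   /* length of this line */
        char *next = p + len;
        if (*next != '\0')
            *next++ = '\0';                /* terminate, step past separator */
        if (len > 0) {                     /* skip empty lines */
            if (count == cap) {
                char **tmp = realloc(words, 2 * cap * sizeof *words);
                if (tmp == NULL) {
                    free(words);
                    return NULL;
                }
                words = tmp;
                cap *= 2;
            }
            words[count++] = p;            /* store BEFORE advancing p */
        }
        p = next;
    }
    *count_out = count;
    return words;
}
```

After the call, words[0] through words[count - 1] can be walked with an ordinary for loop, and only the array itself (not the individual words) needs to be freed.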

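Then I'd write one function that answers "is this string made up of these letters" and another that answers "does this string end with one of these letters". Both should be fairly trivial if you have some idea of how string handling is generally done.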
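
The two checks could be sketched as below. The function names made_of and ends_with_any are assumptions for illustration; both rely only on the standard strspn, strlen and strchr, and both treat the empty string as a non-match.

```c
#include <stdbool.h>
#include <string.h>

/* True if `word` is non-empty and every character of it appears in
   `letters`. strspn returns the length of the leading run of characters
   from `letters`; the word qualifies if that run reaches the '\0'. */
bool made_of(const char *word, const char *letters)
{
    return word[0] != '\0' && word[strspn(word, letters)] == '\0';
}

/* True if the last character of `word` is one of `endings`. */
bool ends_with_any(const char *word, const char *endings)
{
    size_t len = strlen(word);
    return len > 0 && strchr(endings, word[len - 1]) != NULL;
}
```

With the letter sets from the question, made_of(word, "qazwsxedcrfvtgb") and ends_with_any(word, "dous") would be the two calls to run on each dictionary word.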

score 0 · Accepted Answer

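If you are on a POSIX-compliant system, you may want to take a look at <regex.h>

That way you can search for your words with regular expressions. I'd guess something like:

"([qazwsxedcrfvtgb]+)[^[:alpha:]]"
and "([[:alpha:]]*[dous])[^[:alpha:]]"

in your case, but you should make sure to adapt them to your specific needs.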

   int regcomp(regex_t *preg, const char *regex, int cflags);

   int regexec(const regex_t *preg, const char *string, size_t nmatch,
               regmatch_t pmatch[], int eflags);

   void regfree(regex_t *preg);

would be the functions to look at.

You could use something like:

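/* needs <regex.h> and <stdlib.h>; p points at the text to search */
regex_t regex;
regmatch_t *match;

char *pos = p;
size_t n_matches;

/* regcomp returns nonzero if the pattern cannot be compiled */
if (regcomp (&regex, "your-regular-expression", REG_EXTENDED) == 0) {
  n_matches = regex.re_nsub + 1;   /* whole match plus one per subpattern */
  match = malloc (n_matches * sizeof (regmatch_t));

  /* regexec returns 0 when it finds a match */
  while (match != NULL && !regexec (&regex, pos, n_matches, match, 0)) {
    /* extract key and value from subpatterns
       available in match[i] for i-th submatch
       ... */

    if (match[0].rm_eo == 0)
      break;                       /* empty match: avoid looping forever */
    pos += match[0].rm_eo;         /* continue after this match */
  }

  regfree (&regex);
  free (match);
}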
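
If the dictionary has already been split into one word per string, a simpler variant may be enough: anchor the pattern with ^ and $ and ask only whether the word matches. The sketch below assumes that setup; word_matches is a hypothetical helper name, and REG_NOSUB tells regcomp that no submatch positions are needed.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `word` matches the POSIX extended regular
   expression `pattern`; false on no match or if the pattern is invalid. */
bool word_matches(const char *word, const char *pattern)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    bool ok = regexec(&re, word, 0, NULL, 0) == 0;   /* 0 means match */
    regfree(&re);
    return ok;
}
```

For the question's two cases the patterns would be "^[qazwsxedcrfvtgb]+$" and "[dous]$". Compiling the pattern on every call is wasteful for a large dictionary; in real use it would make sense to call regcomp once and reuse the compiled regex_t for every word.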
