c - chars written to a char* array somehow get mutated

Question

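This should be a simple function (it counts the number of unique characters in a string), but I've run into a strange problem. Note that my code only needs to handle the ASCII letters a-z and A-Z.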

int unique_chars(char* my_str) {
//printf("starting unique_chars\n");
  char seen_buffer[52]; // max 52 letters a-z & A-Z
  int seen_count = 1; // not ever expecting my_str to be NULL 
  int i, j;
  char next;
//printf("first char is %c\n", my_str[0]);
  seen_buffer[0] = my_str[0]; // first char must be unique

  for (i=1; i<strlen(my_str); i++) { // walk along the rest of my_str
    next = my_str[i];

    if (next >= 97) {
      next = next - 32; // the next char will always be capital, for convenience
    }

    for (j=0; j<seen_count; j++) { // compare next to all the unique chars seen before
//printf("current char is %c, checking against %c\n", next, seen_buffer[j]);
      if ((next==seen_buffer[j]) || (next+32==seen_buffer[j])) {
//printf("breaking\n");
        break; // jump to the next char in my_str if we find a match
      }
      if (j==seen_count-1) { // at this point, we're sure that next hasn't been seen yet
//printf("new unique char is %c\n", next);
        seen_count++;
        seen_buffer[seen_count] = next;
//printf("new char val is %c, should be %c\n", seen_buffer[seen_count], next);
        break;
      }
    }
  }
  return seen_count;
}

int main(int argc, char* argv[]){
  char* to_encode = argv[1];
  printf("unique chars: %d\n", unique_chars(to_encode));
}

When I call it with certain strings, I get incorrect results. For example, running:

./a.out gghhiijj

produces (with the printf calls uncommented):

starting unique_chars
first char is g
current char is G, checking against g
breaking
current char is H, checking against g
new unique char is H
new char val is H, should be H
current char is H, checking against g
current char is H, checking against 
new unique char is H
new char val is H, should be H
current char is I, checking against g
current char is I, checking against 
current char is I, checking against H
new unique char is I
new char val is I, should be I
current char is I, checking against g
current char is I, checking against 
current char is I, checking against H
current char is I, checking against H
new unique char is I
new char val is I, should be I
current char is J, checking against g
current char is J, checking against 
current char is J, checking against H
current char is J, checking against H
current char is J, checking against I
new unique char is J
new char val is J, should be J
current char is J, checking against g
current char is J, checking against 
current char is J, checking against H
current char is J, checking against H
current char is J, checking against I
current char is J, checking against I
new unique char is J
new char val is J, should be J

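So I keep getting duplicates in my seen_buffer, because some blank character is stored there instead of the letter that should be there! Yet when I compare right after writing to seen_buffer (i.e. the "new char val is %c, should be %c\n" line), the correct character is displayed!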

Any help is appreciated!

score 1 · Accepted Answer

You have a bug here:

    seen_count++;
    seen_buffer[seen_count] = next;

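The first character goes into seen_buffer[0] and seen_count is set to 1. This means the next new character that comes in goes into seen_buffer[2], because seen_count is incremented to 2 before the write. Nothing ever goes into seen_buffer[1] (that's the blank character you keep seeing in the printfs), and when checking characters, you never check the last one you just put into seen_buffer.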

Swap those two lines and it should work.
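A minimal sketch of the question's function with those two lines swapped. The `#include` lines, the `const` qualifier, and the `size_t` index are additions that the posted code did not have; the rest is kept as posted, including its assumption that the string is non-empty and contains only a-z and A-Z.

```c
#include <stdio.h>
#include <string.h>

/* Counts distinct letters in my_str, treating upper and lower case as equal.
   Assumes (as the question does) a non-empty string of ASCII letters only. */
int unique_chars(const char *my_str) {
  char seen_buffer[52]; /* max 52 letters a-z & A-Z */
  int seen_count = 1;
  size_t i;
  int j;
  char next;

  seen_buffer[0] = my_str[0]; /* first char must be unique */

  for (i = 1; i < strlen(my_str); i++) {
    next = my_str[i];

    if (next >= 97) {
      next = next - 32; /* fold lower case to upper case */
    }

    for (j = 0; j < seen_count; j++) {
      if ((next == seen_buffer[j]) || (next + 32 == seen_buffer[j])) {
        break; /* already seen */
      }
      if (j == seen_count - 1) {
        seen_buffer[seen_count] = next; /* store at the first free slot ... */
        seen_count++;                   /* ... and only then advance the count */
        break;
      }
    }
  }
  return seen_count;
}
```

With this ordering, slot 1 is filled before slot 2, so the inner loop compares against every stored character and `gghhiijj` yields 4.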

score 0 · Accepted Answer

You are not checking whether the input values are in the ranges A to Z and a to z.

Also, your current code may overflow the buffer at char seen_buffer[52], so add bounds checking to the code.
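A rough sketch of what those two checks might look like when bolted onto the seen-buffer approach. The function name `unique_letters_checked`, the `MAX_SEEN` macro, and the decision to silently skip non-letters and return 0 for NULL are all assumptions for illustration; the question does not say how such input should be handled.

```c
#include <stddef.h>

#define MAX_SEEN 52 /* hypothetical name for the buffer capacity */

/* Counts distinct ASCII letters, case-insensitively, skipping other bytes. */
int unique_letters_checked(const char *s) {
  char seen[MAX_SEEN];
  int seen_count = 0;

  if (s == NULL) {
    return 0; /* assumption: treat a missing string as having no letters */
  }

  for (size_t i = 0; s[i] != '\0'; i++) {
    char c = s[i];

    /* range check: only A-Z and a-z are considered */
    if (c >= 'a' && c <= 'z') {
      c = (char)(c - 'a' + 'A'); /* fold to upper case */
    } else if (!(c >= 'A' && c <= 'Z')) {
      continue;
    }

    int found = 0;
    for (int j = 0; j < seen_count; j++) {
      if (seen[j] == c) {
        found = 1;
        break;
      }
    }

    /* bounds check: never write past the end of seen[] */
    if (!found && seen_count < MAX_SEEN) {
      seen[seen_count] = c;
      seen_count++;
    }
  }
  return seen_count;
}
```

Because case is folded before storing, at most 26 slots are ever used here; the explicit bound is kept only to show where such a guard would go.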

I think your code can be much simpler.

Here is a simple algorithm for you:

    unsigned int returnUniqueChar(const char *input)
    {
        int count[52] = {0}; // slots 0-25 hold 'A'-'Z', slots 26-51 hold 'a'-'z'
        unsigned int unique = 0;

        while (*input != '\0')
        {
            if (*input >= 'A' && *input <= 'Z')
            {
                count[*input - 'A']++;
            }
            else if (*input >= 'a' && *input <= 'z')
            {
                count[*input - 'a' + 26]++;
            }
            input++; // any other character is ignored
        }

        // count every letter that occurred at least once
        for (int i = 0; i < 52; i++)
        {
            if (count[i] > 0)
                unique++;
        }

        return unique;
    }
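Note that a 52-slot table counts 'g' and 'G' as two different characters, whereas the question's function folds case. If case-insensitive counting is what is wanted, a 26-slot variant along these lines might do; `unique_letters_folded` is a hypothetical name, and skipping non-letters is an assumption.

```c
#include <stdbool.h>

/* Counts distinct ASCII letters, treating 'a' and 'A' as the same letter. */
int unique_letters_folded(const char *input) {
  bool seen[26] = {false}; /* one flag per letter, case folded */
  int unique = 0;

  for (; *input != '\0'; input++) {
    int idx;

    if (*input >= 'A' && *input <= 'Z') {
      idx = *input - 'A';
    } else if (*input >= 'a' && *input <= 'z') {
      idx = *input - 'a';
    } else {
      continue; /* not a letter: ignore */
    }

    if (!seen[idx]) { /* first time this letter appears */
      seen[idx] = true;
      unique++;
    }
  }
  return unique;
}
```

A flag per letter removes the inner search loop entirely, so there is no buffer index to get out of step.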
