c - Removing special characters from an fscanf string in C

Question

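I am currently using the following code to scan each word in a text file, put it into a variable, do some manipulation on it, and then move on to the next word. This works fine, but I am trying to remove all characters that do not fall under A-Z / a-z. For example, if "he5llo" was entered, I want the output to be "hello". If I cannot modify fscanf to do this, is there a way to do it to the variable once it has been scanned? Thanks.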

while (fscanf(inputFile, "%s", x) == 1)

score 3 · Accepted Answer

You can pass x to a function like this. First, a simple version for ease of understanding:

// header needed for isalpha()
#include <ctype.h>

void condense_alpha_str(char *str) {
  int source = 0; // index of copy source
  int dest = 0; // index of copy destination

  // loop until original end of str reached
  while (str[source] != '\0') {
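    if (isalpha((unsigned char)str[source])) {
      // keep only chars matching isalpha(); the cast avoids undefined
      // behavior when char is signed and the value is negative
      str[dest] = str[source];
      ++dest;
    }
    ++source; // advance source always, whether char was copied or not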
  }
  str[dest] = '\0'; // add new terminating 0 byte, in case string got shorter
}

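It iterates over the string in place, copying the characters that pass the isalpha() test and skipping, and thereby removing, those that do not. To understand the code, it is important to realize that C strings are just char arrays, with a byte value of 0 marking the end of the string. Another important detail is that in C, arrays and pointers are in many (not all!) ways the same thing, so a pointer can be indexed just like an array. Also, this simple version rewrites every byte in the string, even when the string does not actually change.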
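To illustrate the point about pointers and arrays, here is a minimal sketch of the same in-place filter written with pointers instead of indexes. The name condense_alpha_ptr is hypothetical and not part of the original answer; the behavior is intended to match the simple version above.

```c
#include <ctype.h>

// Hypothetical pointer-based variant of the simple in-place filter.
void condense_alpha_ptr(char *str) {
  char *dest = str;                      // where the next kept char is written
  for (const char *src = str; *src != '\0'; ++src) {
    if (isalpha((unsigned char)*src)) {  // cast avoids UB for negative char values
      *dest = *src;
      ++dest;
    }
  }
  *dest = '\0';                          // terminate the possibly shorter string
}
```

Calling it on a writable buffer holding "he5llo" should leave "hello" in that buffer.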

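Then a more full-featured version, which uses a filter function passed as a parameter, performs memory writes only if str changes, and returns a pointer to str like most library string functions do: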

char *condense_str(char *str, int (*filter)(int)) {

  int source = 0; // index of character to copy

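  // optimization: skip initial matching chars, stopping at end of string
  // (a filter such as isNotAlpha would otherwise accept '\0' and run past it)
  while (str[source] != '\0' && filter((unsigned char)str[source])) {
    ++source;
  }
  // source is now index of first non-matching char or end-of-string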

  // optimization: only do condense loop if not at end of str yet
  if (str[source]) { // '\0' is same as false in C

    // start condensing the string from first non-matching char
    int dest = source; // index of copy destination
    do {
      if (filter((unsigned char)str[source])) {
        // keep only chars matching given filter function
        str[dest] = str[source];
        ++dest;
      }
      ++source; // advance source always, whether char was copied or not
    } while (str[source]);
    str[dest] = '\0'; // add terminating 0 byte to match condensed string

  }

  // follow convention of strcpy, strcat etc, and return the string
  return str;
}

Example filter function:

// takes int so that it matches the int (*filter)(int) parameter of condense_str
int isNotAlpha(int ch) {
    return !isalpha(ch);
}

Example calls:

char sample[] = "1234abc";
condense_str(sample, isalpha); // use a library function from ctype.h
// note: return value ignored, it's just convenience not needed here
// sample is now "abc"
condense_str(sample, isNotAlpha); // use custom function
// sample is now "", empty

// fscanf code from question, with buffer overrun prevention
char x[100];
while (fscanf(inputFile, "%99s", x) == 1) {
  condense_str(x, isalpha); // x modified in-place
  ...
}

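Reference:

From the manual for int isalpha ( int c );:

Checks whether c is an alphabetic letter.
Return value:
A value different from zero (i.e., true) if c is indeed an alphabetic letter. Zero (i.e., false) otherwise.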
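Because isalpha() depends on the current locale, it may accept letters outside A-Z / a-z in some locales. If only those 52 letters should be kept, as the question asks, a custom filter with the same int (*)(int) shape could be used instead. The following is a minimal sketch; the name is_ascii_letter is hypothetical, and it assumes an ASCII-compatible character set in which each letter range is contiguous.

```c
// Hypothetical filter: nonzero only for 'A'..'Z' and 'a'..'z'.
// Assumes an ASCII-compatible character set (contiguous letter ranges).
int is_ascii_letter(int ch) {
  return (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z');
}
```

It takes and returns int, so it could be passed wherever a filter function of that signature is expected.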

score 1 · Accepted Answer

luser droog's answer will work, but in my opinion it is more complicated than necessary.

For your simple example, you could try this:

while (fscanf(inputFile, "%[A-Za-z]", x) == 1) {   // read until a non-alpha character is found
   fscanf(inputFile, "%*[^A-Za-z]");  // discard non-alpha characters and continue
}
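One thing worth knowing about this approach: a scanset reads a single run of matching characters, so "he5llo" would arrive as two pieces, "he" and then "llo", rather than as the joined word "hello". The sketch below shows this on an in-memory string with sscanf; the helper name leading_letters and the 100-byte buffer size are assumptions made for illustration.

```c
#include <stdio.h>

// Hypothetical helper: copy the leading run of letters from `in` into `out`.
// `out` must have room for at least 100 bytes (99 letters plus '\0').
// Returns 1 if at least one letter was matched, 0 otherwise.
int leading_letters(const char *in, char *out) {
  out[0] = '\0';                                // defined content even on failure
  return sscanf(in, "%99[A-Za-z]", out) == 1;   // width 99 prevents overrun
}
```

The explicit field width matters for the same reason as in the first answer: without it, a long run of letters could overflow the destination buffer.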

score 0 · Accepted Answer


You can use the isalpha() function to check every character contained in the string.
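As a minimal sketch of that check (the function name is_all_alpha is hypothetical, not from the answer), a loop can test each character and report whether the word needs cleaning at all:

```c
#include <ctype.h>

// Hypothetical helper: returns 1 if every character of str is a letter, else 0.
// An empty string returns 1, since it contains no non-letter character.
int is_all_alpha(const char *str) {
  for (; *str != '\0'; ++str) {
    if (!isalpha((unsigned char)*str)) {  // cast avoids UB for negative char values
      return 0;
    }
  }
  return 1;
}
```

Note that this only detects non-letters; removing them still requires copying or shifting the remaining characters.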

answered 2013-04-07T16:55:09.160

score 0 · Accepted Answer

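~~The scanf family of functions won't do this. You have to loop over the string and use isalpha to check each character, and "remove" characters with memmove by copying the end of the string forward.~~

Maybe scanf can do it after all. In most cases, scanf and friends push any non-whitespace characters back onto the input stream if the match fails.

This example uses scanf as a regular-expression filter on the stream. Using the * conversion modifier means that the negated pattern has no storage destination; it is simply eaten.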

#include <stdio.h>
#include <string.h>

int main(void){
    enum { BUF_SZ = 80 };   // buffer size in one place
    char buf[BUF_SZ] = "";
    char fmtfmt[] = "%%%d[A-Za-z]";  // format string for the format string
    char fmt[sizeof(fmtfmt) + 3];    // storage for the real format string
    char nfmt[] = "%*[^A-Za-z]";     // negated pattern

    char *p = buf;                               // initialize the pointer
    // the field width leaves one byte for the terminating '\0'
    sprintf(fmt, fmtfmt, (int)(BUF_SZ - 1 - strlen(buf)));  // initialize the format string
    //printf("%s",fmt);
    while( scanf(fmt,p) != EOF                   // scan for format into buffer via pointer
        && scanf(nfmt) != EOF){                  // scan for negated format
        p += strlen(p);                          // adjust pointer
        if (strlen(buf) >= BUF_SZ - 1) break;    // buffer full; a zero field width is invalid
        sprintf(fmt, fmtfmt, (int)(BUF_SZ - 1 - strlen(buf)));   // adjust format string (re-init)
    }
    printf("%s\n",buf);
    return 0;
}
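To make the effect of the suppressed, negated scanset easier to see in isolation, here is a minimal sketch that applies it to an in-memory string with sscanf and uses %n to report how many characters were consumed. The helper name count_skipped is hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: how many leading non-letter characters
// would "%*[^A-Za-z]" consume from `in`?
int count_skipped(const char *in) {
  int n = 0;  // stays 0 if the scanset matches nothing or the input is empty
  // %*[...] discards the matched run; %n stores the count of characters read so far
  (void)sscanf(in, "%*[^A-Za-z]%n", &n);
  return n;
}
```

The return value of sscanf is deliberately ignored here, because neither a suppressed conversion nor %n counts as an assignment.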

score 0 · Accepted Answer

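I'm working on a similar project, so you're in good hands! Split the word into separate parts.

Whitespace is not a problem when you read one word at a time with cin. You can use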

 if( !ispunct(x) )

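to increase the index by 1 and add that new string to a temporary string holder. You can select characters in a string just as you would in an array, so finding the non-alphabetic characters and storing the new string is easy.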

 string x = "hell5o"     // loop through until you find a non-alpha & mark that pos
 for( i = 0; i <= pos-1; i++ )
                                    // store the different parts of the string
 string tempLeft = ...    // make loops up to and after the position of non-alpha character
 string tempRight = ...
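Since the question is about C rather than C++, the same left-part/right-part idea can be sketched in C by shifting the right part over the unwanted character instead of building two temporary strings. This is only an illustration under that assumption; the helper name remove_char_at is hypothetical.

```c
#include <string.h>

// Hypothetical helper: delete the character at index pos, closing the gap.
// Does nothing if pos is at or beyond the end of the string.
void remove_char_at(char *s, size_t pos) {
  size_t len = strlen(s);
  if (pos >= len) {
    return;
  }
  // Move the right part, including its terminating '\0', one byte to the left.
  // memmove is used because the source and destination overlap.
  memmove(s + pos, s + pos + 1, len - pos);
}
```

With a writable buffer holding "hell5o", removing index 4 should leave "hello".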
