c++ - Distinguishing between failure and end of file in a read loop

Question

The idiomatic loop for reading from an istream is

while (thestream >> value)
{
  // do something with value
}

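This loop has a problem: it does not distinguish whether the loop terminated because of end of file or because of an error. For example, take the following test program: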

#include <iostream>
#include <sstream>
#include <string>

void readbools(std::istream& is)
{
  bool b;
  while (is >> b)
  {
    std::cout << (b ? "T" : "F");
  }
  std::cout << " - " << is.good() << is.eof() << is.fail() << is.bad() << "\n";
}

void testread(std::string s)
{
  std::istringstream is(s);
  is >> std::boolalpha;
  readbools(is);
}

int main()
{
  testread("true false");
  testread("true false tr");
}

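The first call to testread contains two valid bools, so it is not an error. The second call ends with a third, incomplete bool, so it is an error. Nevertheless, both behave the same. In the first case, reading the bool fails because there is none; in the second case it fails because the value is incomplete; and in both cases EOF is hit. Indeed, the program above outputs the same line twice: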

TF - 0110
TF - 0110

To solve this problem, I came up with the following solution:

while (thestream >> std::ws && !thestream.eof() && thestream >> value)
{
  // do something with value
}

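The idea is to detect regular EOF before actually trying to extract the value. Because there might be whitespace at the end of the file (which would not be an error, but would cause the read of the last item not to hit EOF), I first discard any whitespace (which cannot fail) and then test for EOF. Only if I am not at the end of the file do I try to read the value.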

For my example program it does seem to work, and I get

TF - 0100
TF - 0110

So in the first case (correct input), fail() returns false.

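Now my question: is this solution guaranteed to work, or was I just (un)lucky that it happened to give the desired result? Also: is there a simpler (or, if my solution is wrong, a correct) way to get the desired result?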

score 8 · Accepted Answer

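It is very easy to differentiate between EOF and other errors, as long as you do not configure the stream to use exceptions.

Simply check stream.eof() at the end.

Before that, only check for failure/non-failure, e.g. stream.fail() or !stream. Note that good is not the opposite of fail: good() is also false when only the EOF flag is set. So in general never even look at good, only at fail.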
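The advice above can be sketched as follows. This is only a minimal illustration, not part of the original answer: the helper names drain_ints, classify, Flags and flags_after_one_int are hypothetical, and int is used instead of bool to keep the sketch short. The loop tests only failure/non-failure, and the flags are inspected once the loop has ended.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads whitespace-separated ints until extraction
// fails, then reports why the loop stopped.
std::string drain_ints(std::istream& in)
{
    int value;
    while (in >> value)          // tests !in.fail(), never in.good()
    {
        // ... use value ...
    }
    if (in.bad()) return "io-error";       // unrecoverable stream error
    if (in.eof()) return "end-of-input";   // ran out of characters
    return "format-error";                 // failbit only: bad token mid-stream
}

// Hypothetical convenience wrapper for trying the helper on a string.
std::string classify(const std::string& text)
{
    std::istringstream in(text);
    return drain_ints(in);
}

// Snapshot of the state flags after a single extraction.
struct Flags { bool good; bool eof; bool fail; };

// Hypothetical helper showing that good() is not the opposite of fail().
Flags flags_after_one_int(const std::string& text)
{
    std::istringstream in(text);
    int value = 0;
    in >> value;
    return Flags{ in.good(), in.eof(), in.fail() };
}
```

With these helpers, classify("1 2 3") reports end-of-input and classify("1 2 x 4") reports format-error, while flags_after_one_int("42") has eof set with neither good nor fail set. Note the limitation: a token that is truncated at the very end of the input still sets both the EOF and fail flags, which is exactly the case raised in the question, so this check alone does not catch it.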

Edit:

Some example code, namely your example modified to detect a malformed bool specification in the data:

#include <iostream>
#include <sstream>
#include <string>
#include <stdexcept>
using namespace std;

bool throwX( string const& s )  { throw runtime_error( s ); }
bool hopefully( bool v )        { return v; }

bool boolFrom( string const& s )
{
    istringstream stream( s );
    (stream >> boolalpha)
        || throwX( "boolFrom: failed to set boolalpha mode." );

    bool result;
    (stream >> result)
        || throwX( "boolFrom: failed to extract 'bool' value." );
        
    char c;  stream >> c;
    hopefully( stream.eof() )
        || throwX( "boolFrom: found extra characters at end." );
    
    return result;
}

void readbools( istream& is )
{
    string word;
    while( is >> word )
    {
        try
        {
            bool const b = boolFrom( word );
            cout << (b ? "T" : "F") << endl;
        }
        catch( exception const& x )
        {
            cerr << "!" << x.what() << endl;
        }
    }
    cout << "- " << is.good() << is.eof() << is.fail() << is.bad() << "\n";
}

void testread( string const& s )
{
    istringstream is( s );
    readbools( is );
}

int main()
{
  cout << string( 60, '-' ) << endl;
  testread( "true false" );

  cout << string( 60, '-' ) << endl;
  testread( "true false tr" );

  cout << string( 60, '-' ) << endl;
  testread( "true false truex" );
}

Example result:

------------------------------------------------------------
T
F
- 0110
------------------------------------------------------------
T
F
!boolFrom: failed to extract 'bool' value.
- 0110
------------------------------------------------------------
T
F
!boolFrom: found extra characters at end.
- 0110

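Edit 2: added an example of using the eof() check to the posted code and results, which I had forgotten.

Edit 3: the following corresponding example uses the skip-whitespace-before-reading solution proposed by the OP: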

#include <iostream>
#include <sstream>
#include <string>
using namespace std;

void readbools( istream& is )
{
    bool b;
    while( is >> ws && !is.eof() && is >> b )       // <- Proposed scheme.
    {
        cout << (b ? "T" : "F") << endl;
    }
    if( is.fail() )
    {
        cerr << "!readbools: failed to extract 'bool' value." << endl;
    }
    cout << "- " << is.good() << is.eof() << is.fail() << is.bad() << "\n";
}

void testread( string const& s )
{
    istringstream is( s );
    is >> boolalpha;
    readbools( is );
}

int main()
{
  cout << string( 60, '-' ) << endl;
  testread( "true false" );

  cout << string( 60, '-' ) << endl;
  testread( "true false tr" );

  cout << string( 60, '-' ) << endl;
  testread( "true false truex" );
}

Example result:

------------------------------------------------------------
T
F
- 0100
------------------------------------------------------------
T
F
!readbools: failed to extract 'bool' value.
- 0110
------------------------------------------------------------
T
F
T
!readbools: failed to extract 'bool' value.
- 0010

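The main difference is that this approach produces 3 successfully read values in the third case, even though the third value is incorrectly specified (as "truex").

I.e., it fails to recognize an incorrect specification as such.

Of course, my ability to write Code That Does Not Work™ is no proof that it cannot work. But I am fairly good at coding things up, and I could not see any way to detect "truex" as incorrect with this approach (while it was easy to do with the exception-based approach that reads whole words). So, at least for me, the exception-based read-words approach is simpler, in the sense that it is easy to make it behave correctly.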
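For readers who would rather avoid exceptions, the same read-whole-words idea can be sketched with std::optional. This is a variant for illustration only, not the answer's code: it assumes C++17, and the names parse_bool_token, ReadResult, read_bools and read_bools_from are hypothetical.

```cpp
#include <ios>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts one whitespace-free token to bool.
// Returns std::nullopt unless the whole token is consumed by the conversion.
std::optional<bool> parse_bool_token(const std::string& token)
{
    std::istringstream in(token);
    bool result = false;
    if (!(in >> std::boolalpha >> result))
        return std::nullopt;           // e.g. "tr": not a complete bool
    char extra;
    if (in >> extra)
        return std::nullopt;           // e.g. "truex": trailing characters
    return result;
}

// Outcome of reading a whole stream of bools.
struct ReadResult
{
    std::vector<bool> values;          // values read before any error
    std::string bad_token;             // offending token, if any
    bool ok = true;                    // false on a bad token or I/O error
};

// Reads words first, then validates each word as a bool.
ReadResult read_bools(std::istream& is)
{
    ReadResult r;
    std::string word;
    while (is >> word)
    {
        const std::optional<bool> b = parse_bool_token(word);
        if (!b)
        {
            r.bad_token = word;
            r.ok = false;
            return r;
        }
        r.values.push_back(*b);
    }
    r.ok = !is.bad();                  // loop ended at EOF unless badbit is set
    return r;
}

// Hypothetical convenience wrapper for trying the sketch on a string.
ReadResult read_bools_from(const std::string& text)
{
    std::istringstream is(text);
    return read_bools(is);
}
```

Because each word is extracted as a string before it is converted, an incomplete token such as "tr" and an over-long one such as "truex" are both reported through bad_token, while "true false" yields two values with ok set. Unlike the answer's version, this sketch stops at the first bad token rather than reporting it and continuing.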
