3

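I've been racking my brain trying to figure out why this code doesn't produce the right result. I am looking for the hexadecimal representations of the floating-point positive and negative overflow/underflow levels. The code is based on this site and the Wikipedia entry: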

7f7f ffff ≈ 3.4028234 × 10^38 (max single precision), from the Wikipedia entry; this corresponds to positive overflow.

Here is the code:

#include <iostream>
#include <cstdio>
#include <cstdlib>
#include <cmath>

using namespace std;

int main(void) {

    float two = 2;
    float twentyThree = 23;
    float one27 = 127;
    float one49 = 149;


    float posOverflow, negOverflow, posUnderflow, negUnderflow;

    posOverflow = two - (pow(two, -twentyThree) * pow(two, one27));
    negOverflow = -(two - (pow(two, one27) * pow(two, one27)));


    negUnderflow = -pow(two, -one49);
    posUnderflow = pow(two, -one49);


    cout << "Positive overflow occurs when value greater than: " << hex << *(int*)&posOverflow << endl;


    cout << "Neg overflow occurs when value less than: " << hex << *(int*)&negOverflow << endl;


    cout << "Positive underflow occurs when value greater than: " << hex << *(int*)&posUnderflow << endl;


    cout << "Neg overflow occurs when value greater than: " << hex << *(int*)&negUnderflow << endl;

}

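The output is:

Positive overflow occurs when value greater than: f3800000
Neg overflow occurs when value less than: 7f800000
Positive underflow occurs when value greater than: 1
Neg overflow occurs when value greater than: 80000001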

To get the hexadecimal representation of the floating point, I am using a method described here:

Why isn't the code working? I know it'll work if positive overflow = 7f7f ffff.


3 Answers

3

Your expression for the highest representable positive float is wrong. The page you linked uses (2-pow(2, -23)) * pow(2, 127), and you have 2 - (pow(2, -23) * pow(2, 127)). Similarly for the smallest representable negative float.
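To make the difference in parenthesization concrete, here is a minimal sketch that evaluates both forms and exposes their bit patterns. It assumes a 32-bit IEEE 754 float; the names float_bits, largest_finite and as_written_in_question are made up for this illustration and come from neither the question nor the linked page.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the bytes of a float into a 32-bit integer.
// Assumes float is 32 bits wide, which the static_assert checks.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Parenthesized as on the linked page: (2 - 2^-23) * 2^127.
float largest_finite() {
    return (2.0f - std::pow(2.0f, -23.0f)) * std::pow(2.0f, 127.0f);
}

// Parenthesized as in the question: 2 - (2^-23 * 2^127), i.e. 2 - 2^104.
float as_written_in_question() {
    return 2.0f - (std::pow(2.0f, -23.0f) * std::pow(2.0f, 127.0f));
}
```

Under that assumption, float_bits(largest_finite()) is 0x7f7fffff, while float_bits(as_written_in_question()) is 0xf3800000, the value printed in the question.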

Your underflow expressions look correct, however, and so do the hexadecimal outputs for them.

Note that posOverflow and negOverflow are simply +FLT_MAX and -FLT_MAX. But note that your posUnderflow and negUnderflow are actually smaller in magnitude than FLT_MIN (because they are denormal, and FLT_MIN is the smallest positive normal float).
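The relationship between 2^-149 and FLT_MIN can be checked with a short sketch. It assumes an IEEE 754 float with subnormals enabled (no flush-to-zero mode); the function names are hypothetical.

```cpp
#include <cfloat>
#include <cmath>
#include <limits>

// True if 2^-149 is the smallest positive subnormal (denormal) float.
bool underflow_value_is_denorm_min() {
    float tiny = std::ldexp(1.0f, -149);  // 1 * 2^-149
    return tiny == std::numeric_limits<float>::denorm_min();
}

// True if FLT_MIN is 2^-126 and 2^-149 lies strictly below it.
bool underflow_value_is_below_flt_min() {
    return FLT_MIN == std::ldexp(1.0f, -126) &&
           std::ldexp(1.0f, -149) < FLT_MIN;
}
```

Both functions should return true on a typical desktop platform, which is why the question's underflow values print as 1 and 80000001 rather than as the bit pattern of FLT_MIN.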

answered 2013-02-18T03:23:28.793
2

Floating point loses precision as the number gets bigger. A number of magnitude 2^127 does not change when you add 2 to it.
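A small sketch of that absorption effect, assuming a 32-bit IEEE 754 float with a 24-bit significand (the function names are hypothetical):

```cpp
#include <cmath>
#include <limits>

// Adding 2 to a float of magnitude 2^127 leaves it unchanged.
// volatile forces each value to be stored as a float, so the comparison
// is not carried out on a wider intermediate result.
bool small_addend_is_absorbed() {
    volatile float big = std::ldexp(1.0f, 127);
    volatile float sum = big + 2.0f;
    return sum == big;
}

// Gap between 2^127 and the next representable float above it.
float spacing_at_two_to_127() {
    float big = std::ldexp(1.0f, 127);
    float next = std::nextafter(big, std::numeric_limits<float>::infinity());
    return next - big;
}
```

Under that assumption the spacing is 2^104, so any addend much smaller than that is rounded away.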

Other than that, I'm not really following your code. Using words to spell out numbers makes it hard for me to read.

Here is the standard way to get the floating-point limits of your machine:

#include <limits>
#include <iostream>
#include <iomanip>

std::ostream &show_float( std::ostream &s, float f ) {
    s << f << " = ";
    std::ostream s_hex( s.rdbuf() );
    s_hex << std::hex << std::setfill( '0' );
    for ( char const *c = reinterpret_cast< char const * >( & f );
          c != reinterpret_cast< char const * >( & f + 1 );
          ++ c ) {
        s_hex << std::setw( 2 ) << ( static_cast< unsigned int >( * c ) & 0xff );
    }
    return s;
}

int main() {
    std::cout << std::hex;
    std::cout << "Positive overflow occurs when value greater than: ";
    show_float( std::cout, std::numeric_limits< float >::max() ) << '\n';
    std::cout << "Neg overflow occurs when value less than: ";
    show_float( std::cout, - std::numeric_limits< float >::max() ) << '\n';
    std::cout << "Positive underflow occurs when value less than: ";
    show_float( std::cout, std::numeric_limits< float >::min() ) << '\n';
    std::cout << "Neg underflow occurs when value greater than: ";
    show_float( std::cout, - std::numeric_limits< float >::min() ) << '\n';
}

output:

Positive overflow occurs when value greater than: 3.40282e+38 = ffff7f7f
Neg overflow occurs when value less than: -3.40282e+38 = ffff7fff
Positive underflow occurs when value less than: 1.17549e-38 = 00008000
Neg underflow occurs when value greater than: -1.17549e-38 = 00008080

The output depends on the endianness of the machine. Here the bytes are reversed due to little-endian order.

Note, "underflow" in this case isn't a catastrophic zero result, but just denormalization which gradually reduces precision. (It may be catastrophic to performance, though.) You might also check numeric_limits< float >::denorm_min() which produces 1.4013e-45 = 01000000.

answered 2013-02-18T03:30:36.853
1

Your code assumes integers have the same size as a float (so do all but a few of the posts on the page you've linked, btw.) You probably want something along the lines of:

for (size_t s = 0; s < sizeof(myVar); ++s) {
    // take the address of myVar and view it as raw bytes
    unsigned char byte = reinterpret_cast<unsigned char*>(&myVar)[s];
    // byte now holds the s-th byte of myVar, in memory order
}

that is, something akin to the templated code on that page.
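A self-contained sketch of that idea as a function template follows. The name bytes_hex is hypothetical and this is not the code from the linked page; it makes no assumption about the size of the type, and it emits the bytes in memory order, so the result depends on endianness.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: hex dump of the object representation of value,
// two digits per byte, in memory order.
template <typename T>
std::string bytes_hex(const T& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t s = 0; s < sizeof(T); ++s) {
        out << std::setw(2) << static_cast<unsigned int>(p[s]);
    }
    return out.str();
}
```

Calling bytes_hex(posOverflow) would then work whatever sizeof(float) happens to be, because nothing is reinterpreted as an int.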

Your compiler may not be using those specific IEEE 754 types. You'll need to check its documentation.
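Besides reading the documentation, you can ask the implementation directly through std::numeric_limits. The sketch below (the function name is hypothetical) checks the properties that the 7f7f ffff layout relies on:

```cpp
#include <limits>

// True if float looks like IEEE 754 binary32: the implementation claims
// IEC 559 conformance, the type is 4 bytes wide, the significand has
// 24 bits (23 stored plus the implicit one), and the largest finite value
// is just below 2^128.
constexpr bool float_is_ieee754_binary32() {
    return std::numeric_limits<float>::is_iec559 &&
           sizeof(float) == 4 &&
           std::numeric_limits<float>::digits == 24 &&
           std::numeric_limits<float>::max_exponent == 128;
}
```

Because the function is constexpr, it can also be used in a static_assert so that the build fails on a platform where these assumptions do not hold.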

Also, consider using std::numeric_limits<float>::min()/max() or the FLT_ constants from <cfloat> for determining some of those values.

answered 2013-02-18T03:03:12.327