13

How does type casting happen inside the compiler without loss of data?

For example:

 int i = 10;
 UINT k = (UINT) i;

 float fl = 10.123;
 UINT  ufl = (UINT) fl; // data loss here?

 char *p = "Stackoverflow Rocks";
 unsigned char *up = (unsigned char *) p;

How does the compiler handle this type of typecasting? A low-level example showing the bits would be highly appreciated.

4

4 Answers

20

Well, first note that a cast is an explicit request to convert a value of one type to a value of another type. A cast to a non-reference type produces a new object, which is a temporary returned by the cast operator. A cast to a reference type, however, does not create a new object. Instead, the object referred to by the operand is reinterpreted through a reference of a different type.
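The difference between a value cast and a reference cast can be sketched as follows. This is only an illustration, not part of the original answer, and both function names are hypothetical. It relies on the rule that an int may be accessed through its corresponding unsigned type.

```cpp
// Hypothetical helper: a value cast yields a new value, so changing the
// copy made from it leaves the original object untouched.
bool value_cast_makes_copy() {
    int i = 10;
    unsigned int copy = static_cast<unsigned int>(i);  // new value
    copy = 99;
    return i == 10 && copy == 99;  // i is unchanged
}

// Hypothetical helper: a reference cast creates no new object. The
// resulting reference names the same storage as i.
bool reference_cast_aliases_original() {
    int i = 10;
    unsigned int& ref = reinterpret_cast<unsigned int&>(i);
    ref = 99;  // writes through to i
    return i == 99 &&
           static_cast<void*>(&ref) == static_cast<void*>(&i);
}
```

Both helpers are expected to return true: the first because the cast result is a separate value, the second because the reference and i share one address.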

Now to your question. Note that there are two major types of conversions:

  • Promotions: These can be thought of as casting from a possibly narrower type to a wider type. Casting from char to int, short to int, and float to double are all promotions.
  • Conversions: These allow casting from long to int, int to unsigned int and so forth. They can in principle cause loss of information. There are rules for what happens if, for example, you assign -1 to an object of unsigned type. In some cases, a wrong conversion can result in undefined behavior. If you assign a double larger than what a float can store to a float, the behavior is undefined.
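A small sketch of the two categories, assuming a C++11 compiler. The function names are hypothetical and chosen only for illustration.

```cpp
#include <climits>
#include <type_traits>

// Promotion: char operands are promoted to int before arithmetic,
// so the expression 'a' + 'a' has type int.
static_assert(std::is_same<decltype('a' + 'a'), int>::value,
              "char operands are promoted to int");

// Promotion: float to double preserves the value exactly, so converting
// back gives the original float.
bool float_promotes_exactly(float f) {
    double d = f;
    return static_cast<float>(d) == f;
}

// Conversion: -1 converted to unsigned int wraps modulo 2^n and
// yields the largest unsigned int value.
unsigned int minus_one_as_unsigned() {
    int i = -1;
    return static_cast<unsigned int>(i);
}

// Conversion: narrowing to a smaller unsigned type keeps only the
// low-order bits, i.e. the value modulo (UCHAR_MAX + 1).
unsigned char narrow_to_uchar(unsigned int v) {
    return static_cast<unsigned char>(v);
}
```

For instance, minus_one_as_unsigned() equals UINT_MAX, and with 8-bit chars narrow_to_uchar(300u) is 44, so information is lost.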

Let's look at your casts:

int i = 10; 
unsigned int k = (unsigned int) i; // :1

float fl = 10.123;
unsigned int  ufl = (unsigned int) fl; // :2

char *p = "Stackoverflow Rocks"; 
unsigned char *up = (unsigned char *) p; // :3
  1. This cast causes a conversion. No data is lost, since 10 is guaranteed to be representable in an unsigned int. If the integer were negative, the value would wrap around modulo 2^n, where n is the number of bits in an unsigned int (see 4.7/2).
  2. The value 10.123 is truncated to 10. Here, it obviously does cause loss of information. As 10 fits into an unsigned int, the behavior is defined.
  3. This one requires more attention. First, there is a deprecated conversion from a string literal to char*, but let's ignore that here. More importantly, what happens if you cast to a pointer to an unsigned type? The result is unspecified per 5.2.10/7 (note that the semantics of this cast are the same as those of reinterpret_cast in this case, since that is the only C++ cast able to perform it):

A pointer to an object can be explicitly converted to a pointer to an object of different type. Except that converting an rvalue of type “pointer to T1” to the type "pointer to T2" (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified.

So it is only safe to use the pointer after you cast it back to char *.
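The three cases can be sketched as small functions. The names are hypothetical and this is an illustration of the points above, not a definitive treatment. A const char* is used to sidestep the deprecated string-literal conversion.

```cpp
// Case 1: int to unsigned int. A non-negative value is preserved.
unsigned int int_to_unsigned(int i) {
    return static_cast<unsigned int>(i);
}

// Case 2: float to unsigned int. The fractional part is discarded.
// The behavior is defined only if the truncated value fits in the target.
unsigned int float_to_unsigned(float f) {
    return static_cast<unsigned int>(f);
}

// Case 3: the one guarantee given by the quoted paragraph is that
// casting to the other pointer type and back yields the original pointer.
bool round_trip_preserves_pointer(const char* p) {
    const unsigned char* up = reinterpret_cast<const unsigned char*>(p);
    const char* back = reinterpret_cast<const char*>(up);
    return back == p;
}
```

With the question's values, int_to_unsigned(10) gives 10, float_to_unsigned(10.123f) gives 10, and the round trip on "Stackoverflow Rocks" returns true.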

answered 2008-12-04T17:01:20.770
8

The two C-style casts in your example are different kinds of cast. In C++, you'd normally write them as

unsigned int ufl = static_cast<unsigned int>(fl);

and

unsigned char* up = reinterpret_cast<unsigned char*>(p);

The first performs an arithmetic cast, which truncates the floating point number, so there is data loss.

The second makes no changes to data - it just instructs the compiler to treat the pointer as a different type. Care needs to be taken with this kind of cast: it can be very dangerous.
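As a rough illustration of the contrast (hypothetical function names, and assuming a mainstream platform where the pointer-to-integer mapping is the plain address), the arithmetic cast produces a different value while the pointer cast leaves the address alone:

```cpp
#include <cstdint>

// The arithmetic cast computes a new value: for a float with a
// fractional part, the result no longer compares equal to the input.
bool arithmetic_cast_changes_value(float f) {
    unsigned int u = static_cast<unsigned int>(f);
    return static_cast<float>(u) != f;
}

// The pointer cast only relabels the pointer: the numeric address
// before and after the cast is the same.
bool pointer_cast_keeps_address(const char* p) {
    const unsigned char* up = reinterpret_cast<const unsigned char*>(p);
    return reinterpret_cast<std::uintptr_t>(up) ==
           reinterpret_cast<std::uintptr_t>(p);
}
```

Here arithmetic_cast_changes_value(10.123f) is true, whereas it is false for 10.0f, since nothing is truncated in that case.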

answered 2008-12-04T12:44:32.207
5

"Type" in C and C++ is a property assigned to variables when they're handled in the compiler. The property doesn't exist at runtime anymore, except for virtual functions/RTTI in C++.

The compiler uses the types of variables to determine a lot of things. For instance, when a float is assigned to an int, it knows that a conversion is needed. Both types are probably 32 bits, but the bits have different meanings. The CPU likely has an instruction for this conversion; otherwise the compiler knows to call a conversion function, i.e. something like & __stack[4] = float_to_int_bits(& __stack[0]) in pseudocode.

The conversion from char* to unsigned char* is even simpler. It is just a different label. At the bit level, p and up are identical. The compiler just needs to remember that *p requires sign-extension (on platforms where plain char is signed) while *up does not.
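To look at the bits directly, here is a minimal sketch. It assumes a 32-bit IEEE 754 float and a two's complement signed char, which holds on mainstream platforms but is not guaranteed by older standards. The helper names are hypothetical, and signed char is used instead of plain char so the result does not depend on the platform's choice of char signedness.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation of a float into a 32-bit integer,
// so the raw bit pattern can be inspected.
std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Reads one byte through a signed char pointer: the value is
// sign-extended when widened to int.
int widen_through_signed_char(unsigned char byte) {
    const signed char* p = reinterpret_cast<const signed char*>(&byte);
    return *p;
}

// Reads the same byte through an unsigned char pointer: the value is
// zero-extended when widened to int.
int widen_through_unsigned_char(unsigned char byte) {
    const unsigned char* up = &byte;
    return *up;
}
```

Under those assumptions, float_bits(10.0f) is 0x41200000 while the converted integer 10 is 0x0000000A, so the float-to-int conversion really rewrites the bits. The byte 0xFF reads as -1 through the signed pointer and 255 through the unsigned one, even though the stored bits are identical.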

answered 2008-12-04T12:31:15.013
1

Casts mean different things depending on what they are. They can just be renamings of a data type, with no change in the bits represented (most casts between integral types and pointers are like this), or conversions that don't even preserve length (such as between double and int on most compilers). In many cases, the meaning of a cast is simply unspecified, meaning the compiler has to do something reasonable but doesn't have to document exactly what.

A cast doesn't even need to result in a usable value. Something like char * cp; float * fp; cp = malloc(100); fp = (float *)(cp + 1); will almost certainly result in a misaligned pointer to float, which will crash the program on some systems if the program attempts to use it.
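The alignment problem can be checked without ever dereferencing the bad pointer. The sketch below uses hypothetical helper names and assumes the usual linear mapping from pointers to std::uintptr_t. It only inspects the addresses.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: reports whether an address satisfies the
// alignment requirement of float.
bool is_aligned_for_float(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(float) == 0;
}

// Reproduces the pointer arithmetic from the text, but never forms or
// uses a float* to the shifted address.
bool offset_pointer_is_misaligned() {
    char* cp = static_cast<char*>(std::malloc(100));
    if (cp == nullptr) {
        return false;  // allocation failed, nothing to check
    }
    bool base_ok = is_aligned_for_float(cp);         // malloc result is suitably aligned
    bool shifted_ok = is_aligned_for_float(cp + 1);  // one byte past the base
    std::free(cp);
    return base_ok && !shifted_ok;
}
```

On platforms where alignof(float) is greater than 1, offset_pointer_is_misaligned() is expected to return true, which is why using (float *)(cp + 1) is unsafe there.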

answered 2008-12-04T17:45:17.140