As far as I can see, it should be enough to just take the upper byte, i.e. discard the lower 8 bits of each sample.
That of course assumes that the samples are linear; if they're not, you will need to linearize them before dropping bits.
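#include <stdint.h>
uint16_t sixteenBit = 0xfeed;        // unsigned: 0xfeed does not fit in a signed 16-bit short
uint8_t eightBit = sixteenBit >> 8;  // keep only the upper byte
// eightBit is now 0xfe.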
As suggested by AShelly in a comment, it might be a good idea to round, i.e. add 1 if the byte we're discarding is more than half a step (greater than 0x80):
eightBit += eightBit < 0xff && ((sixteenBit & 0xff) > 0x80);
The test against 0xff implements clamping, so we never add 1 to 0xff and wrap around to 0x00, which would turn the largest value into the smallest.
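Putting the pieces together, here is a minimal sketch of the truncate, round and clamp steps applied to a whole buffer. It assumes the samples are unsigned and linear. The names sample16to8 and convert16to8 are made up for illustration; signed samples would need their own handling first, which this sketch doesn't cover.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduce one unsigned 16-bit sample to 8 bits (hypothetical helper). */
static uint8_t sample16to8(uint16_t s)
{
    uint8_t hi = (uint8_t)(s >> 8);    /* keep the upper byte */
    uint8_t lo = (uint8_t)(s & 0xff);  /* the byte being discarded */

    /* Round up when more than half a step is discarded,
       but never past 0xff (that would wrap to 0x00). */
    if (hi < 0xff && lo > 0x80)
        hi++;
    return hi;
}

/* Convert n samples from src into dst (hypothetical helper). */
static void convert16to8(const uint16_t *src, uint8_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = sample16to8(src[i]);
}
```

With rounding, the 0xfeed example from above becomes 0xff rather than 0xfe, since the discarded byte 0xed is greater than 0x80. A discarded byte of exactly 0x80 rounds down here, matching the test in the one-liner.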