I like the way these magic bytes are stored here. It’s a lot less esoteric than usual:

const array<uint8_t, 0xE> magicBytes{
    0x89,
    static_cast<uint8_t>('M'),
    static_cast<uint8_t>('O'),
    static_cast<uint8_t>('C'),
    static_cast<uint8_t>('H'),
    static_cast<uint8_t>('I'),
    static_cast<uint8_t>('P'),
    static_cast<uint8_t>('K'),
    static_cast<uint8_t>('G'),
    static_cast<uint8_t>('\r'),
    static_cast<uint8_t>('\n'),
    0x1A,
    static_cast<uint8_t>('\n'),
    static_cast<uint8_t>('\0')
};

Instead I could write

constexpr auto magicBytes{ "\x89MOCHIPKG\r\n\x1A\n" };

…but that would be a little more confusing, to be honest… even if it’s simpler.

That sort of statement is also a little error-prone: because of the way hexadecimal escapes are parsed inside string literals, \x can be followed by any number of hex digits. The only reason it works here is that M is not a valid hex digit, so parsing of the escape sequence implicitly stops there. Yes, everyone agrees this sort of behaviour was probably 0xabad1dea on ISO’s part when it came to standardising C. 😞
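To make the hazard concrete, here is a minimal sketch. The "BEEF" tag is hypothetical (it is not part of the real magic bytes) and was chosen only because it starts with hex digits; the names `okay` and `split` are likewise just for illustration.

```cpp
// 'M' is not a hex digit, so the escape ends after "89".
constexpr char okay[] = "\x89MOCHIPKG";

// With a tag that starts with hex digits, the escape would keep going:
// "\x89BEEF" is read as the single escape \x89BEEF, which does not fit in
// a char, so compilers typically diagnose it. Left commented out on purpose.
// constexpr char broken[] = "\x89BEEF";

// Splitting the literal ends the escape; adjacent string literals are
// concatenated only after escape sequences have been processed.
constexpr char split[] = "\x89" "BEEF";

static_assert(sizeof(okay) == 10, "1 escape byte + 8 letters + terminator");
static_assert(sizeof(split) == 6, "1 escape byte + 4 letters + terminator");
static_assert(static_cast<unsigned char>(split[0]) == 0x89, "first byte is 0x89");
static_assert(split[1] == 'B', "second byte is the letter B");
```

Splitting the literal like this is a common workaround, but it is one more thing a reader has to know about before the one-liner looks safe.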

The other reason the shorter statement is invisibly problematic is the implied null terminator: \0 is always added to the end of C-style string literals. In my code, \0 is explicitly part of the array, as it should be; this isn’t meant to be text. It just happens to contain some ASCII characters mixed in with other binary data. If we didn’t want the terminator, the shorter statement would not even be possible! Instead, we would have to write it as an array of individual char literals.
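As a rough illustration of the hidden terminator, the sketch below binds the same literal to a char array so that its size can be checked. I use an array rather than `auto` because, as far as I know, `auto` with a string literal deduces a pointer in C++17 and so loses the length. The names `literalBytes` and `noTerminator` are made up for this example.

```cpp
#include <array>
#include <cstdint>

// 13 bytes are written out, but the array holds 14 (0xE): the compiler
// appends '\0' without being asked.
constexpr char literalBytes[] = "\x89MOCHIPKG\r\n\x1A\n";

static_assert(sizeof(literalBytes) == 0xE, "13 written bytes + implicit terminator");
static_assert(literalBytes[sizeof(literalBytes) - 1] == '\0', "last byte is the terminator");

// If the terminator were NOT wanted, the bytes would have to be listed
// individually, because a string literal cannot drop it.
constexpr std::array<std::uint8_t, 13> noTerminator{
    0x89, 'M', 'O', 'C', 'H', 'I', 'P', 'K', 'G', '\r', '\n', 0x1A, '\n'
};

static_assert(noTerminator.size() == 13, "exactly the bytes that were written");
```

In the one-liner, the fourteenth byte exists only by convention, whereas the explicit array spells it out.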

So unless you already know about all of these nooks and crannies of C and C++, you would struggle to work out the intended meaning behind a statement like the one-liner above. It’s too precarious, to be frank… and it’s not meant to be a string anyway. Here’s why 😛