[[ASCII C0 / C1 details]]
While developing the application I ran into behaviour of the Python interpreter that seemed strange at first: C1 control codes were encoded as two UTF-8 bytes each, while C0 control codes were output as single bytes, just as they would be in plain ASCII. That prompted a bit of research.
According to ISO/IEC 6429 (ECMA-48), there are two sets of control codes that matter here (the standard defines more, but for our purposes they are mostly irrelevant): C0 and C1. C0 covers code points 0x00-0x1F; DEL (0x7F) is usually grouped with it, and some authors also include the regular space character 0x20. The characteristic property of these code points is that UTF-8 encodes them exactly as 7-bit US-ASCII (ISO/IEC 646) does: as a single byte with the high bit clear. Since such bytes never occur inside a multibyte UTF-8 sequence, it is possible to tell, even in a broken byte sequence, whether a byte represents a code point on its own or is part of a multibyte UTF-8 sequence.
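The single-byte property can be checked directly in Python. The snippet below is a minimal sketch, not code from the application: `classify_byte` is a hypothetical helper introduced here only to illustrate how one byte can be classified by its value alone.

```python
# Every C0 control code (plus DEL) is the same single byte in ASCII and UTF-8.
c0_code_points = list(range(0x00, 0x20)) + [0x7F]

for cp in c0_code_points:
    ch = chr(cp)
    assert ch.encode("ascii") == ch.encode("utf-8") == bytes([cp])


def classify_byte(b: int) -> str:
    """Hypothetical helper: role of a single byte value in a UTF-8 stream."""
    if b <= 0x7F:
        return "single"        # 0xxxxxxx: a complete code point, same as ASCII
    if b <= 0xBF:
        return "continuation"  # 10xxxxxx: only valid inside a multibyte sequence
    if 0xC2 <= b <= 0xF4:
        return "lead"          # first byte of a 2-, 3- or 4-byte sequence
    return "invalid"           # 0xC0, 0xC1, 0xF5-0xFF never appear in UTF-8


print(classify_byte(0x1B))  # ESC, a C0 code
print(classify_byte(0x9B))  # the byte value of CSI, a C1 code
```

The first call reports `single` and the second `continuation`, which is why a lone byte in the C1 range cannot be read as a code point by a UTF-8 decoder.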
C1 control codes, however, occupy 0x80-0x9F, and bytes in that range are also valid continuation bytes of multibyte UTF-8 sequences. To keep the two apart, UTF-8 encodes C1 code points as two-byte sequences (0x80 → 0xC2 0x80, etc.). This applies not only to control codes but to every other ISO/IEC 8859-1 code point from 0x80 upwards.
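To make the difference concrete, here is a small sketch comparing the raw 8-bit form of a C1 code with its UTF-8 form. It assumes Latin-1 as the 8-bit encoding, and `is_valid_utf8` is a hypothetical helper written for this illustration, not part of the application.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


csi = "\x9b"                 # U+009B, the C1 control code CSI

raw = csi.encode("latin-1")  # one byte, as in an 8-bit encoding
utf8 = csi.encode("utf-8")   # two bytes: lead byte 0xC2, then 0x9B

print(raw, is_valid_utf8(raw))    # b'\x9b' False
print(utf8, is_valid_utf8(utf8))  # b'\xc2\x9b' True

# The same pattern holds for the whole C1 range: 0xC2 followed by the code itself.
for cp in range(0x80, 0xA0):
    assert chr(cp).encode("utf-8") == bytes([0xC2, cp])
```

A lone 0x9B byte is rejected by the UTF-8 decoder because it is a continuation byte with no lead byte before it, whereas the two-byte form decodes back to the original code point.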
With this in mind, let's see how the application reflects these differences. The first command produces several 8-bit C1 control codes, which are classified as raw binary (non-UTF-8) data. The second command's output consists of the very same code points, but encoded in UTF-8 (thanks to Python's transparent Unicode support, we barely need to think about the encodings at all):
