Due to random debugging at work, I have read RFC 1952, aka GZIP file format specification version 4.3.
I thought GZIP was just a compression-only file format. It would be some magic bytes + a compressed byte stream. But it is not.
First, it could have the original file name, and macOS’s gzip (Apple gzip 287.100.2) by default includes that.
% touch hello
% gzip hello
% hexdump -C hello.gz
00000000 1f 8b 08 08 36 bc c9 5f 00 03 68 65 6c 6c 6f 00 |....6.._..hello.|
00000010 03 00 00 00 00 00 00 00 00 00 |..........|
0000001a
%
Second, it has one byte for OS. Note that the gzip file above claims “Unix”, even my OS is macOS. Well, macOS is one of the UNIX certified products…
OS (Operating System)
This identifies the type of file system on which compression
took place. This may be useful in determining end-of-line
convention for text files. The currently defined values are
as follows:
0 - FAT filesystem (MS-DOS, OS/2, NT/Win32)
1 - Amiga
2 - VMS (or OpenVMS)
3 - Unix
4 - VM/CMS
5 - Atari TOS
6 - HPFS filesystem (OS/2, NT)
7 - Macintosh
8 - Z-System
9 - CP/M
10 - TOPS-20
11 - NTFS filesystem (NT)
12 - QDOS
13 - Acorn RISCOS
255 - unknown
Not all GZIP implementations honor the OS byte. For example, Go’s compress/gzip always uses 255.
Source: posts/2020/gzip.md