In the International System of Units there are standard prefixes, based on powers of ten, used to indicate multiplication or division: kilo- to indicate multiplication by one thousand (10³), mega- to indicate multiplication by one million (10⁶), giga- to indicate multiplication by one billion (10⁹), and so on.
But computer scientists don’t like powers of ten. The most basic unit of digital storage, the bit, is represented either as a one or a zero (with eight bits making a byte) and thus computer scientists are much happier working in binary, with powers of two rather than powers of ten. Standard binary prefixes do exist: kibi- for 2¹⁰, mebi- for 2²⁰, gibi- for 2³⁰, etc.
| SI Unit | Size /B | Binary Unit | Size /B |
|---------|---------|-------------|---------|
| kilobyte (kB) | 1 000 | kibibyte (KiB) | 1 024 |
| megabyte (MB) | 1 000 000 | mebibyte (MiB) | 1 048 576 |
| gigabyte (GB) | 1 000 000 000 | gibibyte (GiB) | 1 073 741 824 |
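Each SI step multiplies by 10³ while each binary step multiplies by 2¹⁰, so the table is easy to reproduce; here’s a minimal Python sketch (purely illustrative):

```python
# Each SI prefix step is a power of ten; each binary step a power of two.
units = [("kilobyte (kB)", "kibibyte (KiB)"),
         ("megabyte (MB)", "mebibyte (MiB)"),
         ("gigabyte (GB)", "gibibyte (GiB)")]
for n, (si, binary) in enumerate(units, start=1):
    print(f"{si}: {10**(3*n):,} B    {binary}: {2**(10*n):,} B")
```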
The problem is that barely anyone uses the standard binary prefixes. During the “kilobyte era” the difference was mostly ignored, because it is small at that scale: a decimal kilobyte is only 2.3% smaller than a kibibyte. But as file and hard disk drive (HDD) sizes have increased, the difference between the two systems has become more noticeable.
HDD manufacturers have stuck with SI (powers of ten) sizes, whilst operating systems calculate sizes in binary but incorrectly use SI prefixes. A 256-gigabyte hard drive (i.e. one containing 256 billion bytes) will be reported by an operating system as being only 238 GB in size, a 6.9% difference. As HDDs become ever larger the problem will get worse: at the terabyte level the difference is 9.1%, and at the petabyte level it is 11.2%.
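The growth of the discrepancy is easy to check; a minimal Python sketch:

```python
# Fractional shortfall when a drive sold in SI units is reported in
# binary units with SI labels: 1 - 10**(3n) / 2**(10n).
for n, prefix in enumerate(["kilo", "mega", "giga", "tera", "peta"], start=1):
    print(f"{prefix}: {1 - 10**(3*n) / 2**(10*n):.1%}")
# kilo: 2.3%, mega: 4.6%, giga: 6.9%, tera: 9.1%, peta: 11.2%

print(f"{256 * 10**9 / 2**30:.0f}")  # a 256 GB drive reports as 238 "GB"
```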
Persuading operating systems to alter the way they report file sizes, confusing users in the process, is unlikely to be a successful approach. A far better one would be to persuade HDD manufacturers to change their marketing, so that users purchasing an HDD receive the size they are expecting.*
* Though obviously, as a physicist, it causes me great mental anguish to abuse SI units in this fashion!
I’ve seen worse: 1GB being (mis)calculated as 1e6*1024 bytes, in a script to create disk partitions.
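For illustration, that mixed-base “gigabyte” is neither a decimal GB nor a binary GiB:

```python
buggy = 10**6 * 1024   # "1 GB" as 1e6 * 1024 = 1,024,000,000 bytes
print(buggy - 10**9)   # 24,000,000 bytes more than a decimal GB
print(2**30 - buggy)   # 49,741,824 bytes short of a GiB
```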
Further confusion arises when talking about transfer speeds. When an operating system reports sizes in megabytes, what’s usually meant is 1,048,576 bytes. However, when the same operating system reports the transfer rate of a network interface, it will usually be in megabits or gigabits per second (Mbps/Gbps), and there the prefixes really do mean 1 million (or 1 billion) bits per second.
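A quick worked example of that double unit-switch (bits vs bytes, decimal vs binary):

```python
bits_per_s = 10**9            # a "1 Gbps" link: decimal bits per second
bytes_per_s = bits_per_s / 8  # 8 bits per byte -> 125,000,000 B/s
print(f"{bytes_per_s / 10**6:.0f} MB/s")   # 125 MB/s (decimal megabytes)
print(f"{bytes_per_s / 2**20:.1f} MiB/s")  # 119.2 MiB/s (binary mebibytes)
```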
Thanks for the post. I created an 8th grade lesson around this idea. If it works well, I will definitely share it!