The Digitization of Everything

Computers require and drive the transformation of knowledge to binary form. New knowledge and new technologies require new ways of representing information. Already there are countless formats for all kinds of information, but they are now overwhelmingly stored and transmitted in binary form. Encoding data is an important part of the theory and practice of information science.

Below is a list of various kinds of information and the ways that information has been digitized, or encoded into binary: zeroes and ones. The binary information can often be visualized in more than one way, but ultimately it is all zeroes and ones. Zeroes and ones can be:

  1. cheaply represented in small physical devices like transistors,
  2. transmitted, stored, and copied without loss,
  3. used as input data to computer programs, and
  4. subject to mathematical analysis.

Counts: tally notation, a unary numeral system for counting.
Symbols: the elements of a set can be represented by their ordinal position.
Numbers: positional number system.
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ... (decimal)
  0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, ... (hex)
  0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, 10000, 10001, ... (binary)
Integers (Z): positive and negative numbers. Encodings: signed magnitude, two's complement, or zigzag order: 0, -1, 1, -2, 2, -3, 3, ...
Rationals (Q): quotients. ½, ⅓, ¾ can be written 1011, 10111, 11101111 (tally notation for numerator and denominator, with '0' acting like a comma); alternatively 2^1·3^2, 2^1·3^3, 2^3·3^4 (Gödel numbering).
Reals (R): IEEE 754. Many real numbers (rationals, really) can be approximated using a fixed number of bits.
Infinity: bit pattern in IEEE 754 single precision (32 bits).
  7F 80 00 00 (hex)
  0 11111111 00000000000000000000000 (binary)
Pixel/Color: α R G B. Example: Cornflower Blue.
  006495ED (hex)
  0000 0000 0110 0100 1001 0101 1110 1101 (binary)
Pictures: JPG. Sample JPG file (FIT clock tower): 851 px × 1134 px, 68 KB; the bits can be viewed in hex.
Sound: WAV. Sample WAV file: a couple of seconds, 12.9 KB; can be visualized as a waveform or as bits in hex.
Music: MIDI. Sample MIDI file: about 2:40 minutes, 18.8 KB; can be visualized as an orchestral score or as bits in hex.
Video: MP4. Sample MP4 file: about 2:23 minutes; 48 MB is too many bits to show.
Plain Text: US-ASCII. Sample TXT file: about 200 lines, 11 KB; can be viewed as individual characters, in hexadecimal, or in binary.
Tuples of k things: Z curve. See Z-order curve.
Varying-length lists: k-adic notation. Lists or sequences built from k things; for example, k=26: ε, A, ..., Z, AA, ..., ZZ, AAA, ...
DNA: codons, combinations of three of the four nucleotides T, C, G, and A. Of the 64 codons, 61 represent amino acids, and three are stop signals. For example, GCC encodes the amino acid alanine, and CAG encodes glutamine.
Printable documents: PDF. Sample PDF file: 3 pages from Blown to Bits, 76 KB; can be viewed as hex.
Documents for the WWW: HTML5. Sample HTML file: HTML is structured text; can be viewed as text with syntax highlighting or as bits in hex.
WWW model (hypertext): [ad hoc]. As a graph or adjacency list; as characters; can be viewed in hex or binary.
Bitcoin: [ad hoc]. The Bitcoin genesis block in hex and as a hex dump (hd).
Non-fungible token: (no encoding listed)
Computer instructions: ELF/x86. Mnemonic assembly instructions; the bits can be viewed in hex.
Turing Machines: ad hoc.
Combinators: Iota. I=(℩℩), K=(℩(℩(℩℩))), S=(℩(℩(℩(℩℩)))). In the binary encoding, 1 stands for ℩ and 0 for application, so 0011011 denotes ((℩℩)(℩℩)) and 0101011 denotes (℩(℩(℩℩))). See the Iota and Jot esoteric programming languages.
Source programs: Java, Python, etc. See below.
Formal proofs: Coq. See below.
Extraterrestrial communications: ??. See the explanation of the Arecibo message.
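
Several of the encodings listed above are small enough to try out directly. The following Python sketch is only an illustration: the function names are made up for this example, and the choice of which coordinate supplies the low bit of the Z-order code is an assumption, since conventions differ.

```python
import struct


def zigzag_decode(n: int) -> int:
    """Map the counts 0, 1, 2, 3, 4, ... to the integers 0, -1, 1, -2, 2, ..."""
    return -((n + 1) // 2) if n % 2 else n // 2


def tally_fraction(numerator: int, denominator: int) -> str:
    """Write a quotient as two tallies of 1s separated by a 0 (the 'comma')."""
    return "1" * numerator + "0" + "1" * denominator


def k_adic(n: int, alphabet: str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ") -> str:
    """Return the n-th string in the order ε, A, ..., Z, AA, ..., ZZ, AAA, ..."""
    k = len(alphabet)
    symbols = []
    while n > 0:
        n, r = divmod(n - 1, k)
        symbols.append(alphabet[r])
    return "".join(reversed(symbols))


def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y (assumption: x supplies the even bits)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# IEEE 754 single-precision bit pattern of +infinity, most significant byte first.
infinity_bits = struct.pack(">f", float("inf")).hex()

print([zigzag_decode(n) for n in range(7)])
print(tally_fraction(1, 2), tally_fraction(1, 3), tally_fraction(3, 4))
print([k_adic(n) for n in (0, 1, 26, 27, 702, 703)])
print(z_order(3, 5), infinity_bits)
```

Each function turns a natural number or a pair of numbers into another number or string, which is all that is needed before the result is written out in bits.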

Computer Programs

Computer programs (source code) should not be overlooked as one of the very important forms of digital information. As an example, we look in greater detail at Java, Python, and C++ programs. Despite the high-level, intellectual structures in a computer program, all computer programs are represented in zeroes and ones.
Each of the following files can be viewed as binary, as hexadecimal, as characters, as plain text, with meta info, or with syntax highlighting: Hello.java, hello.py, hello.cpp.
C++, Java, and Python programs are text: text with a very particular structure. They are also data: input to a compiler (or interpreter), which translates them into an equivalent form of data that a real machine can execute directly.
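
A short Python sketch can make both points concrete. The one-line program below is a hypothetical stand-in (the contents of the hello.py sample are not reproduced here); the sketch shows it as hex and binary, then hands the same text to Python's built-in compiler.

```python
# Hypothetical one-line program standing in for a source file.
source = 'print("Hello")\n'

# The program as stored on disk: one byte per US-ASCII character.
data = source.encode("ascii")

# Two views of the very same bytes.
hex_view = data.hex(" ")
bin_view = " ".join(f"{byte:08b}" for byte in data)

# The program as data: input to the compiler, which produces a code object.
code = compile(source, "<hello>", "exec")

print(hex_view)
print(bin_view)
print(type(code).__name__)
```

Running `exec(code)` would then execute the compiled program.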

Formal Mathematical Proof

An example in Coq.
Each of the following files can be viewed as binary, as hexadecimal, as characters, as plain text (Latin1), with meta info, with syntax highlighting, LaTeX formatted, or as a mathematical proof: group.v, cantor.v, curry.v.
Coq proofs are data: input to the Coq system, which verifies that the proof is correct. Other formal proof systems include HOL Light, Mizar, ProofPower, and Isabelle (with the Isar formal language).

Interpretation

In order to use data it is necessary to know how the bits are to be interpreted.

Interpretation of a Word

The same representation in the computer (bits) can mean different things. Take, for example, the 32 bits
0x9207BFF0 = 1001 0010 0000 0111 1011 1111 1111 0000
  1. Two's complement: -1,844,985,872
  2. Unsigned integer (binary number): 2,449,981,424
  3. IEEE 754 floating-point: -4.283507E-28
  4. Instruction for SPARC computer: add %fp, -16, %o1
  5. alpha RGB: semi-transparent blue (α = 0x92, R = 0x07, G = 0xBF, B = 0xF0)
  6. Latin-1 string of characters: ? ␇ ¿ ð (PU2, bell control code, inverted question mark, lowercase eth)
  7. Latin-3 string of characters: ? ␇ ż - (PU2, bell control code, z with dot above, unused)
  8. ISO 8859-7 (Greek) string of characters: ? ␇ Ώ π (PU2, bell control code, omega with tonos, small pi)
?: The ISO control code Private Use 2 (PU2); it has no printable representation.
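
Most of these readings can be reproduced with the Python standard library. This is a sketch, assuming the 32 bits are taken most significant byte first; the SPARC field positions in the comments are assumed from the SPARC format-3 layout, not taken from this page.

```python
import struct

word = bytes.fromhex("9207BFF0")  # the 32 bits, most significant byte first

unsigned = int.from_bytes(word, "big")             # 2449981424
signed = int.from_bytes(word, "big", signed=True)  # -1844985872
(as_float,) = struct.unpack(">f", word)            # about -4.2835e-28
alpha, red, green, blue = word                     # 0x92, 0x07, 0xBF, 0xF0

# The same four bytes read as characters in three 8-bit character sets.
latin1 = word.decode("latin-1")
latin3 = word.decode("iso8859_3", errors="replace")  # 0xF0 is unassigned in Latin-3
greek = word.decode("iso8859_7")

# SPARC fields (assumed format-3 layout: rd, op3, rs1, 13-bit signed immediate).
rd = (unsigned >> 25) & 0x1F    # 9, register %o1
op3 = (unsigned >> 19) & 0x3F   # 0, the add instruction
rs1 = (unsigned >> 14) & 0x1F   # 30, register %fp
simm13 = unsigned & 0x1FFF
if simm13 & 0x1000:             # sign-extend the 13-bit immediate
    simm13 -= 0x2000

print(unsigned, signed, as_float)
print(alpha, red, green, blue)
print(repr(latin1), repr(latin3), repr(greek))
print(rd, op3, rs1, simm13)
```

Nothing in the four bytes says which reading is intended; that knowledge has to come from the program or the file format.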

Binary versus Text File

One must be careful with numbers in a file. Does the file contain digits as characters, or does it represent numbers as raw bits? Consider a binary file:

The data file can be viewed as binary, as hexadecimal, or as characters.

Compare the following two programs which interpret the same file differently.

The first program prints:

  859,190,816
1,684,109,683
  543,780,384
1,629,514,853
1,634,738,297
1,700,885,002
because it interprets each 32-bit word as an integer.

The second program prints:

366
because it interprets the bytes as US-ASCII characters, and the characters 366 form the only block of digits in the file.
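
The two readings can be sketched in Python. The file contents used here are an inference, not something stated on this page: the six integers printed by the first program are exactly what the 24 bytes of the text "366 days in a leap year" plus a newline produce when read as big-endian 32-bit words.

```python
import re
import struct

# Inferred contents of the data file (an assumption; see the note above).
data = b"366 days in a leap year\n"

# First reading: six big-endian 32-bit signed integers.
words = struct.unpack(">6i", data)

# Second reading: US-ASCII text, from which the first block of digits is parsed.
digit_blocks = re.findall(rb"\d+", data)
number = int(digit_blocks[0])

for value in words:
    print(f"{value:>13,}")
print(number)
```

The same 24 bytes give six large integers under one reading and the single number 366 under the other.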

 

 
