The Digitization of Everything

Computers require and drive the transformation of knowledge to binary form. New knowledge and new technologies require new ways of representing information. Already there are countless formats for all kinds of information, but they are now overwhelmingly stored and transmitted in binary form. Encoding data is an important part of the theory and practice of information science.

Below is a list of various kinds of information and ways the information has been digitized or encoded into binary---zeroes and ones. The binary information can often be visualized in more than one way. But it is all ultimately zeroes and ones. Zeroes and ones can be:

  1. cheaply represented in small physical devices like transistors,
  2. transmitted, stored, and copied without loss,
  3. used as input data to computer programs, and
  4. subject to mathematical analysis.

Each entry below gives the kind of information, its format, and examples or views.

Numbers: positional number system
    decimal: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ...
    hex:     1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, ...
    binary:  1, 10, 11, 100, 101, 110, 111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, 10000, 10001, ...

Rationals (Q): quotients
    ½ ⅓ ¾

Reals (R): IEEE 754
    Many real numbers can be approximated using a fixed number of bits.

Infinity: IEEE 754 single precision (32 bits)
    hex:    7F 80 00 00
    binary: 0 11111111 00000000000000000000000

Pixel/Color: α R G B
    Example: Cornflower Blue
    hex:    006495ED
    binary: 0000 0000 0110 0100 1001 0101 1110 1101

Pictures: JPG
    Look at sample JPG file (FIT clock tower), 851 px x 1134 px, 68 KB
    View bits in hex

Sound: WAV
    Play sample WAV file, a couple of seconds, 12.9 KB
    Visualization as waveform, as bits in hex

Music: MIDI
    Listen to sample MIDI file, about 2:40 minutes, 18.8 KB
    Visualization as orchestral score, view bits in hex

Video: MP4
    Watch sample MP4 file, about 2:23 minutes, 48 MB
    (bits omitted)

Plain Text: US-ASCII
    Read sample TXT file, about 200 lines, 11 KB
    View bits in hex

Lists: k-adic notation
    k=26 for example: ε, A, ... Z, AA, ... ZZ, AAA, ...

Printable documents: PDF
    Read sample PDF file, 3 pages from Blown to Bits, 76 KB
    View bits in hex

Documents for the WWW: HTML5
    Read sample HTML file
    As text with syntax highlighting, view bits in hex

WWW model (hypertext): [ad hoc]
    As graph, adjacency list
    As characters; view in hex, or binary

Computer instructions: ELF/x86
    Mnemonic assembly instructions
    View bits in hex

Programs: Java
    See below

Formal Proofs: Coq
    See below

Extraterrestrial communications: ??
    See explanation of Arecibo message
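
A few of the encodings listed above can be checked with a short Java sketch. The class name EncodingSketch and its method names are illustrative assumptions, not part of the original exhibit; the sketch relies only on standard library calls such as Float.floatToIntBits and Integer.toBinaryString.

```java
class EncodingSketch {
    // IEEE 754 single-precision bit pattern of x, as eight hex digits.
    static String floatBitsHex(float x) {
        return String.format("%08X", Float.floatToIntBits(x));
    }

    // Split a 32-bit alpha-RGB pixel into {alpha, red, green, blue}.
    static int[] argbChannels(int pixel) {
        return new int[] {
            (pixel >>> 24) & 0xFF,
            (pixel >>> 16) & 0xFF,
            (pixel >>> 8) & 0xFF,
            pixel & 0xFF
        };
    }

    // k-adic numeral for n >= 0 with k = 26 and digits A..Z.
    // Zero maps to the empty string (written ε in the list above).
    static String kAdic26(int n) {
        StringBuilder digits = new StringBuilder();
        while (n > 0) {
            n--;                                   // digits run 1..26, so shift to 0..25
            digits.append((char) ('A' + n % 26));
            n /= 26;
        }
        return digits.reverse().toString();
    }

    public static void main(String[] args) {
        // Counting in decimal, hex, and binary.
        for (int n = 1; n <= 17; n++) {
            System.out.printf("%2d %2s %5s%n", n,
                    Integer.toHexString(n).toUpperCase(), Integer.toBinaryString(n));
        }
        // Positive infinity in single precision.
        System.out.println(floatBitsHex(Float.POSITIVE_INFINITY));
        // Cornflower Blue as an alpha-RGB pixel.
        int[] c = argbChannels(0x006495ED);
        System.out.printf("alpha=%d red=%d green=%d blue=%d%n", c[0], c[1], c[2], c[3]);
        // The first few k-adic numerals around the length boundaries.
        for (int n : new int[] {0, 1, 26, 27, 702, 703}) {
            System.out.println(n + " -> \"" + kAdic26(n) + "\"");
        }
    }
}
```

Unlike ordinary base 26, the k-adic system has no zero digit, which is why every string over A..Z names exactly one number.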

Computer Programs

Computer programs (source code) should not be overlooked as one of the very important forms of digital information. As an example, we look in greater detail at Java programs. Despite the high-level, intellectual structures in a computer program, all computer programs are represented in zeroes and ones.
The sample program hello is available in six views: binary, hexadecimal, characters, plain text, with meta info, and with syntax highlighting.
Java programs are data -- input to a Java compiler, which translates them into an equivalent data form (bytecode) that a Java virtual machine can execute.
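
To make the point concrete, the following sketch treats a small Java source text as bytes and renders them in hex and in binary. The embedded program is a hypothetical stand-in for the exhibit's hello sample, and the class and method names are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

class SourceAsBits {
    // Hypothetical stand-in for the sample program; not the exhibit's actual file.
    static final String SOURCE =
            "class Hello {\n"
          + "  public static void main(String[] args) {\n"
          + "    System.out.println(\"hello\");\n"
          + "  }\n"
          + "}\n";

    // Two hex digits per byte, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Eight binary digits per byte, separated by spaces.
    static String toBinary(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            String bits = Integer.toBinaryString(b & 0xFF);
            sb.append("0".repeat(8 - bits.length())).append(bits);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = SOURCE.getBytes(StandardCharsets.US_ASCII);
        byte[] firstWord = java.util.Arrays.copyOf(bytes, 5);   // the keyword "class"
        System.out.println(toHex(firstWord));
        System.out.println(toBinary(firstWord));
        System.out.println(bytes.length + " bytes in total");
    }
}
```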

Formal Mathematical Proof

An example in Coq. The sample proofs group.v and cantor.v are each available in eight views: binary, hexadecimal, characters, plain text (Latin1), with meta info, with syntax highlighting, LaTeX formatted, and as a mathematical proof.
Coq proofs are data -- input to the Coq system, which verifies that the proof is correct. Other formal proof systems include HOL Light, Mizar, ProofPower, and Isabelle (Isar formal language).

Interpretation

In order to use data it is necessary to know how the bits are to be interpreted.

Interpretation of a Word

The same representation in the computer (bits) can mean different things. Take, for example, the 32 bits
0x9207BFF0 = 1001 0010 0000 0111 1011 1111 1111 0000
  1. Twos complement: -1,844,985,872
  2. Unsigned integer (binary number): 2,449,981,424
  3. IEEE 754 floating-point: -4.283507E-28
  4. Instruction for SPARC computer: add %fp, -16, %o1
  5. alpha RGB: transparent blue 10010010 00000111 10111111 11110000 (α = 92, R = 07, G = BF, B = F0 in hex)
  6. Latin-1 string of characters: ? ␇ ¿ ð (PU2, bell control code, inverted question mark, lowercase eth)
  7. Latin-3 string of characters: ? ␇ ż - (PU2, bell control code, z with dot above, unused)
  8. ISO 8859-7 (Latin/Greek) string of characters: ? ␇ Ώ π (PU2, bell control code, omega with tonos, small pi)
?: PU2, the ISO control code "private use two"; it has no printable presentation.
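
Several of the interpretations listed above can be reproduced in Java from the same 32-bit word. This is a minimal sketch; the class name WordInterpretation and its method names are assumptions, and only the integer, floating-point, color, and Latin-1 readings are covered.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class WordInterpretation {
    static final int WORD = 0x9207BFF0;

    // Twos complement: Java's int is already a signed 32-bit value.
    static int asSigned(int word) {
        return word;
    }

    // Unsigned binary number: widen to long without sign extension.
    static long asUnsigned(int word) {
        return Integer.toUnsignedLong(word);
    }

    // IEEE 754 single precision: reuse the same bits as a float.
    static float asFloat(int word) {
        return Float.intBitsToFloat(word);
    }

    // The four bytes of the word, most significant first.
    static byte[] asBytes(int word) {
        return ByteBuffer.allocate(4).putInt(word).array();
    }

    // Latin-1: each byte is one character with the same code point.
    static String asLatin1(int word) {
        return new String(asBytes(word), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println("signed:   " + asSigned(WORD));
        System.out.println("unsigned: " + asUnsigned(WORD));
        System.out.println("float:    " + asFloat(WORD));
        byte[] b = asBytes(WORD);
        System.out.printf("alpha RGB: a=%02X r=%02X g=%02X b=%02X%n",
                b[0] & 0xFF, b[1] & 0xFF, b[2] & 0xFF, b[3] & 0xFF);
        // Print code points rather than glyphs; two of them are control codes.
        for (char ch : asLatin1(WORD).toCharArray()) {
            System.out.printf("U+%04X ", (int) ch);
        }
        System.out.println();
    }
}
```

None of these methods changes the bits; each one only chooses a different rule for reading them.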

Binary versus Text File

One must be careful with numbers in a file. Does the file contain digits as characters, or does it represent numbers as raw bits? Consider a binary file. The sample data file is available in three views: binary, hexadecimal, and characters.

Compare the following two Java programs which interpret the same file differently.

The first program prints:

  859,190,816
1,684,109,683
  543,780,384
1,629,514,853
1,634,738,297
1,700,885,002
because it interprets each 32-bit word as an integer.
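
A program with this behavior might look like the sketch below. It is a reconstruction under assumptions, not the exhibit's original code: the class name ReadAsIntegers is hypothetical, and the 24 sample bytes are inferred from the six integers printed above, which decode byte for byte as the US-ASCII text "366 days in a leap year" followed by a newline.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ReadAsIntegers {
    // File contents inferred from the printed integers (an assumption).
    static final byte[] SAMPLE =
            "366 days in a leap year\n".getBytes(StandardCharsets.US_ASCII);

    // Read consecutive 4-byte groups as big-endian 32-bit integers.
    // Trailing bytes that do not fill a whole word are ignored.
    static List<Integer> readWords(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);   // big-endian by default
        List<Integer> words = new ArrayList<>();
        while (buffer.remaining() >= 4) {
            words.add(buffer.getInt());
        }
        return words;
    }

    public static void main(String[] args) {
        for (int word : readWords(SAMPLE)) {
            // Locale.US fixes the comma as the grouping separator.
            System.out.printf(Locale.US, "%,13d%n", word);
        }
    }
}
```

To read a real file instead of the embedded bytes, java.nio.file.Files.readAllBytes could supply the array passed to readWords.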

The second program prints:

366
because the US-ASCII characters 366 are the only block of digits in the file.
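
The text-oriented reading can be sketched as follows. Again this is an assumed reconstruction rather than the exhibit's program: the class name ReadAsText is hypothetical, and the sample bytes are the US-ASCII text "366 days in a leap year" plus a newline, inferred from the integers printed by the first program.

```java
import java.nio.charset.StandardCharsets;
import java.util.OptionalInt;
import java.util.Scanner;

class ReadAsText {
    // File contents inferred from the first program's output (an assumption).
    static final byte[] SAMPLE =
            "366 days in a leap year\n".getBytes(StandardCharsets.US_ASCII);

    // Decode the bytes as US-ASCII characters and return the first
    // whitespace-delimited token that parses as a decimal integer.
    static OptionalInt firstInteger(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.US_ASCII);
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    return OptionalInt.of(scanner.nextInt());
                }
                scanner.next();   // skip a token that is not a number
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        OptionalInt value = firstInteger(SAMPLE);
        if (value.isPresent()) {
            System.out.println(value.getAsInt());
        } else {
            System.out.println("no integer found");
        }
    }
}
```

Here the bytes 33 36 36 (hex) are read as the characters '3', '6', '6' and then converted to the number 366, whereas the word-oriented reading folds them into the first 32-bit integer together with the following space.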

 

 
