Culturedataresearch

Project Sigma

© 2003

Precis

Sigma aims to demonstrate the human scale of infinity.

It seeks to make human infinity a tangible concept and to allow its exploration, ultimately presenting its diversity as a counter to the general trends towards reductionism and globalisation.

Sigma is the sum of all perceivable knowledge in a single, known dataverse.

It reflects humanity's progress in analysing, storing and measuring everything around it.

It demonstrates that technology has already reached a point where all that can be measured, can be measured.

Whereas the search for a Grand Unified Theory seeks to unite all of science into a single reductionist theory, Sigma pre-renders all possible outcomes of the known universe into a single point and thus demonstrates the finite and infinite nature of the universe.

  • number of protons in the observable universe: approximately 10^80
  • number of synapses in the brain: 5×10^11
  • the combined number of synapses on earth: 5×10^11 × 6×10^9 = 3×10^21
  • the combined number of synaptic events per annum for Earth (avg. 10 per synapse per second) ≈ 10^30
    • and for the entire history of “thought” = 10^30 × 5,000 years = 5×10^33
  • number of web pages = 3×10^8
  • number of grains of sand on earth: 10^12
  • number of CDs produced = 10^7
  • number of books published = 3×10^7
  • number of possible values of an encrypted 128-bit message = 2^128
  • storage required for all music produced = 500 TB = 5×10^14 bytes
  • storage per $1,000, assuming Moore’s law (doubling every 18 months): 0.25 TB in 2003
  • so, for $2m you can store all the CDs, today.
  • In 10 years: about $20,000
  • In 20 years: about $200
  • number of words in Sigma::everyWord: 10^100
  • number of images in Sigma::everyImage: 10^150
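
The derived figures in the list can be rechecked with a few lines of arithmetic. The sketch below recomputes them from the list's own inputs. The variable names, the 365-day year and the function storage_cost are assumptions made for illustration, and Moore's law is modelled as a strict doubling of storage per dollar every 18 months, so the cost projections depend on that assumed period.

```python
# Back-of-envelope check of the derived figures, using the list's own inputs.
SECONDS_PER_YEAR = 365 * 24 * 3600      # assumption: 365-day year

synapses_per_brain = 5e11               # from the list
population = 6e9                        # from the list
events_per_synapse_per_second = 10      # from the list
years_of_thought = 5_000                # from the list

synapses_on_earth = synapses_per_brain * population
events_per_year = (synapses_on_earth
                   * events_per_synapse_per_second
                   * SECONDS_PER_YEAR)
events_in_history = events_per_year * years_of_thought


def storage_cost(total_tb, tb_per_1000_usd, years, doubling_months=18):
    """Dollar cost of storing total_tb after `years` of falling prices.

    Assumes storage per dollar doubles every `doubling_months` months.
    """
    doublings = years * 12 / doubling_months
    tb_per_1000_then = tb_per_1000_usd * 2 ** doublings
    return total_tb / tb_per_1000_then * 1000


print(f"synapses on Earth:        {synapses_on_earth:.1e}")
print(f"synaptic events per year: {events_per_year:.1e}")
print(f"events in 5,000 years:    {events_in_history:.1e}")
for years in (0, 10, 20):
    print(f"cost of 500 TB after {years:2d} years: "
          f"${storage_cost(500, 0.25, years):,.0f}")
```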

Things not covered by Sigma.

  • Other infinities. For example, Pi cannot be stored. It may be explored within Sigma to any precision but not stored in its entirety.
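
The distinction between exploring and storing can be illustrated with a generator that yields the decimal digits of Pi one at a time. The sketch below uses Gibbons' unbounded spigot algorithm, which is a choice made here for illustration and not something the project specifies; the function name pi_digits is hypothetical.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi indefinitely (Gibbons' spigot).

    Only integer arithmetic is used, so precision is limited by time and
    memory alone; no digit exists until it is asked for.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled: emit it and rescale the state.
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            # Not enough information yet: consume one more series term.
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


# Explore pi to a chosen precision without ever holding "all" of it.
first_ten = list(islice(pi_digits(), 10))
print(first_ten)
```

The generator can be advanced as far as patience allows, but it never terminates, which is the sense in which Pi may be explored but not stored.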

Discussion

The universe is believed to be infinite in nature: as finer detail is resolved, still greater detail is discovered. However, the human species is defined by its limitations as a biological organism. It has only stereo vision and hearing, monaural verbal communication, and the senses of touch, taste and smell.

Any quantifiable data can be digitised. This includes everything from the concept of love as described in a poem, to the image of an individual's face in three dimensions, to the sound of a train, to the structure of DNA.

Using simple arithmetical models, any data pattern can be generated.

Sigma is not, however, a “random” number generator. All output is predicated on some form of deterministic input.

Sigma has two distinct implementations.

  1. The first pre-generates all possible data and stores it in a single place. Its aim is to demonstrate that human knowledge is quantifiable. The necessity of actually implementing this is debatable (and it would be costly), but the principle is solid.
  2. The second is a collection of tools to enable the exploration and navigation of this “ultimate” data space. This may be done without pre-generating any data: data is generated only when a request is made to observe it. The system exists in all possible states until an observation “collapses” the decision tree at that instant to create an observable result. (Note the parallels between this approach and quantum mechanics, e.g. Schrödinger’s cat.)
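
The second implementation can be sketched as a deterministic mapping between a position in the data space and the data found there, so that nothing needs to be stored in advance. The example below treats a text as a number written in base 40. The symbol set in ALPHABET is a stand-in assumed for this sketch, and the function names text_at and index_of are hypothetical.

```python
# Assumption: a stand-in 40-symbol alphabet (26 letters plus 14 marks).
ALPHABET = "abcdefghijklmnopqrstuvwxyz" + " -,.;?!()[]\"':"
BASE = len(ALPHABET)
assert BASE == 40 and len(set(ALPHABET)) == BASE


def text_at(index: int, length: int) -> str:
    """Generate, on request, the text at `index` among all texts of `length` symbols."""
    if not 0 <= index < BASE ** length:
        raise IndexError("index lies outside this subset of the data space")
    symbols = []
    for _ in range(length):
        index, digit = divmod(index, BASE)   # peel off the last base-40 digit
        symbols.append(ALPHABET[digit])
    return "".join(reversed(symbols))


def index_of(text: str) -> int:
    """Inverse of text_at: the position that `text` occupies in the space."""
    index = 0
    for symbol in text:
        index = index * BASE + ALPHABET.index(symbol)
    return index


phrase = "to be, or not to be"
position = index_of(phrase)
print(position)
print(text_at(position, len(phrase)))
```

The same position always yields the same text, so the output is deterministic, and a text only comes into existence when its position is observed.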

Following late 20th and early 21st century trends, it is intended that the entire dataverse and, importantly, its methods of access be patented. Thus, any subset or realisation of any data realisable with the system will be covered by Sigma's patent, and any future discovery via other means would have to cite Sigma as prior art. Since Sigma encapsulates all data, this means anything that can be digitised is patented by Sigma.

Implementation

The implementation of the project is split into phases, each covering the exploration of a specific subset of the dataverse.

Dataverse subset definitions

  1. The written word (everyWord)
  2. The audible sound (everySound)
  3. The static image (everyPicture)
  4. The moving image (everyMovie)
  5. The stereoscopic image (everySpace)

Implementation of the everyWord

everyWord is the generation of every possible combination of 100 words, each of 10 characters or fewer, drawn from a 40-character alphabet. This is described numerically as

  • 40^1000 combinations

If this is reduced to dictionary words, say 60,000 words, then every 100-word combination can be represented by

  • 60,000^100 combinations
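
The scale of these two counts is easier to grasp as a number of decimal digits, or as the number of bits needed to address a single entry. A minimal sketch using Python's arbitrary-precision integers follows; the names character_form, dictionary_form and describe are illustrative only.

```python
character_form = 40 ** 1000        # 100 words x 10 characters, 40-symbol alphabet
dictionary_form = 60_000 ** 100    # 100 words from a 60,000-word dictionary


def describe(label: str, count: int) -> None:
    """Print how many decimal digits `count` has and how many bits index it."""
    decimal_digits = len(str(count))
    index_bits = (count - 1).bit_length()   # bits needed for the largest index
    print(f"{label}: {decimal_digits} decimal digits, "
          f"{index_bits} bits to address one entry")


describe("40^1000", character_form)
describe("60,000^100", dictionary_form)
```

On these definitions 40^1000 is a number of 1,603 decimal digits and 60,000^100 one of 478 digits, so any single entry in either space can be addressed with well under a kilobyte.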

Sigma Definitions

The dataverse is the set of all information that can be stored digitally.

Within the Sigma project, this potentially infinite set is bounded by the limits of the data being investigated and by human perception of the output data.

The text/word dataverse is defined by the characters of the western alphabet, namely 26 letters and 13 punctuation marks (space, dash, comma, full stop, semi-colon, question mark, exclamation mark, brackets, quotes and colon).

The definition of the image dataverse is based on a maximum of 1600×1200 pixels with 24-bit colour depth.

The definition of the video dataverse is based on a series of such images played at a maximum of 30 frames per second.

The definition of the stereo image dataverse is based on a maximum of 1600×1200 pixels with 24-bit colour depth, one for each eye.

The definition of the audio dataverse is based on a maximum of 44.1 kHz sample rate with 16-bit resolution in stereo.
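
These definitions fix the raw size of each subset. A minimal sketch of the arithmetic follows, assuming uncompressed samples at the stated maxima; the constant names and the helper decimal_exponent are illustrative and do not come from the project.

```python
import math

# Raw sizes implied by the stated maxima (assumption: no compression).
IMAGE_BITS = 1600 * 1200 * 24              # one 24-bit image
STEREO_IMAGE_BITS = 2 * IMAGE_BITS         # one image per eye
VIDEO_BITS_PER_SECOND = 30 * IMAGE_BITS    # 30 frames per second
AUDIO_BITS_PER_SECOND = 44_100 * 16 * 2    # 44.1 kHz, 16-bit, two channels


def decimal_exponent(bits: int) -> int:
    """Return e such that 2**bits lies between 10**e and 10**(e + 1)."""
    return math.floor(bits * math.log10(2))


print(f"bits per image:          {IMAGE_BITS:,}")
print(f"bits per stereo image:   {STEREO_IMAGE_BITS:,}")
print(f"video bits per second:   {VIDEO_BITS_PER_SECOND:,}")
print(f"audio bits per second:   {AUDIO_BITS_PER_SECOND:,}")
print(f"distinct images: about 10^{decimal_exponent(IMAGE_BITS):,}")
```

Every distinct bit pattern is a distinct item, so a subset whose items occupy b bits contains 2^b of them; the helper converts that count to a power of ten for comparison with the other figures in this document.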