Bag programming language

This page about the revised Bag programming language is under construction (as of June 28, 2007).

Bag is derived from John Conway's Fractran game/language, adding syntax and side effects to make it a "practical" programming language (although still an esoteric one), while being backwards compatible with it, and perhaps allowing a semi-efficient implementation.

The power, indeed Turing completeness of Fractran comes from viewing a natural number as the product of a bag (multiset) of prime factors, where the order of primes is immaterial but the number of each is important. The name of Bag is derived from this.

Basic concept

A program in Fractran consists of a comma-separated list of positive fractions. To run the program, start with some natural number n and perform the following:
  1. Select the first fraction from the list.
  2. If n times the current fraction is an integer, replace n by this product and go to step 1.
  3. Otherwise, if there are more fractions on the list, select the next one and go to step 2.
  4. Otherwise, the program ends with the final n as the result.
The above process is actually Turing-complete, by using an encoding of data as the powers of primes, you can find a list of fractions that calculates any computable result as another prime power.

For example, if you let n=2x3y, then the below program will end with n=5xy, effectively calculating the product of x and y.

385/13, 13/21, 1/7, 3/11, 7/2, 1/3
To see this, let us factorize the list into primes:
7 11 5/13, 13/7 3, 1/7, 3/11, 7/2, 1/3
(In Bag syntax multiplication is usually expressed simply as concatenation, even with numerals.)

If we start with say 36 = 22 32, then the first fraction to apply is 7/2, giving 2 32 7. Then the two first fractions "cooperate" repeatedly to turn this into 2 52 7 112. Then the 1/7 removes the 7, giving 2 52 112, and the 3/11 repeatedly applies to give 2 32 52.

To sum up so far, a factor 2 was removed while simultaneously adding as many factors 5 as there were factors 3 originally. If we write a number as 2x 3y 5z, this corresponds to decrementing x and adding y to z. Note that z+xy is preserved by this.

Now the process repeats with another factor 2, until x becomes 0 and z has become the original xy. Finally the fraction 1/3 applies repeatedly to set y=0, giving just 5xy as the final number.

Note that in this argument, the prime factorization of fractions and numbers is important, but which primes are used does not matter. The algorithm would work just as well if we replaced every 2 by 31 and every 7 by 43, say.

Bag allows you to write numbers like in Fractran if you want to (both versions of the list above are legal Bag programs), but it also allows you to use a symbolic representation, where you can give names to the primes used, possibly with intuitive meanings. For example the above list can be written symbolically as

tx ty z/tz,
tz/tx y,
/tx,
y/ty,
tx/x,
/y

Syntax

The syntax of Bag has been revised in order to allow freely mixing symbolic and numeric representations.

Comments begin with the character # and last until the following newline, or begin with the characters <# and last until the matching #>. The latter form may be nested. Whitespace consists of any number of space, newline or tab characters. Comments and whitespace are ignored outside strings, except that they separate other tokens.

Numerals may be given in base 8 (with prefix 0o or 0O), 10 (no prefix) or 16 (with prefix 0x or 0X). They represent natural numbers.

Strings begin and end with the character ". Inside a string, the character \ starts an escape sequence. The following escape sequences are defined:
\\a literal \
\"a literal "
\nnewline character
\ttab character
\gap\ignore a gap of whitespace and comments
\numeralcharacter with code numeral
A string represents a natural number in base "\0\1", with the first character giving the least significant digit. As a result, trailing "\0" characters in a string have no effect. The number "\0\1" is implementation-dependent, but is larger than any single character.

Prime names begin with a letter or underscore and may contain letters, digits, and underscores. Prime names are case sensitive, and each name represents an arbitrary prime which is not a factor of any other number in the program.

Operators are the usual arithmetical +-*/, parentheses () and exponentiation ^. The * of multiplication is usually omitted like in mathematical notation, while / has a slightly lower precedence, optional first argument, and is non-associative (may not be nested without using parentheses). See the table below.

Commas are used to separate fractions at the top level of the program. Angle brackets are used around a module's export/import list.

Syntax elements of Bag expressions in decreasing order of precedence
Notation Precedence/
Associativity
Meaning Notes
numeralnot applicablenatural number
stringonly in exponents
prime namearbitrary primenot in exponents
()bracketingparentheses
^right assocexponentiationall sub-expressions of second argument must be integers
* or noneassocmultiplicationdoes not reduce result to lowest terms outside exponents
/nonassoc/
prefix
division
-prefixunary minusonly in exponents
+left assocaddition
-subtraction

See also