QUATRO - An Experimental Programming Language

by Gary J. Shannon

Created Jan. 28, 2017

Last Updated: Jan. 30, 2017

The Origin of QUATRO

QUATRO is a programming language that started out to be an implementation of standard FORTH, but the more I used it, the more deviant it became. I deviated from the FORTH standards, and even from some of the FORTH philosophies. As a result, it was no longer appropriate to call this FORTH, or even "a dialect of FORTH". Instead, I gave it a new name so that FORTH purists will not be offended by my improper use of their language name.

QUATRO is, for now, integer only, using 16-bit integers. Long integers and floating point values might be added in the future, but at the moment, I'm treating it as somewhat on a par with the old Apple II integer BASIC.

Internally, QUATRO works just like FORTH with P-stack, R-stack, and an incrementally compiled dictionary. It's only the interface between the programmer and the internals that has changed. And though it looks very different, the changes are not really radical at all.

If you have been exposed to FORTH and know a little bit about it, QUATRO will be very easy to use. If you are a FORTH guru, then QUATRO might well confuse and frustrate you. It is different enough that your FORTH habits may not work here.

By way of an introduction, here is a sample program that should give you the general feel of the language. This is a rewrite of the game aceyducy.bas from the book BASIC Computer Games. Compare the listing of the game in BASIC to the same game in QUATRO, bearing in mind that the QUATRO version is actually a bit more elaborate, with a lot of error checking and input validation that is missing from the original BASIC version. The QUATRO version uses a small file of card-game helper functions called cards.f, which is included by the main file.

This zip file has the executable file, the core library and debug files in source form, along with the complete AceyDeucy game. No install is necessary. Just unzip and run. After AceyDeucy.f loads, just type "run" to play the game. Or type "dl" to get a dictionary listing.

The Differences

The first, and most noticeable, change is that some words are "self-delimiting", in that they do not have to be separated from other words by whitespace. In FORTH the double quote must be surrounded by spaces, which leads to unnatural constructions like " Hello World". Because the double quote is self-delimiting in QUATRO, these can be replaced by the more natural "Hello World". I've also changed the way alphanumeric characters interact with special characters.

Formally, the parsing rules are:

  1. The numeric characters are 0..9
  2. The alphabetic characters are a..z, A..Z, and the underscore.
  3. The alphanumeric characters are the combination of the two above sets.
  4. Identifiers, whether variable, array, or function names, either start with an alphabetic character and continue with alphanumeric characters, or consist entirely of special characters.
  5. Self-delimiting characters belong to the set ( ) [ ] { } ' " : ; , . and are always parsed as separate from whatever precedes or follows them.
  6. All other printable non-alphanumeric, non-self-delimiting characters are referred to simply as "special characters".
  7. A boundary between words is recognized when any transition is made from alphanumeric to special characters, or vice versa.

Examples:

     Valid words include:
     
        Print item2 last_character ++ <=
   
     Invalid words include:

        value+        parsed as value +
        first&last    parsed as first & last
        this-is-bad   parsed as this - is - bad

     Valid statements include:

        table[index[i]]    parsed as table [ index [ i ] ]
        :test"hello";      parsed as : test " hello" ;
           (The final double quote is not parsed at all, but is consumed by the " word as
           it builds the string.) 
        abc++              parsed as abc ++
        >=                 parsed as >= since both characters are special
        +test              parsed as + test since + is special and t is alphanumeric.
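The parsing rules and examples above can be modelled in a few lines. The following Python sketch is not QUATRO's actual parser; the names char_class and tokenize are made up for illustration, and the way the " word consumes the body of a string is deliberately left out.

```python
# Hypothetical model of the QUATRO word-splitting rules (not the real parser).
SELF_DELIMITING = set("()[]{}'\":;,.")


def char_class(ch):
    """Classify one character as 'space', 'self', 'alnum' or 'special'."""
    if ch.isspace():
        return "space"
    if ch in SELF_DELIMITING:
        return "self"
    if ch == "_" or (ch.isascii() and ch.isalnum()):
        return "alnum"
    return "special"


def tokenize(text):
    """Split text into words using rules 1 through 7."""
    tokens = []
    current = ""
    current_class = None
    for ch in text:
        cls = char_class(ch)
        # A run continues only while the class stays alnum or stays special.
        if cls == current_class and cls in ("alnum", "special"):
            current += ch
            continue
        # Any other transition is a word boundary.
        if current:
            tokens.append(current)
        current, current_class = "", None
        if cls == "self":
            tokens.append(ch)          # always a word of its own
        elif cls != "space":
            current, current_class = ch, cls
    if current:
        tokens.append(current)
    return tokens


for sample in ("table[index[i]]", "abc++", ">=", "+test", "this-is-bad"):
    print(sample, "->", " ".join(tokenize(sample)))
```

Keeping the character classification separate from the loop makes it easy to see that only two things ever end a word: a self-delimiting character, or a change of class.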
        
         

Using these rules many traditional FORTH words like C, [CHAR] (DO) (FIND) +LOOP R> /MOD, to mention a few, are no longer valid. Words like 1+, 1- are renamed for what they do, e.g. Incr, Decr.

Identifiers are not case sensitive so that DUP is the same as dup or Dup.

Identifiers can never begin with a numeric digit, therefore, the dictionary is never searched when the current word begins with a number. Numbers are always converted to the base in the system variable base. If base contains 10 the numbers are read and printed as decimal. If base is 16 then input and output is in hexadecimal. The only exception is when the first two characters of a number are 0x in which case the rest of the number is interpreted as a hexadecimal input.
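As a rough illustration of the number rule, here is a Python sketch. The function name parse_number is hypothetical, the 16-bit range is not modelled, and Python's int() is a little more lenient than QUATRO is likely to be (it accepts a leading sign and embedded underscores, for example).

```python
def parse_number(word, base=10):
    """Return the value of word read in the given base, or None if it is not a number.

    A leading 0x always means hexadecimal, whatever the current base is.
    """
    text = word.lower()
    try:
        if text.startswith("0x"):
            return int(text[2:], 16)
        return int(text, base)
    except ValueError:
        return None


print(parse_number("255"))        # read in base 10
print(parse_number("ff", 16))     # read in base 16
print(parse_number("0x10"))       # the 0x prefix overrides base 10
```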

Some Philosophical Differences

FORTH was developed on slow, tiny computers with tiny memories and tiny floppy disks, so the emphasis was on making everything short, compact, and fast. One consequence is that FORTH definitions often look like an unreadable stream of comic book cursing. Those constraints have become archaic, and need no longer apply. QUATRO is somewhat generous with identifier length (up to 63 characters) and with whitespace and comments. Short and "clever" FORTH words are avoided for all but the most primitive of native functions.

Disk storage using "screens" made sense on tiny floppy disks, but files make more sense for today's computers. Everything to do with screens is discarded from QUATRO. Programs are found in files, and can be loaded from files, which may themselves load other files to any depth of "including". Any programming text editor can be used to prepare the programs. The idea of using an editor written in FORTH and embedded in the FORTH system is rather pointless when good programming editors are readily available.

Some Naming Conventions

Case is ignored, so identifiers can use any case convention. For consistency, I choose to capitalize the first character of each word in a function name (e.g. PrintString, GetKey) and capitalize all but the first word in a variable name (e.g. prevValue, bufferIndex). Functions which are used internally during compilation are named with an underscore as the first character, and written in all caps (e.g. _LOOP, _LIT).

Functions that perform a test and return a true/false value generally begin with "Is...". The exceptions are the usual comparisons like = >= <>, and so on. Forth words like 0< and 0=, however, cannot mix special characters with alphanumerics, and so are renamed IsNeg and IsZero, for example.

Global and Local Variables

Global variables are declared outside the scope of any colon definition. The word Variable declares a global variable, and is used with an initial value:

   
         123 Variable myVar
      

defines the variable "myVar" with an initial value of 123.

Local variables are declared inside the scope of a colon definition, and must come before any other statements in the definition. The word Local declares a local variable. It is not possible to provide an initial value to a local because that variable does not exist until the function is executed, and ceases to exist when the function is left. Even when the function is executing, the local variable does not exist as anything other than an ephemeral offset relative to a fleeting stack frame. The name given the local variable never makes it to the dictionary, and vanishes like a puff of smoke once compilation of that function is done. The syntax is:

         Local xyz
      

which defines the local xyz with an unpredictable initial value. It should be initialized as needed. One way to use local variables is to copy the parameters that were passed to the function:

         : FunctionXY
            local y store  // fetch the second parameter
            local x store  // fetch the first parameter
            ...
         ;
      

Or the equivalent:

         : FunctionXY
            local y to y  // fetch the second parameter
            local x to x  // fetch the first parameter
            ...
         ;
      

And called like this, for example:

         ... x y FunctionXY ...
      

Both locals and globals are used the same way with a "to switch" that governs their behavior.

         123 to myVar  // stores the value 123 in the variable, whether global or local.
         myVar         // without the "to" the value of the variable is fetched and pushed to the stack.
         locate myVar  // with the state word "locate", the address of the variable will be pushed to the stack.
         toff          // turns off the state switch and returns it to the default "fetch value" state.
                       // In practice, this should never be necessary since the switch is managed internally.
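To make the three states of the switch concrete, here is a small Python model. It is only a sketch: the class ToSwitchModel and its method names are invented, memory is a plain dictionary, and the assumption that the switch falls back to "fetch" after every variable reference is my reading of "managed internally", not something taken from the QUATRO source.

```python
FETCH, STORE, LOCATE = "fetch", "store", "locate"


class ToSwitchModel:
    """Hypothetical model of a variable reference that honors the to switch."""

    def __init__(self):
        self.memory = {}      # address -> value
        self.stack = []       # parameter stack, top at the end
        self.switch = FETCH   # default state

    def to(self):
        self.switch = STORE

    def locate(self):
        self.switch = LOCATE

    def toff(self):
        self.switch = FETCH

    def variable(self, address):
        """Reference the variable at address according to the switch."""
        mode, self.switch = self.switch, FETCH   # assumed: reset after each use
        if mode == STORE:
            self.memory[address] = self.stack.pop()
        elif mode == LOCATE:
            self.stack.append(address)
        else:
            self.stack.append(self.memory.get(address, 0))


MY_VAR = 0x0100                 # made-up address for myVar
m = ToSwitchModel()
m.stack.append(123)
m.to(); m.variable(MY_VAR)      # 123 to myVar
m.variable(MY_VAR)              # myVar         (pushes the value)
m.locate(); m.variable(MY_VAR)  # locate myVar  (pushes the address)
print(m.stack)
```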
      

To support local variables in colon definitions a little overhead is added to the entering and exiting of words that contain them. This amounts to two words _MARK and _FREE, which are implemented in a total of 5 C++ statements. If a particular colon definition has no locals then that definition will not have any overhead. For this reason, local variables should only be used in "meatier" definitions that actually require them. In other words, a definition like:

         : swap
            Local tos
            Local next
            to tos to next
            tos next
         ;
      

makes no sense whatsoever. As a rule, a value should be kept on the stack whenever it is reasonable to do so. If, in writing a definition, you find yourself confused by your own stack juggling, then it's probably time to use a local variable. If you, knowing what you have in mind, are confused by your own stack contents, then anyone reading the code, and not knowing in advance what you had in mind, will be even more confused.
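The idea of a local as "an offset relative to a stack frame" can be sketched in Python. Everything here is an assumption made for illustration: the class LocalFrames, the separate list used to hold the cells, and the zero used as a placeholder for the unpredictable initial value. It is not how _MARK and _FREE are written in the QUATRO source, only a model of the bookkeeping they would need to do.

```python
class LocalFrames:
    """Hypothetical model of stack frames for local variables."""

    def __init__(self):
        self.cells = []          # storage for all live locals
        self.frame_start = 0     # where the current frame begins
        self.saved_starts = []   # frame starts of the callers

    def mark(self, local_count):
        """Open a frame with room for local_count locals (the _MARK role)."""
        self.saved_starts.append(self.frame_start)
        self.frame_start = len(self.cells)
        self.cells.extend([0] * local_count)   # placeholder for "unpredictable"

    def free(self):
        """Discard the current frame and return to the caller's (the _FREE role)."""
        del self.cells[self.frame_start:]
        self.frame_start = self.saved_starts.pop()

    def store(self, offset, value):
        self.cells[self.frame_start + offset] = value

    def fetch(self, offset):
        return self.cells[self.frame_start + offset]


frames = LocalFrames()
frames.mark(2)            # outer word with two locals
frames.store(0, 10)
frames.store(1, 20)
frames.mark(1)            # inner word with one local
frames.store(0, 99)
print(frames.fetch(0))    # the inner word sees its own local 0
frames.free()
print(frames.fetch(0))    # the outer word sees its own local 0 again
frames.free()
```

A word with no locals would simply never call mark or free, which is the "no overhead" case described above.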

Arrays and Subscripting

The syntax of subscripting is natural and easy:

         20 Array table
         5 to table[k]
      

Where [ and ] are separate, self-delimiting words.

Nesting subscripts and computed subscripts are also no problem:

         table[ indices[ j ]] to entry
         entry to table[ indices[ j ]]
         entry to table[ x y 10 * + ]
      

Any word defined as an array will, when mentioned, leave its address on the stack. Also interesting to note is that, since the word [ expects to find a memory address, it follows that any memory address can be subscripted. For example, suppose you want to use some absolute memory location as a temporary array. Without even declaring that memory address to be an array you can write something like:

         : sample 10 0 do 0x8000[i] to 0xff00[i] loop ;
      

which looks odd, but behaves as you might expect by fetching 10 consecutive words from memory location 0x8000 to 0x8009 and storing them at memory locations 0xff00 to 0xff09.
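In Python terms, the effect of sample is roughly the following. This is a sketch, assuming memory can be modelled as a dictionary with one address per 16-bit word, as the address ranges above suggest; the seed values are made up.

```python
def sample(memory):
    """Model of:  : sample 10 0 do 0x8000[i] to 0xff00[i] loop ;"""
    for i in range(0, 10):                        # lim = 10, start = 0
        memory[0xFF00 + i] = memory[0x8000 + i]   # fetch source, store to destination


# Made-up contents for the ten source words.
memory = {0x8000 + k: 100 + k for k in range(10)}
sample(memory)
print([memory[0xFF00 + k] for k in range(10)])
```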

This syntax is supported by an internal mini-stack that saves the state of the TO switch as each level of subscripting depth is entered, and restores it as each level is left. It remains for the [ word to save the TO switch state, and for the ] word to restore the state, add the subscript value to the array address and to invoke the usual variable processing word _VAR which honors the TO state in deciding whether to fetch, save, or point to the variable's value.
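The save-and-restore behavior of [ and ] is easier to follow with a model. The Python sketch below is hypothetical: the class SubscriptModel, its method names, and the made-up addresses are mine, and it assumes that [ sets the switch back to "fetch" while the subscript is evaluated and that _VAR resets the switch after use.

```python
FETCH, STORE, LOCATE = "fetch", "store", "locate"


class SubscriptModel:
    """Hypothetical model of [ and ] with a mini-stack of saved TO states."""

    def __init__(self):
        self.memory = {}
        self.stack = []
        self.to_state = FETCH
        self.saved_states = []    # the mini-stack

    def to(self):
        self.to_state = STORE

    def array(self, address):
        self.stack.append(address)        # an array name just leaves its address

    def var(self, address):
        """The _VAR role: fetch, store, or point, depending on the TO state."""
        state, self.to_state = self.to_state, FETCH
        if state == STORE:
            self.memory[address] = self.stack.pop()
        elif state == LOCATE:
            self.stack.append(address)
        else:
            self.stack.append(self.memory.get(address, 0))

    def open_bracket(self):
        self.saved_states.append(self.to_state)
        self.to_state = FETCH             # assumed: subscripts are plain fetches

    def close_bracket(self):
        self.to_state = self.saved_states.pop()
        index = self.stack.pop()
        base = self.stack.pop()
        self.var(base + index)


TABLE, INDICES, J, ENTRY = 100, 200, 300, 301   # made-up addresses
m = SubscriptModel()
m.memory.update({J: 2, INDICES + 2: 5, ENTRY: 77})

# entry to table[ indices[ j ]]
m.var(ENTRY)
m.to()
m.array(TABLE); m.open_bracket()
m.array(INDICES); m.open_bracket()
m.var(J)
m.close_bracket()      # fetches indices[2]
m.close_bracket()      # stores into table[5]
print(m.memory[TABLE + 5], m.stack)
```

Without the mini-stack, the pending "to" would be consumed by the first variable inside the brackets (j in this case) instead of by the table entry.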

One hazard that needs to be avoided is seen in the statements:

         10 Array table
         123 to table
      

This will leave 123 and the address of table on the stack, but will not store anything anywhere in memory. Once declared as an array, a name must be accessed with a subscript. Otherwise it is not actually accessed at all. That said, if you remember that the mention of an array name leaves its address on the stack, you will see that a statement like this will work:

         10 Array table
         table 3 + Store   // identical in effect with:   to table[ 3 ]
      

Strings

A string is nothing more than an array of bytes. Arrays declared with the Array word described above have a 16-bit word as each array cell. An array declared with the word String has cells which are a single byte wide. Other than that, subscripting works the same, bearing in mind that only a single byte will be fetched from or stored to the array cell.

A string can be declared with an initial value, or with a size. The initial value follows the name of the string:

         String myGreeting "Hello World?"
         : SayHello 
            char ! to myGreeting[12] 
            myGreeting PrtStr cr 
         ;
         Prints: Hello World!
         String anotherString 100 // declares an empty string of size 100
      

If you are familiar with C/C++ you might notice what looks like an error in the above code. The question mark in the string is replaced by an exclamation point before printing it. In C/C++ that subscript should be 11, not 12. But in QUATRO, as in FORTH, strings are not zero terminated, but are counted. The first byte of any string is always the length of the string. The visible portion of the string, therefore, is always subscripted starting at 1, rather than at 0.

If a string is declared with a size then that is the maximum number of characters that it can hold without overflowing its allocated space and causing damage to whatever follows it in memory. No run-time checking is performed, so be careful.

Counted strings use 1 byte to hold the size. For this reason, a string cannot be longer than 255 characters in length and still be printable. You may, however, declare a string with a size greater than 255 using the String (name) (size) syntax. The string (or, more accurately, byte array) will not be able to be printed correctly, but you will still be able to access the entire array using subscripts.

If you intend to use a character array as a string with the various string functions, then the entry at [0] must always be the length of the rest of the string. If you are using the character array as just an array of characters, and do not intend to use it with string functions, then you may use the [0] entry to hold data. See the card shuffling example, where the graphic characters for the four suits are stored in the array suitNames beginning at the zero entry.
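The counted-string layout can be tried out in Python with a bytearray. This is only a model of the layout described above; the helper names make_counted and counted_to_text are invented for the illustration.

```python
def make_counted(text):
    """Build a counted string: byte 0 is the length, the characters follow."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a one-byte count cannot describe more than 255 characters")
    return bytearray([len(data)]) + data


def counted_to_text(buffer):
    """Read back the visible part of a counted string."""
    length = buffer[0]
    return buffer[1:1 + length].decode("ascii")


greeting = make_counted("Hello World?")
greeting[12] = ord("!")            # char ! to myGreeting[12]
print(greeting[0])                 # the count byte
print(counted_to_text(greeting))
```

Because the count occupies entry 0, the last of the 12 visible characters sits at subscript 12, which is why the SayHello example uses 12 rather than 11.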

Running QUATRO

The Dictionary

The dictionary works the same as in FORTH. Words can be added manually from the keyboard, or loaded from a disk file. Once added, a word can be tested interactively from the keyboard, or compiled into another word as a function or subroutine.

Here is a brief look at all the core words in QUATRO so far. A more detailed explanation will be linked to each word in the near future.

Console Command Words: (These may also be included in a colon definition)

   Bye      Quits QUATRO and closes the console window.
   Cold     Resets everything to cold start condition.
   DL       Dictionary list. Lists all entries in the dictionary.
   Forget   Removes the named word from the dictionary, and also removes everything defined after that word was defined.
   Load     Loads and compiles/interprets the contents of the named file. A file may also load another nested file to any reasonable depth (limited by system memory). This can be used to "include" sub-words in the main word being loaded.

Arithmetic and Binary Logic Words

   +        n1 n2 -- n1+n2      5 9 + (14)
   -        n1 n2 -- n1-n2      12 7 - (5)
   *        n1 n2 -- n1*n2      8 5 * (40)
   /        n1 n2 -- n1/n2      25 6 / (4)
   DivMod   n1 n2 -- rem quo    25 6 DivMod (4 1)
   Mod      n1 n2 -- mod        25 6 Mod (1)
   Neg      n -- -n             12 Neg (-12)
   Abs      n -- abs(n)         -96 Abs (96)
   Sign     n -- s (returns -1, 0, or +1 depending on the sign of n)    -36 Sign (-1)    0 Sign (0)    14 Sign (1)
   ++       Increments the value of the named variable, in place, and leaves its new value on the stack. It ignores the To Switch. For example, the statement 123 to table[ index++ ] will increment index before performing the subscripting.
   --       Decrements the value of the named variable, in place, and leaves its new value on the stack. It ignores the To Switch. For example, the statement 123 to table[ index-- ] will decrement index before performing the subscripting.
   And      n1 n2 -- (logical bitwise AND of n1 and n2)             0xC0 0xFF And (0xC0)
   Or       n1 n2 -- (logical bitwise OR of n1 and n2)              0xC0 0x01 Or (0xC1)
   Xor      n1 n2 -- (logical bitwise EXCLUSIVE OR of n1 and n2)    0xF0 0xFF Xor (0x0F)
   Not      n -- (bitwise complement of n)                          0xF1 Not (0x0E)
   Incr     n -- n+1            27 Incr (28)
   Incr2    n -- n+2            19 Incr2 (21)
   >>       n c -- n (shifted right by c bits)    0x86 4 >> (0x08)
   <<       n c -- n (shifted left by c bits)     0x79 4 << (0x90)
   Rand     max -- rand (returns random int < max)    100 Rand (???)

Stack Manipulation Words

   CopyR    -- r (pushes a copy of the top of the R-stack)
   Drop     n --                    1 2 3 Drop (1 2)
   Dup      n -- n n                1 2 3 Dup (1 2 3 3)
   Over     n1 n2 -- n1 n2 n1       1 2 3 Over (1 2 3 2)
   PopR     -- r (pops the value off the R-stack and pushes it)
   PushR    r -- (pushes the top stack value to the R-stack)
   Rot      n1 n2 n3 -- n2 n3 n1 (shuffles the third stack entry to the top)    1 2 3 Rot (2 3 1)
   Swap     n1 n2 -- n2 n1 (swaps the top two stack entries)                    1 2 3 Swap (1 3 2)
   UnRot    n1 n2 n3 -- n3 n1 n2 (the opposite of Rot)                          1 2 3 UnRot (3 1 2)
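The stack words that have examples in the list above can be checked against a Python list used as the parameter stack, with the top of the stack at the end of the list. This is a model for experimenting with the stack effects, not QUATRO code; the lowercase function names and the run helper are made up.

```python
def drop(s):
    s.pop()

def dup(s):
    s.append(s[-1])

def over(s):
    s.append(s[-2])

def swap(s):
    s[-1], s[-2] = s[-2], s[-1]

def rot(s):
    s.append(s.pop(-3))        # third entry moves to the top

def unrot(s):
    s.insert(-2, s.pop())      # top entry moves under the next two


def run(word, *items):
    """Apply one stack word to a fresh stack and return the result."""
    stack = list(items)
    word(stack)
    return stack


for word in (drop, dup, over, swap, rot, unrot):
    print("1 2 3", word.__name__, "->", run(word, 1, 2, 3))
```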

System Variables

   base     The number base used to do input and output conversion. Defaults to decimal.
   dtop     Points to the newest dictionary entry, which links to the rest of the dictionary.
   here     Points to the first available byte after the end of the dictionary. This is where new dictionary entries are built. Then "here" is adjusted accordingly.
   prompt   Holds the character used as a keyboard prompt. Defaults to ">" but can be set by the user to any single character.
   state    Is true when the interpreter is in compilation mode. Words typed in or loaded from a file are then compiled into the dictionary rather than being executed directly.
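Since the dictionary grows from oldest to newest entry, the behavior of Forget can be pictured with a simple list. The Python sketch below is an assumption-laden model: the class DictionaryModel is invented, entries are reduced to bare names, and it assumes that the newest matching definition is the one forgotten. The real dictionary is a linked structure reached through dtop, with here marking its end.

```python
class DictionaryModel:
    """Hypothetical model of the dictionary as a list, oldest entry first."""

    def __init__(self):
        self.entries = []

    def define(self, name):
        self.entries.append(name.lower())    # identifiers are not case sensitive

    def newest(self):
        """The entry that dtop would point to, or None when empty."""
        return self.entries[-1] if self.entries else None

    def forget(self, name):
        """Remove name and everything defined after it. Returns False if not found."""
        key = name.lower()
        for index in range(len(self.entries) - 1, -1, -1):   # newest first (assumed)
            if self.entries[index] == key:
                del self.entries[index:]
                return True
        return False


words = DictionaryModel()
for name in ("Shuffle", "Deal", "Run"):
    words.define(name)
words.forget("DEAL")
print(words.entries)
```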

Comparison Words (TRUE is -1, FALSE is 0)

   <        n1 n2 -- t/f (True if n1 < n2)     12 31 < (T)    31 12 < (F)
   <=       n1 n2 -- t/f (True if n1 <= n2)    6 12 <= (T)    12 12 <= (T)    14 12 <= (F)
   <>       n1 n2 -- t/f (True if n1 is not equal to n2)    73 73 <> (F)    73 82 <> (T)
   =        n1 n2 -- t/f (True if n1 = n2)     2532 1891 = (F)    451 451 = (T)
   >        n1 n2 -- t/f (True if n1 > n2)     99 4 > (T)    88 88 > (F)
   >=       n1 n2 -- t/f (True if n1 >= n2)    99 4 >= (T)    88 88 >= (T)    16 184 >= (F)
   StrEqu   s1 s2 -- t/f (True if string pointed to by s1 equals string pointed to by s2)    "Hello" "Goodbye" StrEqu (F)    "Boo" "Boo" StrEqu (T)
   StrCmp   s1 s2 -- -1/0/+1 (returns -1, 0, or +1 depending on how s1 compares with s2)    "abc" "def" StrCmp (1)    "def" "abc" StrCmp (-1)    "xyz" "xyz" StrCmp (0)
   IsNeg    n -- t/f (True if n is negative)   -46 IsNeg (T)    92 IsNeg (F)
   IsZero   n -- t/f (True if n is zero)       367 IsZero (F)
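A TRUE value of -1 has every bit set in 16-bit two's complement, which is presumably why the bitwise words And, Or, and Not can also serve as logical operators on flags. The Python sketch below illustrates that arithmetic only; the helper names are made up, and the 16-bit wrapping is an assumption based on the integer size stated earlier.

```python
MASK = 0xFFFF   # 16-bit cell


def to_signed(value):
    """Wrap a Python int into the signed 16-bit range."""
    value &= MASK
    return value - 0x10000 if value & 0x8000 else value


def flag(condition):
    """Turn a Python boolean into a QUATRO-style flag: -1 for TRUE, 0 for FALSE."""
    return -1 if condition else 0


def bit_not(n):
    return to_signed(~n)


def bit_and(a, b):
    return to_signed(a & b)


true_flag = flag(12 < 31)
print(true_flag, hex(true_flag & MASK))   # the flag and its 16-bit pattern
print(bit_not(true_flag))                 # complement of TRUE
print(bit_and(true_flag, flag(31 < 12)))  # TRUE and FALSE
```

With TRUE as -1, the bitwise complement of a flag is always the opposite flag, which would not hold if TRUE were +1.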

Input-Output Words

   BackSp     Prints 1 backspace control character.
   GetKey     Reads the next key from the keyboard or the open loading file.
   GetStr     Displays the prompt and reads the next string until a carriage return. The string is stored at the address that was on the top of the stack when called.
   ClrScr     Clears the console window.
   Count      Returns the count and the raw string of a counted string pointed to.
   Cr         Prints a newline character.
   pp         Prints the top of the stack using the current base. Same as FORTH .
   ps         Shows the stack contents without disturbing the stack.
   px         Prints the low byte of the top of the stack as a formatted 2-digit hexadecimal number.
   pxx        Prints the top of the stack as a formatted 4-digit hexadecimal number.
   PrtStr     Prints the counted string pointed to by the top stack value.
   PutChar    Prints the lower byte of the top of the stack as an ASCII character.
   Space      Prints 1 space.
   Spaces     Prints the number of spaces given on the top of the stack.
   Type       ( s c -- ) Prints the raw string at s for the number of characters in c.
   Word       Fetches the next word from the input stream. This follows the parsing rules of QUATRO so that if the input stream is "example[1]", four calls to Word will return "example", then "[", then "1", and finally "]", each as a separate word.
   IsNumber   s -- TRUE n | FALSE   Given a pointer to a string, tries to convert it to a number. Returns either TRUE and the value, or FALSE.

Console Window Control Words

   WinSize    w h -- (sets window size width and height)
   WinTitle   s -- (sets the window caption bar title to the string pointed to by s)

Defining Words

   "             Starts a string definition. Builds the string until a closing " is found. May only be used inside a colon definition.
   :             Begins a colon definition.
   ;             Ends a colon definition.
   Char          Turns whatever character follows it into an ASCII value.
   CodeAddr      Gets the execution address of the word that follows in the input stream.
   GetCodeAddr   Gets the execution address of the dictionary entry pointed to.
   Create        Creates an empty dictionary entry. Used for colon definitions and for variables, constants, and arrays.
   CStuff        Compiles the byte on the stack into the dictionary entry being built.
   Execute       Executes the word pointed to by the top of the stack.
   Find          Searches the dictionary for a word.
   Immediate     Marks the word just defined as an immediate word (see section on compilation mechanics).
   Number        Tries to convert the current word into a number.
   Stuff         Compiles the word on the stack into the dictionary entry being built.
   //            Ignores a comment from this point to the end of the line.
   /*            Ignores a comment until a closing */ is found.

Constants, Variables, and Arrays

   Array      Declares an array of 16-bit words.
   String     Declares an array of 8-bit characters.
   Constant   Creates a name with a fixed constant value that cannot be changed. Constants are compiled as inline literal values, making them faster to execute than variables.
   Variable   Adds the name to the dictionary and sets the initial value of a variable.
   Local      Creates a local variable only valid within this colon definition. Local variable names never get into the dictionary since they only exist for that brief instant when the colon definition is being compiled.
   Pad        Returns the address of a temporary scratchpad memory block.
   [ .. ]     Words used in subscripting an array.
   to         Sets the variable to-state switch to cause the next variable to be the destination of a value.
   locate     Sets the variable to-state switch to cause the next variable to return its address.
   toff       Resets the to-state switch to its default value. (Should never really be necessary to use.)

Memory Manipulation Words

   CFetch       a -- b (fetches the byte at a)
   Fetch        a -- n (fetches the word at a)
   CStore       n a -- (stores byte n at address a)
   Store        n a -- (stores word n at address a)
   CMove        f t c -- (moves c bytes from addr f to addr t)
   ToStackPtr   Puts the top value on the stack into the stack pointer, e.g. 0 ToStackPtr clears the stack.
   StrCpy       f t -- (copies a counted string from f to t, including the string count)

Control Block Words

   Begin          Begin ... (condition) Until          Repeats the body of the block until (condition) is TRUE.
   Begin          Begin (condition) While ... Repeat   Repeats the body of the block while the condition remains TRUE.
   Do             lim start Do ... Loop                Repeats the loop with the index starting at start and continuing while the index is less than lim.
   Do             lim start Do ... n PlusLoop          Same as Do...Loop except that the value n is added to the index each time through the loop. If n is a negative number then the loop index starts high and counts downward.
   i              The value of the current loop counter. Valid only inside a Do loop. If used elsewhere it will return bogus values, or cause an R-stack underflow crash.
   j              The value of the next outer loop counter. Valid only inside the inner Do loop of a nested loop. If used elsewhere it will return bogus values, or cause an R-stack underflow crash.
   If Else Then   (condition) If ... Else ... Then     The Else clause is optional. The word Then marks the end of the If clause. Everything after Then is executed regardless of the condition.
   Leave          Exits the current looping structure immediately. No further processing takes place. Works inside of Do loops and Begin loops, usually inside an If ... Then structure. If called in a nested loop, it will only exit the deepest loop in which it is found.

Internal Words

Used by the compiler. As a general rule, don't mess with these.

   _BNZ         Branch if non zero (used in compiling control structures).
   _BRZ         Branch if zero (used in compiling control structures).
   _FREE        Restores the previous stack frame, discarding all local variables. Only compiled into the code if local variables are actually used in the definition.
   _JMP xxxx    Jumps within a word definition.
   _JSR xxxx    Calls a word in the dictionary.
   _LIT xxxx    Pushes an inline literal value. Also used in compiling Constant.
   _LOOP        Controls a Do loop.
   _LVAR xxxx   Controls fetching and storing a local variable.
   _MARK        Establishes the stack frame context for local variables. Only compiled into the code if local variables are actually used in the definition.
   _pLOOP       Controls a Do loop with a step size other than +1.
   _RET         Exits from a word.
   _VAR xxxx    Controls fetching and storing a global variable or array.



Coming up next: The source code and executable file for the language, plus more features and some sample programs to show the language in action.