- Some symbols, like "." and "?", play special roles that others, such as "a" and "B", do not.
- Other symbols, like a space " ", can be repeated without changing the meaning.
- Certain symbols, such as "ø", are not allowed in English, and certain sequences of symbols, such as "szzx qqbb", are not allowed either.
- But English has an escape clause to deal with this: anything inside double quotes does not have to follow the rules of English and should be taken as is.
- Valid English with non-English symbols and words:
In Norwegian "øst" means east, while in Martian "szzx" means west.
- In that sentence the non-English components are put inside quotes. Of course, valid English can go inside quotes as well.
- A computer programming language, including Ox, is
- a set of symbols
- a set of rules that say what sequences of symbols are allowed.
- A program is a sequence of symbols that may or may not follow the rules of the language, just as a document may or may not follow the rules of English.
- As a computer language, the rules in Ox (its syntax) are very different from the syntax in a human language such as English.
- A key difference between a computer language like Ox and a human language like English is the scope they are designed to handle. A computer language is not designed to allow for the variety of documents in a human language, such as novels, poems, tweets, etc.
- Instead, Ox is a kind of computer language that is built to express a very narrow subset of proper English documents.
A valid Ox program is a to-do list.
- In English a to-do list is written for a human (perhaps not the same person as the author) to understand and then carry out the requests on the list.
- We assume this person wants to carry out the requests but can only do so if the requests make sense and are feasible.
- Let's call the person reading the to-do list the executor.
- The executor of an English to-do list will probably ...
- first skim the list all the way through, looking for problems without worrying about the details.
- Then, if that goes well, start carrying out the instructions one-by-one.
- The list may not make sense if it is not proper English or does not provide enough information to complete the request.
- The to-do list may have to include explanation or extra information beyond the requests alone. And the tasks on the list might conflict with each other in a way that is only apparent as the executor is carrying them out.
- If such a problem arises the executor might have to stop and give up.
- A computer program is a to-do list (at least in the class of languages Ox belongs to), but it also includes auxiliary information needed to interpret the requests correctly.
- So for a computer program there must be an executor who will try to carry out the tasks the program requests.
- Later the notion of the executor will become much more explicit.
- For now, think of the Ox executor as an aspect of the computer equivalent to the reader of an English to-do list.
- For brevity, the thing that acts like the executor of Ox will be referred to as ExOx: the name we are using for what executes an Ox program.
- What ExOx is exactly will become apparent as we go along.
- Just as a to-do list in English could be translated into an equivalent to-do list in Spanish (or Swahili or Navajo etc.), so nearly all computer languages are potentially equivalent.
- You could get a given computer to do the same tasks whether the program is written in Ox or C or FORTRAN or Perl or LISP or R or MATLAB etc.
- However, how efficiently a particular to-do list can be done, and how simple the to-do list is to write, can differ across computer languages. We discuss these differences in detail in a later chapter.
- The Simplest Ox Program
main(){}
- That is a complete Ox program that tells ExOx to do absolutely nothing.
- Many other sequences of symbols would be valid Ox programs that do nothing, but some symbols could be removed from those programs and nothing would continue to happen.
- However, if any of the symbols in the program above were removed or modified, the program would fail to be understood by ExOx.
- Note: The Simplest Ox Program actually does print out a standard message, so a purist will take exception to Ox since a completely general language should be able to produce no output at all. The dispute is not with Ox as a computer language itself but with the ExOx interpreter of that language.
- A complete Ox program must define a procedure called main.
- A procedure is defined by its name, followed by information inside parentheses, ( ), and then certain information inside curly brackets, { }. Nothing is required to be inside those delimiters, so the simplest program is the one above.
- Is the following program different from the one above?
/* Do nothing */ main( ) { } // done
Yes and no.
- First, the symbols are obviously different, but as programs they both do exactly nothing. The second program includes two Ox comments.
- A comment is text in a program that is there to explain to a human (or possibly a different computer program) what the code is doing.
- By design, the executor of the program ignores comments completely. Comments in Ox are any symbols that come between /* and */ or come between // and the end of a line in the program.
- These ways to mark comments come from C, so they are similar to those in many other languages based on C, but they may make no sense to someone just starting. Comments are an "escape clause" in a computer program. They say to the interpreter: ignore this material, because the human put it here to communicate with other humans (or possibly other computer programs that look at this program).
- Further, any sequence of white space in Ox is equivalent: " " is the same as "    ". And, with one exception, blank lines are the same as white space.
- Here is another one
- For goodness' sake, do something!
main(){ decl x; x = 20.3; }
- This program assigns the numerical value 20.3 to something called x.
- If you run this program you will see no more output than the do-nothing program above. However, it really does something and will be expanded on below.
- Do the same thing differently.
main(){ decl x = 20.3; }
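To see that the assignment really happens, one option is to print the value. The following is a minimal sketch, assuming the standard header oxstd.h and its println routine (both are used by the hello-world program in this text); the label "x = " is only illustrative.

```ox
#include "oxstd.h"          // makes the standard routines, such as println, available

main()
{
    decl x = 20.3;          // declare x and assign it a value in one statement
    println("x = ", x);     // print the value so the effect of the assignment is visible
}
```

Apart from the added printing, this is the same program as the one-line version above; the extra white space and comments do not change what ExOx is asked to do.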
- Example of a Token and Its Attributes
Token T1
input: Dogs
type: Noun, improper, plural
role: Subject
Notice that there are many attributes to the token. In human languages the same string of symbols can mean completely different things based on where it appears. For example, "dogs" can also be the verb meaning to chase relentlessly.
- Every Ox program has a single main() procedure.
- Without finding a main procedure, Ox code will not do anything.
- Some parts of Ox code "happen" when the program runs. Other parts of the code help set things up before the program runs.
- The first thing inside main() is the first thing that really happens when ExOx is carrying out the to-do list.
- Things that appear above main() in the program do not happen before it.
- A well-written Ox program includes comments that make it easier for a human to understand but that are completely ignored by ExOx.
- Identifiers in Ox look like words, and like English words they are separated by spaces and special symbols.
- Because computer languages are supposed to be somewhat understandable by humans, special symbols often play roles in the language that are similar to their uses in human languages.
Meta-language: Language about Language
A proper English sentence must have a subject and a predicate. These can be very complicated sequences of symbols or just two words, but grammatically they are equivalent despite differences in the number of letters or how abstract the verb is.
The symbols "," and "!" turn the sentence into an imperative statement to consume canines, followed by an interjection. In English an imperative sentence does start with a verb.
Understanding a Program: Symbols, Tokens, Parsing, Syntax
The sentence becomes a sequence of tokens, T1 T2 T3 T4, where T1 is "Dogs" and T4 is the final period. Notice that the spaces between words are not tokens. They are only there to mark the beginning and end of tokens. These markers are discarded during the parsing stage.
A key thing is that a token has properties or attributes which the syntax and the semantics of the language must keep track of. One of these properties is the sequence of symbols that made it up, so that later its meaning and other attributes can be looked up.
Using "." for both the end of a statement in a computer language and the decimal point in numbers would be ambiguous. So in Ox (and many other languages), the special symbol ";" ends a complete statement and acts like "." in English. And, as in English, a new statement begins after the last one ends, so there is no need for a special start symbol.
The symbol "{" creates a block of statements, ended by "}". A block of statements acts like a single statement. An analogy to English: a sentence expresses a complete thought. A paragraph is one or more sentences that also express a complete thought. It's just a more complex thought than a single sentence can convey. In English we put paragraphs in sections, sections in chapters, chapters in books, etc. But in most C-like languages such as Ox, blocks can go inside blocks. That is, { { } { } } is a block with two blocks inside it, like a book section consisting of two paragraphs.
In English, "Dog eats man." and "Dog eatsman." are completely different because one space is missing in the latter. In the same way, the program above would not be proper Ox if it read "declx;" instead of "decl x;".
On the other hand, spaces are not important everywhere in Ox, typically not where special symbols are involved. "main ( )" is the same as "main()" because spaces are needed only to separate things called identifiers, such as decl and x. Identifiers cannot have special symbols in them, so in "main()" the identifier is main and it is followed by the special symbols "()". But "declx;" is a single identifier followed by ";", whereas "decl x;" is two separate identifiers next to each other followed by ";".
In some languages, like Python, indentation (spaces at the start of the line) and new lines are important parts of the syntax, but not in Ox.
Some of these syntax rules are easy to learn and intuitive, but others are harder to keep in mind. But even the simplest rules you already know are easy to forget when writing a program. Humans do not think like computers, and their languages are not the same as computer languages, so we often fail to follow basic syntax rules when writing programs.
main() { x = 5; }
This will cause a syntax error. The reason in Ox is that a variable like x that you want to assign values to has to be declared before it is used. ExOx will look through the code and discover this problem before trying to carry out or execute the program. So, instead:
main() { decl x; x = 5; }
The term/token decl is special. For example, you cannot name a variable decl, so you cannot write "decl decl;". Ox will not let you use decl in any other way. (Why not try it and see?)
The 01a-hello-world.ox program in the preface is a standard "do-something" program.
01a-hello-world.ox
#include "oxstd.h"
main() {
    println("hello world ");
}
Recall our analogy to English: the executor of a to-do list will probably first scan the whole list before doing the items one by one. In the same way the executor of a programming language will go through your program more than once before actually executing it.
Following the C programming language, the symbol "#" in Ox signals a "pre-compilation directive." What that means is that ExOx will process items that start with "#" on its first pass (or scan) of the program.
These pre-processing directives are not requests to do something in the same sense that "x = 5;" asks that the number 5 be assigned to a variable named x. In hello world the #include directive tells Ox to find the file named oxstd.h and insert its contents here as if the programmer had typed them. This happens on the first pass through the program. On the second pass "#include ..." has been replaced by the contents of the included file.
The use of #include and a similar, more complicated #import directive is very useful for writing computer programs that are reliable and easy to read. In the case above, the effect is to tell ExOx to make available to this program all the standard routines in the Ox language. Because hello world refers to println(), it must be declared before it is used in the program.
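As a rough recap, the sketch below puts several of the syntax points from this section into one small program: a directive handled on the first pass, both comment styles, a declaration before use, statements ended by a semicolon, and a block nested inside the block of main. The variable names x and y and the printed labels are illustrative only; the header and println are the ones already used by hello world.

```ox
#include "oxstd.h"          // directive: processed on the first pass, before anything runs

/* A comment between these markers is ignored by the executor. */
main()
{
    decl x;                 // x has to be declared before it is used
    x = 5;                  // a complete statement ends with ;
    {                       // an inner block acts like a single statement
        decl y = 20.3;      // declare and assign in one statement
        println("x = ", x, ", y = ", y);
    }
}
```

Removing the "decl x;" line should reproduce the syntax error discussed earlier, since x would then be used without being declared.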