1. Ask Me Anything: FAQs on Programs
This chapter assumes that the reader is completely new to programming of any sort. The very basic idea of computer programming is introduced by analogy and through a question-answer format. The next chapter then brings that analogy closer to reality.

    Meta-language: Language about Language

    These are difficult questions to answer precisely. An accurate answer would include jargon that someone asking the question would not understand anyway. So instead, let's ask a different question …

    There is no one answer to that either (this is not a good start for a FAQ!). But we could say that, in its written form, the English language consists of a set of allowed symbols and a set of rules for putting those symbols down in a sequence, one after the other.

    Some symbols like "." and "?" play special roles that others, such as "a" and "B" do not.
    Other symbols, like a space " " can be repeated without     changing the meaning.
    Certain symbols, such as "ø", are not allowed in English, and neither are certain sequences of symbols, such as "szzx qqbb".
    But English has an escape clause to deal with this: anything inside double quotes does not have to follow the rules of English and should be taken as is.
    Valid English with non-English symbols and words:
            In Norwegian "øst" means east, while in Martian "szzx" means west. 
    In that sentence the non-English components are put inside quotes. Of course, valid English can go inside quotes as well.

    To call a sequence of symbols valid English means that someone who already knows English will understand it and agree it is proper English. Some rules that say which symbols are allowed next to each other are called spelling, and other rules are called syntax. So "zbignew" is not spelled as a proper English word, but Zbignew is perfectly fine because in English words starting with capitals are names, and names can be anything. They do not need to follow spelling rules. Finally, English syntax rules out "Eat dogs man." because the verb cannot come first in ordinary sentences (although in some languages it does). However, English syntax rules in "Dogs eat man." and the sentence means the dogs do the eating, because the object comes after the subject and verb. It also rules in "Eat dogs, man!" because the "," and "!" turn the sentence into an imperative statement to consume canines followed by an interjection. In English an imperative sentence does start with a verb.

    A computer programming language, including Ox, is
    a set of symbols, and
    a set of rules that say what sequences of symbols are allowed.
    A program is a sequence of symbols that may or may not follow the rules of the language, just as a document may or may not follow the rules of English.
    As a computer language, the rules in Ox (its syntax) are very different from the syntax in a human language such as English.
    A key difference between a computer language like Ox and a human language like English is the scope they are designed to handle. A computer language is not designed to allow for the variety of documents in a human language, such as novels, poems, tweets, etc.
    Instead, Ox is a kind of computer language that is built to express a very narrow subset of proper English documents.
    A valid Ox program is a to-do list.

    In English a to-do list is written for a human (perhaps not the same person as the author) to understand then carry out the requests on the list.
    We assume this person wants to carry out the requests but can only do so if the requests make sense and are feasible.
    Let's call the person reading the to-do list the executor.
    The executor of an English to-do list will probably ...
    first skim the list all the way through, looking for problems without worrying about the details.
    Then, if that goes well, start carrying out the instructions one-by-one.
    The list may not make sense if it is not proper English or does not provide enough information to complete the request.
    The to-do list may have to include explanation or extra information beyond the requests alone. And the tasks on the list might conflict with each other in a way that is only apparent as the executor is carrying them out.
    If such a problem arises the executor might have to stop and give up.

    A computer program is a to-do list (at least in the class of languages Ox belongs to), but it also includes auxiliary information needed to interpret the requests correctly.
    So for a computer program there must be an executor who will try to carry out the tasks the program requests.
    Later the notion of the executor will become much more explicit.
    For now, think of the Ox executor as the aspect of the computer that plays the role of the reader of an English to-do list.
    For brevity, whatever executes an Ox program will be referred to as ExOx.
    What ExOx is exactly will become apparent as we go along.
    Just as a to-do list in English could be translated into an equivalent to-do list in Spanish (or Swahili or Navajo etc.), so nearly all computer languages are potentially equivalent.
    You could get a given computer to do the same tasks whether the to-do list is written in Ox or C or FORTRAN or Perl or LISP or R or MATLAB etc.
    However, how efficiently a particular to-do list can be done, and how simple the to-do list is to write, can differ across computer languages. We discuss these differences in detail in a later chapter.

    In English there is not a single simplest sentence. Perhaps the shortest complete English sentence is "I am" unless we allow for imperatives like "Go!". However, in some important ways "I am" is no simpler than "She prevaricates" or the shortest verse in the Bible: "Jesus wept."

    A proper English sentence must have a subject and a predicate. These can be very complicated sequences of symbols or just two words, but grammatically they are equivalent despite differences in the number of letters or how abstract the verb is.

    The Simplest Ox Program
    main(){}
    That is a complete Ox program that tells ExOx to do absolutely nothing.
    Many other sequences of symbols are also valid Ox programs that do nothing, but in each of them some symbols could be removed and still nothing would happen.
    However, if any of the symbols in the program above were removed or modified, the program would fail to be understood by ExOx.
    Note: The Simplest Ox Program actually does print out a standard message, so a purist will take exception to Ox, since a completely general language should be able to produce no output at all. The dispute is not with Ox as a computer language itself but with the ExOx interpreter of that language.

    The program above is the simplest because a complete Ox program must define a procedure called main.
    A procedure is defined by its name, followed by information inside parentheses, ( ), and then certain information inside curly brackets, { }. Nothing is required to be inside those delimiters, so the simplest program is the one above.

    /* Do nothing */
    main(    )
      {
    
      }
    // done
    
    Is this the same program as the simplest one above? Yes and no.
    The symbols are obviously different, but as programs they both do exactly nothing. The second program includes two Ox comments.
    A comment is text in a program that is there to explain to a human (or possibly a different computer program) what the code is doing.
    By design, the executor of the program ignores comments completely. Comments in Ox are any symbols that come between /* and */ or come between // and the end of a line in the program.
    These ways to mark comments come from C, so they are similar to those in many other languages based on C, but they make no sense to someone just starting. Comments are an "escape clause" in a computer program. They say to the interpreter: ignore this material because the human put it here to communicate with other humans (or possibly other computer programs that look at this program).
    Further, any sequence of white space in Ox is equivalent: " " is the same as "    ". And, with one exception, blank lines are the same as white space.

    The 01a-hello-world.ox program in the preface is a standard "do-something" program.
    Here is another one
    For goodness' sake, do something!
    main(){
      decl x;
      x = 20.3;
      }
    
    This program assigns the numerical value 20.3 to something called x.
    If you run this program you will see no more output than the do-nothing program above. However, it really does something and will be expanded on below.
    Do the same thing differently.
    main(){
      decl x = 20.3;
      }
    
     

     

    Understanding a Program: Symbols, Tokens, Parsing, Syntax

    Running a program simply means carrying out the to-do list that the program is. This can be done by executing the program on a computer, or by working through it yourself mentally. How programs are executed is discussed later in the hardware sections.

    A computer program is written in a language that makes it easier for a human to write it, but the computer hardware will not understand the language directly.

    Let's return to English. Written English has special symbols, like the period "." to end a sentence. Sentences typically have no special symbol at the start. (Spanish has ¡ and ¿ to begin special sentences.) And typically a paragraph is one or more sentences followed by a blank line or by a new line that is indented.

    It would be confusing to use . for both the end of a statement in a computer language and as the decimal place in numbers. So in Ox (and many other languages), the special symbol ; ends a complete statement and acts like . in English. And like English a new statement begins after the last one ends so there is no need for a special start symbol.

    If we apply English syntax to a sentence such as "Dogs eat man." we are parsing the sentence. Parsing does not yield the meaning of the sentence; rather, it interprets the sequence of symbols. From a parsing perspective "Dogs" and "Cats" and "Martians" could all start that sentence and the syntax would be the same. So "Dogs" is a unit of the English language syntax, a word. In general, we would say it is a token.

    The token is a unit of the language that is defined by one or more symbols. The parser needs to find the tokens in the list of symbols, and in English this is done with spaces and punctuation. So in terms of tokens the sentence boils down to T1 T2 T3 T4, where T1 is "Dogs" and T4 is the final period. Notice that the spaces between words are not tokens. They are only there to mark the beginning and end of tokens. These markers are discarded during the parsing stage.

    A key thing is that a token has properties or attributes which the syntax and the semantics of the language must keep track of. One of these properties is the sequence of symbols that made it up so that later its meaning and other attributes can be looked up.

    Example of a Token and Its Attributes
    Token T1
       input: Dogs
       type: Noun, common, plural
       role: Subject
        
    Notice that the token has many attributes. In human languages the same string of symbols can mean completely different things depending on where it appears. For example, "dogs" can also be the verb meaning to chase relentlessly.
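
    The same idea carries over to Ox. As a rough sketch, the comments below show how the two statements of the earlier do-something program might be split into tokens. The exact way ExOx groups symbols internally is an assumption made here for illustration only.

    ```ox
    main() {
      decl x;      // three tokens:  decl   x   ;
      x = 20.3;    // four tokens:   x   =   20.3   ;
      }
    ```

    Notice that 20.3 is listed as one token even though it contains a period, and that the spaces, like the spaces in an English sentence, only mark where tokens begin and end.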

    The symbol { creates a block of statements, ended by }. A block of statements acts like a single statement. An analogy to English: a sentence expresses a complete thought. A paragraph is one or more sentences that together express a complete thought. It's just a more complex thought than a single sentence can convey. In English we put paragraphs in sections, sections in chapters, chapters in books, etc. But in most C-like languages such as Ox, blocks can go inside blocks. That is, { { } { } } is a block with two blocks inside it, like a book section consisting of two paragraphs.
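
    As a minimal sketch of the { { } { } } pattern, the program below puts two inner blocks inside the block that belongs to main. The variable x and the values assigned to it are made up for illustration; the inner braces are not needed here and only show the shape.

    ```ox
    main() {
      decl x;
      {            // first inner block
        x = 1;
        }
      {            // second inner block
        x = 2;
        }
      }
    ```

    Each inner block acts like a single statement inside the outer block, so the outer block reads as a declaration followed by two statements.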

    Spaces are important in Ox because, like in English, they separate sequences of symbols that produce separate chunks of meaning. For example, in English "Dog eats man." and "Dog eatsman." are completely different because one space is missing in the latter. In the same way, the program above would not be proper Ox if it read declx; instead of decl x;.

    On the other hand, spaces are not important everywhere in Ox, typically not where special symbols are involved. main ( ) is the same as main() because spaces are needed only to separate things called identifiers, such as decl and x. Identifiers cannot have special symbols in them, so in main() the identifier is main and it is followed by the special symbols ( and ). But declx; is a single identifier followed by ;, whereas decl x; is two separate identifiers next to each other followed by ;.
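
    As a small illustration (a sketch of what is allowed, not a recommended layout), the program below should be read by ExOx exactly like the earlier program that assigns 20.3 to x. Spaces are added or dropped only next to special symbols, while the one space that separates decl from x is kept.

    ```ox
    main ( )
    {
    decl   x ;     // extra spaces between decl and x, and before ; , change nothing
    x=20.3 ;       // no spaces are needed around the special symbol =
    }
    ```

    Removing the remaining space between decl and x would turn the two identifiers into the single identifier declx, and the program would no longer be proper Ox.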

    In some languages, like Python, indentation (spaces at the start of the line) and new lines are important parts of the syntax, but not in Ox.

    Every Ox program has a single main() procedure.
    If ExOx does not find a main procedure, the Ox code will not do anything.
    Some parts of Ox code "happen" when the program runs. Other parts of the code help set things up before the program runs.
    The first thing inside main() is the first thing that really happens when ExOx is carrying out the to-do list.
    Things that appear above main() in the program do not happen before it.
    A well-written Ox program includes comments, which make it easier for a human to understand but are completely ignored by ExOx.
    Identifiers in Ox look like words, and like English words they are separated by spaces and special symbols.
    Because computer languages are supposed to be somewhat understandable by humans, special symbols often play roles in the language that are similar to their uses in human languages.

    A syntax error is a sequence of symbols that does not follow the Ox rules. An analogy to an English to-do list would be "The milk buy." A human might realize the list writer meant to write "Buy the milk." But typically program executors are not designed to guess, although they are designed to give helpful feedback if possible, much as a human reader might say "I don't understand. Did you mean 'Buy the milk'?"

    Some of these syntax rules are easy to learn and intuitive, while others are harder to keep in mind. But even the simplest rules you already know are easy to forget when writing a program. Humans do not think like computers, and their languages are not the same as computer languages, so we often fail to follow basic syntax rules when writing programs.

    main() {
      x = 5;
      }
    
    This will cause a syntax error. The reason is that in Ox a variable like x, which you want to assign values to, has to be declared before it is used. ExOx will look through the code and discover this problem before trying to carry out (execute) the program. So, instead:
    main() {
      decl x;
      x = 5;
      }
    
    The term/token decl is special. For example, you cannot name a variable decl: you cannot write decl decl; in a program. Ox will not let you use decl in any other way. (Why not try it and see?)

    In computing languages special terms are called keywords or reserved words. They act like pronouns and punctuation do in English. For example, it would be very confusing if "the" meant "tea" in English, as in "Pass the the." English function words tend to be reserved. The keywords in Ox 7.0 are listed in the next section. You cannot give something in your Ox program one of these terms as its name. If you do, your program will be rejected by Ox or it will be misinterpreted.

    01a-hello-world.ox
    #include "oxstd.h"
    main() {
      println("hello world ");
      }
    
    Recall our analogy to English: the executor of a to-do list will probably first scan the whole list before doing the items one by one. In the same way the executor of a programming language will go through your program more than once before actually executing it.

    Following the C programming language, the symbol # in Ox signals a "pre-compilation directive." What that means is that ExOx will process items that start with # on its first pass (or scan) of the program.

    These directives are not requests to do something in the same sense that x = 5; asks that the number 5 be assigned to a variable named x. In hello world the #include directive tells Ox to find the file named oxstd.h and insert its contents here as if the programmer had typed them. This happens on the first pass through the program. On the second pass #include ... has been replaced by the contents of the included file.

    The use of #include and a similar, more complicated #import directive is very useful for writing computer programs that are reliable and easy to read. In the case above, the effect is to tell ExOx to make available to this program all the standard routines in the Ox language. Because hello world refers to println(), that routine must be declared before it is used in the program, and including oxstd.h does that.
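
    Putting the pieces of this chapter together, the sketch below combines the earlier do-something program with the #include directive so that the assignment becomes visible. The label "x = " is made up for illustration; any text inside the quotes would do.

    ```ox
    #include "oxstd.h"
    main() {
      decl x;               // declare x before using it
      x = 20.3;             // assign the numerical value 20.3 to x
      println("x = ", x);   // println() is available because oxstd.h is included
      }
    ```

    Without the #include line, ExOx would have no declaration for println() and should reject the program for the same reason it rejects x = 5; without decl x;.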

Exercises