- Reading and Writing Matrices
You can get data in and out of files without creating and using an Ox file
. The simplest approach is to use loadmat()
and savemat()
. For example, suppose a file named mydata.mat
contains this information:
- mydata.mat
2 3
0.0 1.1 2.2
3.3 4.4 5.5
That is, the first line of the file has two integers, the number of rows and columns of the matrix. Then the elements of the matrix follow. Then your Ox program can read this data into a matrix.
14-load-save.ox
1: #include "oxstd.h"
2: main() {
3: decl m;
4: m = loadmat("mydata.mat");
5: savemat("mm.mat",m'*m);
6: }
This code segment creates a new file (or overwrites an existing file) named mm.mat
. It stores in the same format the matrix in the second argument, in this case the result of $m'*m$, which is a 3x3 symmetric matrix:
- mm.dat
3 3
1.0889999999999999e+001 1.4520000000000000e+001 1.8149999999999999e+001
1.4520000000000000e+001 2.0570000000000004e+001 2.6620000000000005e+001
1.8149999999999999e+001 2.6620000000000005e+001 3.5090000000000003e+001
The load and save matrix routines can deal with other kinds of files. These are four other data formats they can deal with that are most likely to be most useful to you:
- File Types that can be loaded
Extension Kind of File
---------------------------------------------------------------------------
.xlsx Excel 2007 (or newer) workbook file (Office Open xml)
.xls Excel worksheet or workbook file (binary file),
.csv comma-separated spread sheet file (text file),
.dta Stata data file (version 4--6 or 11)
Note that Stata regularly changes the format of .dta
files. The most recent version may write data in a new version that loadmat()
cannot handle. However, you can usually resolve this by saving the file in Stata using the saveold
command.
If you try to read in a spreadsheet with mixed data the loadmat()
function may not be able to handle it the way you want. However, the built-in Database
class in Ox can handle all kinds of data stored in one of the supported file formats.
- Creation and Construction
The size of a matrix, string or array can change as the program runs. This means that you can build up a matrix as the program runs. Your code does not have to create the matrix in its ultimate size. You can also reduce the size of a matrix, array or string. We focus on matrix and vector construction, but many of the ideas apply to arrays and strings.
We already discussed creating a matrix using syntax for handling "hard-coded" constants:
MATRIX ROW VECTOR COLUMN VECTOR
b = <1,2,3 ; 4,5,6> b = <1;2;3> c = <4,5,6>
Ox has a number of built-in functions that return matrices of various shapes and patterns. Here are the main ones:
Table 5. Matrix Creation Functions
STATEMENT TYPE CREATED VALUE CREATED
==================================================================
a = zeros(2,3) matrix 0 0 0
0 0 0
a = zeros(1,3) row matrix 0 0 0
a = zeros(2,1) column matrix 0
0
--------------------------------------------------------------------
b = unit(3) square matrix 1 0 0
0 1 0
0 0 1
b = diag(<1;2;3>) square matrix 1 0 0
0 2 0
0 0 3
--------------------------------------------------------------------
c = ones(3,2) matrix 1 1
1 1
1 1
c = ones(3,1) column vector 1
1
1
c = ones(1,2) row vector 1 1
---------------------------------------------------------------------
d=constant(6.5,2,3) matrix 6.5 6.5 6.5
6.5 6.5 6.5
c=constant(6.5,1,2) row vector 6.5 6.5
---------------------------------------------------------------------
e = range(0,5) row vector 0 1 2 3 4 5
e = range(0,-5) row vector 0 -1 -2 -3 -4 -5
e = range(-4,4,2) row vector -4 -2 0 2 4
You can create an empty matrix, one with zero rows and zero columns:
a = <>;
Why would you do this? One good reason is because later commands in your program will expand the empty matrix.
Suppose you have two variables, a
and b
. Both contain numbers that you want to put in a vector. Let's say a=2.5
and b=0.0 (although you might not know that while writing the code because the values depend on other input). You might think <a,b>
would work, but since "a" is not a hard integer like "2", you can't use that syntax. You will get a compile-time error. The reason is that the < … >
actually happens when Ox is compiling your program. It does not happen when your code is executing.
Instead, use the concatenation operators
. One way this happens is to explicitly add on elements, which is called concatenation or appending. There are two concatenation operators in Ox. ~
is the horizontal or column concatenation operator. It tacks on new columns (on the right) to a matrix or vector or array or string. |
is the vertical or row concatenation. It tacks on new rows (on the bottom). They have the same effect on strings and one-dimensional arrays because they do not have an orientation. The difference matters for matrices where rows versus columns matter. In addition, you apply concatenation to an integer or a double to produce a matrix for a scalar.
Table 6. Concatenation Operators
Oper Description Meaning Example Constant Equivalent
--------------------------------------------------------------------------------
~ horizontal add columns d = 2.5 ~ 0.0; d = <2.5 , 0.0>;
| vertical add rows d = 2.5 | 0.0; d = <2.5 ; 0.0>;
~= horizontal add columns d = ones(2,2);
self-concat. to myself d ~= 2.5; d = < 1, 1, 2.5 ; 1, 1, 2.5>;
|= vertical add rows d = zeros(2,1);
self-concat. to myself d |= 2.5; d = <0 ; 0 ; 2.5>;
What can be useful but confusing is that Ox will let you concatenate things that do not have the right shape. For example:
LEFT OP RIGHT =
---------------------------------------
<0, 1, 2> ~ <4 ; 5> = 0 1 2 4
0 0 0 5
<1, 2> | <5 ; 6 ; 7> = 1 2 0 0 0
1 2 5 6 7
Adding a bigger column vector on the right of a vector means rows of zeros are used as padding. Adding a bigger row vector to the bottom means columns of zeros are used as padding.
Exercises
- The program below reads in a spreadsheet also in the code folder which contains one year of temperature data from U. of Waterloo. Notice this shows that Ox's
loadmat()
function can read in spreadsheets and some other kinds of files into an Ox matrix.
15-temperature.ox
1: /* Exercise code by C. Ferrall. See notes for more explanation.
2: */
3: #include "oxstd.h"
4:
5: enum{ day, lo, hi, precip,Ncolumns}
6: const decl
7: clabels={"day#","low","hi","precip"},
8:
9: fn ="Daily_summary_2014.xlsx"; //Source: http://weather.uwaterloo.ca/data.html
10:
11: main() {
12: decl data = loadmat(fn);
13: if (isint(data)) oxrunerror("spreadsheet file not in the same folder as program.");
14:
15: data[][day] += dayofcalendar(1900,1,1); //Adjust dates to Ox's Julian dating.
16:
17: println("First 10 days:","%c",clabels,"%cf",{"%C","%10.1f"}, data[:9][] );
18: println("Max Hi temperature:",maxc(data[][hi]));
19: println("Min Lo temperature:",minc(data[][lo]));
20:
21: // add code here for more statistics
22:
23: }
An aside
The first column of the spreadsheet (and now matrix in Ox) contains an integer value for the day. The data counts days from January 1, 1900. Ox can print out calendar days too, but it uses a different date for day 0. So the program adds the day number of 1-Jan-1990 to the data. Then Ox will print out dates if you use the %C
format. None of this is important for the exercise, but it is included to show that your input and output can be sophisticated.
The program prints out the first 10 days of data. It uses maxc()
and minc()
to find the highest and lowest temperatures that year.
Add code using if()
, for()
and/or built-in Ox functions to compute the following statistics from the data. (I suggest using both built-in functions and your own loops ... compute the same statistic both ways to verify your code).
- The average daily temperature range over the year (difference between high and low temperature each day). A formula for this would be $\sum_{d=1}^{365} {hi_d - lo_d \over 365}$ but you can compute that number different ways.
- The number of days the hi temperature was below 0.
- The number of days the low temperature was above 20.
- The biggest change (in absolute value) in mean temperature from one day to the next over the year.
- The autocorrelation coefficient for average daily temperature (the correlation between today's temperature and yesterday's temperature).
- Extra Credit: prompt the user to first enter either C or F. If they enter "F" then convert the temperature data to Fahrenheit and print out results in Fahrenheit instead of Celsius.
- Fancy Indexing
Earlier we discussed the basic way to get to elements of a matrix or an array. A[0]
is the first item in the array or vector. A[i][j]
is the (i+1)th
, (j+1)th
element of a matrix or a two-dimensional array. Matrices can only have two dimensions but arrays can have any number of dimensions, although some thought should be put into why multiple dimensions are used.
Ox syntax can far beyond simple scalar indexing. Suppose v
is a vector with 100 elements, so the indices go from 0 to 99. Now suppose we want to pull out and use the 0, 5, 17 and 53 elements. One way is to use scalar indexing with concatenation:
z = v[0] | v[5] | v[17] | v[53];
println("selected elements of z= ",z);
This is inflexible since the indices of interest are hard-coded in the program. The indices we want to extract cannot be dynamically determined. But they can be if they themselves appear in a vector:
ind = 0 | 5 | 17 | 53;
z = v[ind];
println("selected elements of z= ",z);
That is, if you use a vector (or matrix) to index a vector (or matrix) in Ox you will be pulling out selected components to form a new vector or matrix. In the example above, ind
is a hard-coded vector, but it could be computed dynamically.
Another way to get dynamic selection of rows and columns of a matrix is to use a range. A range index is the form l : u where l and u are non-negative integers and l ≤ u. However, u cannot be larger than the biggest index of the vector/matrix/array selected from, otherwise an error is created. You can also not include either limit, which will then select the ends of the vector/array. So :u is the same as 0:u and l: is the same as l:«length»-1, where «length» is the length of the dimension being indexed (either the number of rows or columns depending on the orientation).
Here are some examples with some explanation. First, let's define some things that appear in the examples:
- LET
j , k
be non-negative integers, j ≤ k
iv
be a n×1 vector of indices (non-negative numbers which Ox will round to integers for you)
iy
be a m×1 vector of indices
im
be a r×s matrix of indices.
A
be a matrix with N rows and M columns (so the indexing below does not produce an error)
Expression : Result
------------------------------------------------------------------------------------------
A[j][iv] : 1×n vector of items from the j-row of A
A[j][iv'] : Same result; transposing the index vector selects same elements
A[iy][j] : m×1 vector of items from the j-column of A
A[j:k][] : a matrix made up of all columns of A between row j and k
A[:j][] : same as [0:j], 0 is implied
A[k:][] : same as [k:N-1], the last index is implied.
A[im][j] : same as if the rows of im were lined up into a 1× (rs) row vector.
Since A
is already a matrix that exists, the selected values can be placed on the left side of an assignment! For example:
A[iv][3] = 2; Set selected rows of A's 3rd column to 2.
A[1][iy] = x; Insert a vector x into selected columns of A's first row.
Table 7. Matrix Reshaping Functions
Function Description Inverse Operation
-----------------------------------------------------------------------
shape(A,r,c) Create a r x c matrix from
stacked columns of A
reshape(A,r,c) Create r x c matrix from
rows of A
vec(A) stack columns of A A = shape(vec(A),r,c)
on top of each other
vecr(A) line up rows of A A = reshape(vecr(A),r,c)
one after the other
Dot
and Other Matrix Operators
In basic matrix algebra $AB$ is matrix multiplication as defined above which involves multiplying and adding rows and columns. In Ox, A*B
is equivalent to matrix $AB$. Meanwhile $A+B$ is defined as the simple element-by-element addition of the two matrices, and in Ox A + B
does the same thing. There is typically no need to define element-by-element multiplication of two matrices in symbolic algebra.
However, in computing it is often very convenient to apply scalar operators to matrices. Ox, like some other Matrix languages, puts a period in front of operators to indicate that they are element-by-element. When an operator is applied to two matrices with the same dimensions, the Ox syntax references shows you what to expect. That is,
A B C
matrix .op matrix = matrix = aij op bij
m x n m x n m x n
In other words, A .* B
is defined as the matrix $C$, where $c_{ij} = a_{ij}\times b_{ij}.$
0 1 .* 1 -1 = 0 -1
2 3 -1 1 -2 3
Note that +
and -
are already dot-operators, so you don't need to write .+
. Dot operators can produce runtime errors. A .* B
is not defined if A is 2×3 and B is 2×4. Just as in other cases, there has to be a matching of elements for the operation to work. However, if one of the matrices is a vector the dot-operators will not produce an error.
Putting a column vector on the right side of a dot-operator is the same as if you had used a matrix with $n$ copies of the column.
A y C
matrix .op matrix = matrix = a_ij op yi
m x n m x 1 m x n
0 1 2 .* 1 = 0 1 2
2 3 2 -1 -2 -3 -2
You will get an error if the number of rows of $y$ are not the same as $A$. You can post-multiply by a row vector too, but then its columns has to equal the left argument.
A y C
matrix .op matrix = matrix = a_ij op yj
m x n n x 1 m x n
0 1 2 .* 1 -1 0 = 0 -1 0
2 3 2 2 -3 0
- THIS CAN BE CONFUSING: it matters whether you multiply by a row or column vector.
Further, dot operators on two vectors can produce a matrix or a vector. If the two vectors have the same shape then they must have the same length and the result is the same shape. If one vector is a row and the other a column a matrix is created:
y .op z = C = yi op zj
m x 1 1 x n m x n
y .op z = C = yj op zi
1 x n m x 1 m x n
So the result gets its row size from the row vector (whether it is the left or right argument) and its
column size from the column vector.
This table explains what happens when you use a dot operator on different kinds of variables. The symbol • is a stand-in for the different operators that have a dot version. That is, what you see is r = a .• b
. The table shows you the type and dimension (if relevant) of the left-side argument (a), the type and dimension of the right-side (b) and the resulting type and dimension (r). For example, the first example says that if boh a and b are integers then a.*b
returns an integer.
Table 8. Results of Dot Operators
left (a) op right (b) = r
type dim • type dim type dim scalar value
----------------------------------------------------------------------------------
int .• int int a • b
int/double .• double double a • b
double .• int/double double a • b
scalar .• matrix m x n matrix m x n a • b_{ij}
matrix m x n • scalar matrix m x n a_{ij} • b
matrix m x n .• matrix m x n matrix m x n a_{ij} • b_{ij}
matrix m x n .• vector m x 1 matrix m x n a_{ij} • b_{i0}
matrix m x n .• vector 1 x n matrix m x n a_{ij} • b_{0j}
vector m x 1 .• matrix m x n matrix m x n a_{i0} • b_{ij}
vector 1 x n .• matrix m x n matrix m x n a_{0j} • b_{ij}
vector m x 1 .• vector 1 x n matrix m x n a_{i0} • b_{0j}
matrix 1 x n .• vector m x 1 matrix m x n a_{0j} • b_{i0}
string n .• string n vector 1 x n a_{j} • b_{j}
string n .• int vector 1 x n a_{j} • i
int .• string n vector 1 x n i • b_{j}
Other mathematical functions will work on matrices. exp(A)
is the matrix of exponents. That is, its ij
element equals $\exp(a_{ij})$.
Table 9. Some Useful Matrix Functions
Function Does This
------------------------------------------------------------------------
maxc() find the maximum value in each column of a matrix as
a vector of values
minc() find the minimum value in each column
maxcindex() return the row index of the maximum element
any() returns TRUE if any of the elements of a matrix is not 0.
FALSE if all the elements are 0.
- Inversion
Division is not usually thought of as a matrix operation, but multiplying by the inverse of a matrix is like dividing by it. For example, if $a$ and $b$ are scalar numbers then $a \div b = a \times b^{-1}$. Ox allows you to use that same analogy in your expressions:
- Ways to invert
decl A, B, C;
A = <2, 1; -1, 2>;
B = <1, 1; -2, 2>;
//EQUIVALENT STATEMENTS IN OX:
C = A / B; C = A * invert(B); C = A * B.^(-1);
We discuss how inverse matrices are computed numerically later.
- Self-Assignment
By now you need to be comfortable with the assignment operator, as in
x = y;
will assign the value of y
to x
. (Some extra discussion about what happens when y
is an object of class, an advanced topic discussed later.)
Many lines of code involve doing something to a variable and storing the result back in the same variable such as:
x = x + y;
In other words add y
to x
and then store the result back in x
. The self-assignment operators do this without having to put x
on the right side.
Table 10. Results of Self-Assignment Operators
OPER DOES WHAT? EQUIVALENT WITH PLAIN =
---------------------------------------------------------------------------------
*= self multiply then assign x = x * y;
.*= self element-by-element multiply then assign x = x .* y;
/= self divide then assign x = x / y;
./= self element-by-element divide then assign x = x ./ y;
+= self add then assign x = x + y;
-= self subtract then assign x = x - y;
Notice that there is no .+=
operator and there is no .+
operator. That's because matrix addition is already an element-by-element operation. And there is no obvious analog to ordinary matrix multiplication that would go along with a different kind of addition.
In ordinary matrix terminology, there is no matrix divide operator. But just as $3/2$ is equivalent to $3\times 2^{-1}$, so too does Ox let you use /
and /=
to carry out post-multiplying by the inverse of the matrix after the operator. The element-by-element version simply divides individual element of x
by y
as with operators already discussed.