hbar Arrays

hbar uses an internal data type called Arrays when it manipulates large quantities of numerical data. Camera images, optical path difference gradients, and encircled energies are all represented internally by Arrays. There exists a large library of functions used to create, display, transform, and save Arrays in very general ways. This section introduces terminology used when talking about Arrays, gives rules for representing Arrays as strings, and lays out their internal structure.

The Array data type should not be confused with Tcl arrays, which are similar to string dictionaries in which the user provides the key and Tcl looks up the value. hbar Arrays are multidimensional collections of numerical quantities stored in memory very efficiently.

Terminology

hbar Arrays are used to represent many different kinds of numerical data. Arrays are multidimensional collections of elements. Each element represents a quantity and all elements in an Array are of the same data type. The data type of the Array indicates the kind of numerical data that are stored in the Array. For the most part, the data types closely match the kinds of information that the computer can represent efficiently. The currently supported data types are listed in the table below:

Description                  hbar abbreviation   Components per element

signed 8-bit integer         c                   1
unsigned 8-bit integer       uc                  1
signed 16-bit integer        s                   1
unsigned 16-bit integer      us                  1
signed 32-bit integer        i                   1
unsigned 32-bit integer      ui                  1
single precision float       f                   1
double precision float       d                   1
single precision complex     com                 2
2-component vector           v2                  2
3-component vector           v3                  3
4-component vector           v4                  4
5-component vector           v5                  5
6-component vector           v6                  6

Some of the data types have more than one component per element, as noted. All components making up an element are treated as a unit when the Array is manipulated. Data types with one component per element are called scalar data types. Data types with more than one component per element are called vector data types.

The hbar abbreviation listed in the table above is used whenever an atomic function requires a parameter that describes the data type of an Array. The atomic functions are a set of routines AOA has developed that augment standard Tcl and provide comprehensive Array manipulation capabilities. Section 7.2 describes the atomic functions in greater detail. The arguments to atomic functions are standard Tcl variables and strings and obey standard Tcl syntax. Any Array can be bound to a Tcl variable so that the value of the Tcl variable represents the Array. The binding process will be discussed in greater detail in section 7.1.3.

As was stated above, Arrays are multidimensional collections of elements. The number of dimensions that an Array has is called its rank. For example, the rank of a two-dimensional camera image is 2, whereas the rank of an Array describing a temperature as a function of time is 1. hbar supports Arrays that have rank 0, 1, 2, 3, and 4.

Arrays with zero rank represent single quantities (scalars). An Array with rank 1 is a simple one-dimensional list of elements. The number of elements in a one-dimensional Array is also called the number of columns in the Array. The column indices start at zero and run through the number of columns minus one.

A two-dimensional Array has rank 2 and has a number of columns and rows. When displayed as an image, the columns of a rank 2 Array are vertical stripes and the rows are horizontal. Every element in a rank 2 Array must have a value, and the total number of elements in the Array is the product of the number of columns and rows. Like the column indices, the row indices run from zero through the number of rows minus one.

A rank 3 Array could be used to represent a physical quantity such as temperature or pressure as it varies throughout a volume of space. Rank 3 Arrays have a specific number of columns, rows, and planes. The total number of elements in a rank 3 Array is the product of the number of columns, rows, and planes. Similarly, a rank 4 Array has columns, rows, planes, and slices.

An Array of any rank can be made up of any data type listed in the table above. For example, each element of an Array with complex data type (com) is a complex number. Note that a rank 0 Array made up of a 6-component vector (v6) is not the same as a rank 1 Array with 6 elements. The 6-component vector is treated as a single element by the atomic functions, whereas each element in the rank 1 Array is individually accessible. Section 7.1.3 describes how to convert between different data types.

Representing Arrays as Strings

Properly formatted strings can be used to represent Arrays wherever an atomic function requires an Array as an argument. For example, there is an atomic function called a.add that can be used to add Arrays together. The addition proceeds element by element and produces an Array as a result. The following command line adds the elements of an Array known to Tcl as Data1 to the elements of an Array called Data2 and returns a string representing the result.

a.add $Data1 $Data2

Assigning Tcl variables to represent Arrays, or, in other words, binding a variable to an Array, will be discussed in section 7.1.3. If Data1 is a rank 0 Array of scalar data type, we can add the quantity 17 to the Array Data1 using the command

a.add $Data1 17

The string 17 is interpreted as a zero-dimensional scalar Array with the value 17 and is added to the value of Data1. Similarly, a way of calculating the sum of the numbers 19 and -23 is

a.add 19 -23

Both of the strings 19 and -23 are converted internally to Arrays before they are added by the a.add atomic function.

One-dimensional Arrays of scalar quantities are represented by lists of numbers within parentheses. For example, the command line

a.min "(-1 7.3 -4 9e-6)"

finds the minimum value within the 4-column rank 1 Array represented by the string "(-1 7.3 -4 9e-6)". The double quotes are necessary in order for Tcl to delimit the string that represents the Array. Without the quotes, each of the four strings

(-1 7.3 -4 9e-6)

would be passed separately to the a.min atomic function and would result in an error.

Two-dimensional Arrays are represented by nested one-dimensional Arrays. For example, we can subtract two rank 2 Arrays, each with two columns and three rows, using the command

a.sub "((1 2)(3 4)(5 6))" "((1 1)(2 2)(3 3))"

The command would produce the resulting 2-column and 3-row Array

((0 1)(1 2)(2 3))

as expected. In a similar way, rank 3 Arrays are represented by nested rank 2 Arrays, and rank 4 Arrays are represented by nested rank 3 Arrays.
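The nesting rule can be illustrated outside hbar. The following Python sketch is not part of hbar and makes no claim about how the atomic functions parse their arguments; parse_array and elementwise are hypothetical helper names chosen for this illustration. It reads a scalar Array string into nested lists and repeats the subtraction shown above.

```python
import re


def parse_array(text):
    """Parse a scalar Array string such as "((1 2)(3 4))" into nested lists.

    Illustration only: vector (< >) and complex elements are not handled.
    """
    # Tokens are single parentheses or runs of characters between them.
    tokens = re.findall(r"[()]|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of string")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(parse())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the closing parenthesis
            return items
        if tok == ")":
            raise ValueError("unexpected closing parenthesis")
        return float(tok)

    value = parse()
    if pos != len(tokens):
        raise ValueError("trailing text after the Array")
    return value


def elementwise(op, a, b):
    """Apply op to matching elements of two Arrays with identical shape."""
    if isinstance(a, list) != isinstance(b, list):
        raise ValueError("shape mismatch")
    if isinstance(a, list):
        if len(a) != len(b):
            raise ValueError("shape mismatch")
        return [elementwise(op, x, y) for x, y in zip(a, b)]
    return op(a, b)


left = parse_array("((1 2)(3 4)(5 6))")   # 2 columns, 3 rows
right = parse_array("((1 1)(2 2)(3 3))")
difference = elementwise(lambda x, y: x - y, left, right)
print(difference)
```

The innermost lists hold the columns of one row, so the outer list has one entry per row, matching the 2-column, 3-row result given in the text.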

Data types other than scalar (more than one component per element) are represented using the paired characters < and >. As an example, we can add two 3-component vectors together using a command line like

a.add "<1 2.73 3>" "<-1 -2.73 -3>"

which would produce the single 3-component vector element

<0 0 0>

As a more challenging example, we can represent a 3-element rank 1 Array of vectors with two components using the string

"(<1 2> <3 4> <5 6>)"

Each element is delimited by the < > pair, and each element has two components. Generally, mixing the number of components in strings representing vectors will result in a syntax error. The following string does not represent a valid Array because the third element has two components, unlike the first two elements.

"(<1 2 3> <4 5 6> <7 8>)" # not valid

Complex numbers are represented by using a two-component vector and placing a lower case "i" after the second component. For example, a 4-column complex Array could be represented by the string

"(<-1 0i> <1 1i> <2 -1i> <-2 1i>)"

All elements of a complex Array must have the "i" after the second component to indicate complex elements. The first component of each element of a complex Array is called the real part, and the second component is called the imaginary part. The square root of the sum of the squares of the real and imaginary parts is called the amplitude. The argument is the angle, in radians, whose tangent is the imaginary part divided by the real part. Because of the common use of Arrays with complex elements, there exist atomic functions to manipulate the components separately. See, for example,

a.real, a.imag, a.conj, a.amp, a.arg.
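The definitions of amplitude and argument can be checked numerically. The Python sketch below is an illustration only and does not call hbar; amplitude and argument are hypothetical helper names. It uses math.atan2, which takes the signs of both parts into account when choosing the quadrant. Whether a.arg resolves the quadrant in the same way is an assumption.

```python
import math


def amplitude(real, imag):
    """Square root of the sum of the squares of the real and imaginary parts."""
    return math.hypot(real, imag)


def argument(real, imag):
    """Angle in radians whose tangent is imag / real, quadrant-aware (assumed)."""
    return math.atan2(imag, real)


# The four elements of the complex Array string shown earlier.
elements = [(-1, 0), (1, 1), (2, -1), (-2, 1)]
for re_part, im_part in elements:
    print(re_part, im_part, amplitude(re_part, im_part), argument(re_part, im_part))
```

For the element <-1 0i> the amplitude is 1 and the argument is pi, which a plain arc tangent of 0 divided by -1 would not distinguish from an argument of zero.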

Creation, Conversion, and Promotion

This subsection introduces a number of atomic functions that are used to create Arrays and to convert between different data types. More information is available in section 7.2, which describes all atomic functions in greater detail.

The most commonly used atomic function for Array creation is a.flat. It creates a multidimensional Array of single precision floating point numbers (data type f) with all of the elements initialized to a specified value. For example, to create a 10-column, 20-row rank 2 Array initialized to the value -1 and assign it the name blam we could use the command line

a.flat 10 20 -1 = blam

The = sign is necessary for the newly created Array to be saved in a variable that is accessible via Tcl and retained in memory. The process of associating a Tcl variable with an Array is called binding the Array to the variable. Without the = sign and the blam string the resulting Array would be printed in the interactive terminal window and deleted from memory. With the = sign and string, the Tcl variable blam is associated with the Array and the Array is not deleted from memory until the variable passes out of scope. To refer to the Array in another atomic function, the usual Tcl syntax $blam is used.

The atomic function a.flat can be used to create rank 0 to rank 4 Arrays. The following command line creates a rank 1 Array bound to the Tcl variable history with 100 elements initialized to zero.

a.flat 100 0 = history

An atomic function that is useful for printing out information about an Array is a.info. If the Tcl variable blam is bound to a two-dimensional 10-column, 20-row Array of single precision floating point numbers, the command line

a.info $blam

will print out the following information

200 elements of type f (32 bit floating point), 800 bytes total data
2 dimensions
10 columns
20 rows

The first line of the output identifies the number and data type of the elements of the Array as well as the total number of memory bytes used by the internal representation of the Array. The next line indicates the rank of the Array, and each following line lists the extent of each dimension.
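The figures in this output can be reproduced by hand from the extents of the Array and the storage sizes given in the Internal Representation subsection. The Python sketch below is an illustration only; describe is a hypothetical helper and is not the implementation of a.info.

```python
from math import prod

# Components per element, from the data type table.
COMPONENTS = {"c": 1, "uc": 1, "s": 1, "us": 1, "i": 1, "ui": 1, "f": 1,
              "d": 1, "com": 2, "v2": 2, "v3": 3, "v4": 4, "v5": 5, "v6": 6}

# Bytes per component, from the Internal Representation subsection.
# Complex and vector components are four-byte quantities.
BYTES_PER_COMPONENT = {"c": 1, "uc": 1, "s": 2, "us": 2, "i": 4, "ui": 4,
                       "f": 4, "d": 8, "com": 4, "v2": 4, "v3": 4, "v4": 4,
                       "v5": 4, "v6": 4}


def describe(dtype, *extents):
    """Return (rank, element count, total data bytes).

    extents are given as columns, rows, planes, slices (zero to four values).
    """
    if len(extents) > 4:
        raise ValueError("Arrays have rank 0 through 4")
    count = prod(extents)  # prod of no extents is 1: a rank 0 Array has one element
    size = count * COMPONENTS[dtype] * BYTES_PER_COMPONENT[dtype]
    return len(extents), count, size


print(describe("f", 10, 20))  # (2, 200, 800)
```

The result for the 10-column, 20-row Array of type f agrees with the 200 elements and 800 bytes reported above.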

In order to create Arrays of any vector data type, the atomic function a.make is used. For example, suppose we wish to make a three-dimensional complex Array with 5 columns, 6 rows, and 2 planes. We also wish to initialize each element to complex zero. The command line that will create the Array is

a.make "<0 0i>" 5 6 2 = test

The first argument, "<0 0i>", establishes the data type and initial value of the output Array bound to the variable test. The next three arguments specify the number of dimensions and the extent of each dimension. a.make can be used to create Arrays with zero to four dimensions. A zero-dimensional Array can be created simply by specifying no columns, rows, or planes, as in the following command line.

a.make "<1 2 -3 0>" = fourVector

The command line creates a zero-dimensional Array with data type v4 and initial value "<1 2 -3 0>" and binds it to the Tcl variable fourVector.

The atomic function a.to is used to convert the elements of an Array from one data type to another. A scalar Array of any data type can be converted to any other data type by simply specifying the data type of the output. For example, the command line

a.to $in d = out

will convert any scalar Array in to double precision floating point and bind the resulting Array to the Tcl variable out. The dimensionality of the input Array is preserved in the output Array. For example, if the input was three-dimensional with 5 columns, 6 rows, and 2 planes, then the output will have exactly the same dimensions. Converting an Array with a data type capable of more precision to a data type with less precision will cause truncation of the result and a loss of precision. For example, if the following command line is executed


a.to "(126 127 128 129 130)" c

the output to the terminal window will be

(126 127 -128 -127 -126)

The c data type, 8-bit signed integer, cannot represent values greater than or equal to 128, so the last three values of the floating point literal Array "(126 127 128 129 130)" do not fit. The input is truncated and wraps around to negative numbers to form the output. In general, truncation and wrap-around are not portable across machine architectures, so conversion to data types with less precision should be performed with care.
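The wrap-around in this example matches two's-complement arithmetic on 8 bits. The Python sketch below reproduces the numbers shown; it illustrates that one behavior and is not a statement of what a.to does on every machine. wrap_signed is a hypothetical helper name.

```python
def wrap_signed(value, bits=8):
    """Truncate toward zero, then wrap into the signed two's-complement range."""
    span = 1 << bits              # 256 for 8 bits
    n = int(value) % span         # int() drops any fractional part
    return n - span if n >= span // 2 else n


print([wrap_signed(v) for v in (126.0, 127.0, 128.0, 129.0, 130.0)])
# [126, 127, -128, -127, -126]
```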

The a.to atomic function can also be used to convert scalar Arrays to vector Arrays. For example, the command line

a.to "((1 2 3)(4 5 6))" v3

results in the output

(<1 2 3><4 5 6>)

The input elements are grouped together to produce each output element. If the output data type has N components, each consecutive set of N input elements is gathered together sequentially to produce each output element. The dimensional information of the input Array is lost and a one-dimensional output Array is created.

Vector input Arrays can also be converted to scalar data type. For example,

a.to "(<1 2i><3 4i>)" f

sends the following output to the terminal.

(1 2 3 4)

Each component of each element of the input vector Array is sequentially converted to single precision floating point and placed in the output Array. All dimensional information from the input Array is lost and a one-dimensional output Array is created.
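Both directions of this conversion amount to regrouping a flat run of values. The Python sketch below shows the regrouping on nested lists; it is an illustration and not hbar code. The names flatten, to_vectors, and to_scalars are hypothetical, and raising an error when the element count is not a multiple of N is an assumption, since the text does not say how a.to treats leftover elements.

```python
def flatten(nested):
    """Yield scalar values in storage order from arbitrarily nested lists."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def to_vectors(nested, n):
    """Gather each consecutive run of n scalars into one vector element."""
    flat = list(flatten(nested))
    if len(flat) % n:
        raise ValueError("element count is not a multiple of n (assumed to be an error)")
    return [tuple(flat[i:i + n]) for i in range(0, len(flat), n)]


def to_scalars(vectors):
    """Lay the components of each vector element out as consecutive scalars."""
    return [component for vector in vectors for component in vector]


print(to_vectors([[1, 2, 3], [4, 5, 6]], 3))   # [(1, 2, 3), (4, 5, 6)]
print(to_scalars([(1, 2), (3, 4)]))            # [1, 2, 3, 4]
```

In both cases the result is one-dimensional, as in the a.to examples above.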

For convenience, a.to can also be used to combine data from multiple scalar data type Arrays into single Arrays with vector data type. To create a complex Array from two scalar Arrays, we can use

a.to $realpart $imagpart com = cplx

The Arrays referred to by realpart and imagpart must have the same dimensionality and must be scalar. The output Array cplx will have complex data type and preserve the dimensionality of the input Arrays. Each element of cplx will have its real component taken from the corresponding element in realpart and have its imaginary component taken from the corresponding element in imagpart. A four-component vector Array can be created from four scalar Arrays using


a.to $c1 $c2 $c3 $c4 v4 = fourvector

All of the input Arrays must have the same dimensionality and be scalar. The output Array will have four components per element and each component will be taken from one of the input Arrays.
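The multi-input form pairs up corresponding elements rather than regrouping consecutive ones, so the dimensionality survives. The Python sketch below shows the idea on nested lists; combine is a hypothetical helper and is not the implementation of a.to.

```python
def combine(*arrays):
    """Build vector elements from corresponding elements of scalar Arrays.

    All inputs must have the same nesting and the same extents.
    """
    first = arrays[0]
    if isinstance(first, list):
        if any(not isinstance(a, list) or len(a) != len(first) for a in arrays):
            raise ValueError("inputs must have the same dimensionality")
        # Recurse one dimension down, keeping the outer structure intact.
        return [combine(*parts) for parts in zip(*arrays)]
    if any(isinstance(a, list) for a in arrays):
        raise ValueError("inputs must have the same dimensionality")
    return tuple(arrays)  # one component taken from each input


realpart = [[1.0, 2.0], [3.0, 4.0]]
imagpart = [[0.5, 0.0], [-1.0, 2.0]]
print(combine(realpart, imagpart))
```

Each output element is a (real, imaginary) pair located at the same column and row as the input elements it was built from.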

When performing arithmetic on Arrays with the atomic functions a.add, a.sub, a.mul, and a.div, the data types of the input Arrays are promoted so that the data type of the output Array has the most inclusive data type. For example, Arrays of data type c and uc are promoted to data type s before the arithmetic operation if one of the input arguments to the arithmetic atomic functions is of data type s. The result of the operation will also be of type s. Similarly, Arrays of type c, uc, s, and us are promoted to type i if one of the arguments is of type i. All of the integer data types are promoted to f if necessary, and data type f is promoted to d if one of the arguments is double precision floating point. Finally, if any of the arguments is complex, then all of the others are promoted to complex. The vector data types are not generally convertible between one another, so no promotion occurs.
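One way to picture the promotion rules is as a ladder on which the result takes the highest rung among the inputs. The Python sketch below is a model of the rules as stated and not hbar's implementation. The text fixes the relative order of s, i, f, d, and com; where uc, us, and ui sit relative to their signed counterparts is an assumption, and promote is a hypothetical helper name.

```python
# Assumed ordering from least to most inclusive. The position of uc, us,
# and ui relative to c, s, and i is a guess and is not stated in the text.
LADDER = ["c", "uc", "s", "us", "i", "ui", "f", "d", "com"]


def promote(*dtypes):
    """Return the data type of the result of arithmetic on the given types."""
    if any(t not in LADDER for t in dtypes):
        # Vector data types: no promotion, so all inputs must already agree.
        if len(set(dtypes)) == 1:
            return dtypes[0]
        raise ValueError("vector data types are not promoted")
    return max(dtypes, key=LADDER.index)


print(promote("c", "s"))      # s
print(promote("us", "i"))     # i
print(promote("f", "d"))      # d
print(promote("d", "com"))    # com
```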

Internal Representation

Internally in computer memory, each Array is represented by a structure describing the format of the Array and a separate block of memory holding the data values. The structure describing the format of the Array specifies the data type, the rank, and the number of columns, rows, etc. The data for a particular Array is held in a single contiguous block of memory. Each element occupies space contiguous with the next element. For multidimensional Arrays, consecutive column indices address consecutive elements. Row indices vary next, followed by plane and slice indices. For example, a three-dimensional Array with two columns, three rows, and two planes represented by the string

(((1 2)(3 4)(5 6))((7 8)(9 10)(11 12)))

has the values 1 through 12 stored consecutively in memory. Arrays of data type c and uc are represented by 8-bit bytes, s and us are represented by pairs of bytes, i, ui, and f by four bytes, d by eight bytes, and complex by a pair of four-byte quantities. Each component of a vector Array is represented by a four-byte floating point number. All components of a single element are contiguous, and each successive element is contiguous with the previous element.

The atomic function a.shape can be used to change the dimensionality of an Array as long as the order of the Array elements in memory is not required to change. For example, the three-dimensional Array represented by the string above can be shaped into a one-dimensional Array using the command line

a.shape "(((1 2)(3 4)(5 6))((7 8)(9 10)(11 12)))" 12

The output of the command is a one-dimensional Array with the consecutive elements


(1 2 3 4 5 6 7 8 9 10 11 12)

Only the structure specifying the Array is modified, not the data block itself.
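The storage order described in this subsection can be written as a small index calculation. The Python sketch below models an Array as a flat data block plus a tuple of extents; it is an illustration and not hbar's internal structure. ArraySketch and its methods are hypothetical names.

```python
from math import prod


class ArraySketch:
    """A flat data block plus extents listed as (columns, rows, planes, slices)."""

    def __init__(self, data, extents):
        if prod(extents) != len(data):
            raise ValueError("extents do not match the number of elements")
        self.data = data                 # shared between views, never copied
        self.extents = tuple(extents)

    def offset(self, *indices):
        """Position in the data block of the element at (column, row, ...)."""
        if len(indices) != len(self.extents):
            raise IndexError("one index is required per dimension")
        position, stride = 0, 1
        for index, extent in zip(indices, self.extents):
            if not 0 <= index < extent:
                raise IndexError("index out of range")
            position += index * stride   # the column index varies fastest
            stride *= extent
        return position

    def at(self, *indices):
        return self.data[self.offset(*indices)]

    def shape(self, *extents):
        """Return a view with new extents over the same data block."""
        return ArraySketch(self.data, extents)


cube = ArraySketch(list(range(1, 13)), (2, 3, 2))   # 2 columns, 3 rows, 2 planes
print(cube.at(1, 2, 1))      # 12: column 1, row 2, plane 1
flat = cube.shape(12)
print(flat.at(11))           # 12: same element, addressed in one dimension
```

Reshaping builds a new descriptor around the same list, which mirrors the statement that only the structure specifying the Array is modified.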