PASCAL Programming: Abstract Data Structures and Files

PASCAL Programming: § 5: Abstract Data Structures and Files

Instructor: M.S. Schmalz

Computer program design can be made much easier by organizing information into abstract data structures (ADS). For example, one can model a table that has three columns and an indeterminate number of rows, in terms of an array with two dimensions: (1) a large number of rows, and (2) three columns. A key feature of modern computer programs is the ability to manipulate ADS using procedures or methods that are predefined by the programmer or software designer. This requires that data structures be specified carefully, with forethought, and in detail. This section is organized as follows:

5.1. Arrays and their Manipulation.

5.2. DOS Files and Turbo PASCAL.

5.3. PASCAL File I/O Commands.

In this introductory class, we will concentrate only on the data structures called arrays, which are discussed in Section 5.1. We then progress to a discussion of how PASCAL handles information stored in DOS files (Section 5.2), and review in detail the PASCAL commands for file input/output (I/O, Section 5.3). The section on file I/O will be updated periodically to reflect changes in Turbo PASCAL syntax and usage.

5.1. Arrays and their Manipulation.

We begin with several observations about the use of arrays in computer programs.

Observation. In the early days of computer programming, machines were dedicated to the task of computing tables of artillery trajectories (WWII) and tables of accounting and business inventory information (early 1950s). Thus, numerically intensive computing machines were designed to handle linear or two-dimensional arrays. When computer programming became better established and scientific applications came into vogue, the FORTRAN (FORmula TRANslation) language was developed, which supported multiply-dimensioned arrays.
Definition. An array is a data structure whose domain is a finite subset of Euclidean n-space Rⁿ.
Observation. Arrays in PASCAL are assigned the datatype of the elements that they contain, which can be one and only one datatype. For example, arrays can be integer-, real-, string-, or character-valued, but elements of more than one such type cannot be contained in a PASCAL array.
Example. The vector a = (2,3,-1,5,6,0,9,-7) is a one-dimensional integer-valued array. The first value, denoted by a(1), equals 2. The i-th value in the array is denoted by a(i), i = 1..8, because there are eight elements in the array.
Example. The two-dimensional array shown below has four columns and three rows. Each element of the array is referenced by its (row,column) coordinate. For example, the element whose value equals 9.2 is in row 3, column 4, which we write as a(3,4) = 9.2 .

Figure 5.1. An example of a two-dimensional array.
Remark. The use of row-column indices or coordinates makes referencing elements of arrays convenient. It is especially useful to note that arrays can be indexed in loops. For example, a loop that would set all the elements of the array a in Figure 5.1 to zero could be written in pseudocode as:
```
         :
       DECLARE a : array [3,4] of real;
         :
       FOR i = 1 to 3 DO:
         FOR j = 1 to 4 DO:
           a(i,j) := 0.0
         ENDFOR
       ENDFOR
```
and in PASCAL as:
```
         :
       VAR a    : array [1..3,1..4] of real;
           i, j : integer;   { loop indices }
         :
       FOR i := 1 to 3 DO
         FOR j := 1 to 4 DO
           a[i,j] := 0.0 ;
```
Observation. In the above PASCAL code fragment, each dimension of the array a has a lower and upper limit to the subscripts that are allowed. One finds the size of each dimension by subtracting the lower limit from the upper limit, then adding one.
Example. If an array is dimensioned as:
```
     VAR b : array [-2..3,5..9,4..8] of integer; 
```
then the number of elements in the array is computed as follows:
```
     STEP 1: Dimension #1 is of size (3 - (-2) + 1) = 6
     STEP 2: Dimension #2 is of size (9 - 5 + 1) = 5
     STEP 3: Dimension #3 is of size (8 - 4 + 1) = 5
     STEP 4: Multiply the dimension sizes:  
            
                      N_elements = 6 x 5 x 5 = 150
       
```
to find that there are 150 elements in the array.
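The same calculation can be sketched as a short program. This is only an illustration: the constant names Lo1..Hi3 and the program name are hypothetical, chosen here to mirror the subscript limits of b. The last line uses the standard SizeOf function to report how much memory the array occupies, a figure that depends on how many bytes the compiler uses for an integer.

```pascal
program CountElements;
{ Sketch: counts the elements of b from its subscript limits. }
{ The constant names below are assumptions made for this example. }
const
  Lo1 = -2;  Hi1 = 3;   { limits of dimension #1 }
  Lo2 = 5;   Hi2 = 9;   { limits of dimension #2 }
  Lo3 = 4;   Hi3 = 8;   { limits of dimension #3 }
var
  b : array [Lo1..Hi1, Lo2..Hi2, Lo3..Hi3] of integer;
  n1, n2, n3 : integer;
begin
  { size of a dimension = upper limit - lower limit + 1 }
  n1 := Hi1 - Lo1 + 1;
  n2 := Hi2 - Lo2 + 1;
  n3 := Hi3 - Lo3 + 1;
  writeln('Dimension sizes: ', n1, ' x ', n2, ' x ', n3);
  writeln('N_elements = ', n1 * n2 * n3);
  { memory used depends on the size of an integer on this compiler }
  writeln('Bytes occupied by b = ', SizeOf(b));
end.
```

The first two lines of output should report sizes of 6 x 5 x 5 and a total of 150 elements, matching the steps above.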
In the early days of computing, it was very important to know how large arrays were, because computer memory was extremely limited. Today, with large memory models, one still must be careful not to specify array dimensions too large, but it is less of a problem than in the past.
Programming Hint: As we mentioned in class, one should always initialize any program variable that does not receive its value from file data before that variable is used in an expression. One can use the preceding loop structure to assign initial values to array elements in an efficient manner.
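As an illustration of this hint, the loop fragment shown earlier can be placed in a complete program. The following is a minimal sketch, not a required form: the program name and the variable total are assumptions introduced for the example. After initialization, the program sums the elements as a simple check that every one of them was set.

```pascal
program InitArray;
{ Sketch: initializes every element of a, then verifies the result. }
var
  a     : array [1..3, 1..4] of real;
  i, j  : integer;    { row and column indices }
  total : real;       { hypothetical check variable }
begin
  { set all twelve elements to zero }
  for i := 1 to 3 do
    for j := 1 to 4 do
      a[i,j] := 0.0;

  { total is initialized too, since it is used in an expression below }
  total := 0.0;
  for i := 1 to 3 do
    for j := 1 to 4 do
      total := total + a[i,j];

  writeln('Sum of all elements after initialization: ', total:4:1);
end.
```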

A key problem with arrays is that they have fixed size. Hence, they are called static data structures. In a more advanced class, we would examine techniques for programming data structures called lists, which can expand and contract with the data that one puts into or takes out of the list. Another problem of arrays, which we mentioned previously, is that they are statically typed, i.e., they cannot store data of arbitrary type, but only the type of data assigned to them in the VAR statement in which they were declared. It is interesting to note that certain languages, such as SNOBOL and ICON, have circumvented this difficulty by providing a TABLE data structure that can hold data of any type. You are not responsible for knowing about the TABLE data structure, however, since it is not available in PASCAL.

5.2. DOS Files and Turbo PASCAL.

Turbo PASCAL supports many different types of file operations through its DOS interface, which is transparent to the user. In Section 2, we mentioned file operations such as open, read, write, and close, which we now discuss in some detail.

In order to understand file structures, it helps to think of a disk drive in terms of a drawer in a filing cabinet. In each drawer, there are many files, which are usually contained in manila folders. In order to view, create, or modify the contents of a folder, one must first retrieve and then open the folder. This is similar to initializing and opening a computer disk file.

If one wants to view the file contents, then one must read the file, which holds for either physical or computer files. Similarly, creating new file contents or modifying existing file information is accomplished by writing to the file.

After one has completed operations on a given file, it is returned to the file cabinet, to keep the work area neat (this helps one find the file when it is next needed). A similar situation holds for computer disk files, where one closes the file in order to deallocate file pointers assigned by the file I/O library and runtime module.

Additional file operations, such as changing file read/write permissions, are also possible. However, these belong to more advanced topics, and are not part of the basic file operations reviewed in these class notes.

5.3. PASCAL File I/O Commands.

The PASCAL language provides constructs for allocating or initializing, opening, reading, writing, and closing files. File addresses or references are expressed in terms of symbolic file handles, which are represented in PASCAL as names assigned to a given file. The following commands pertain:

ASSIGN statement:

Purpose: The ASSIGN statement provides a mechanism for linking a disk file to a symbolic name or file handle.

Syntax: ASSIGN (file-handle, file-pathname) ; , where

file-handle is a file variable declared with type text

file-pathname is a DOS pathname of the file to be referenced by file-handle

Example:

VAR handle : text ;
    :
ASSIGN(handle,'A:/PROJECTS/PROJ-2.DAT');
    :

Notes: In the preceding example, the string constant 'A:/PROJECTS/PROJ-2.DAT' may be replaced by a string variable that contains the pathname. Also, the file referenced by this path must be a text file, as noted in the VAR statement that precedes the ASSIGN statement.
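The string-variable form mentioned in the note might look like the sketch below. The file name PROJ-2.DAT in the current directory and the variable name pathname are assumptions made for this example. Note that ASSIGN only links the name to the handle; the file itself is not touched until it is opened.

```pascal
program AssignWithVariable;
{ Sketch: ASSIGN with a string variable instead of a string constant. }
var
  handle   : text;
  pathname : string;
begin
  pathname := 'PROJ-2.DAT';   { hypothetical file in the current directory }
  assign(handle, pathname);   { links the name to handle; no disk access yet }
  writeln('handle now refers to ', pathname);
end.
```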

There are two types of file opening statements, one of which opens the file for reading (input), the other for writing (output).

RESET statement:

Purpose: The RESET statement opens a file for reading.

Syntax: RESET (file-handle) ; , where

file-handle is a file handle previously linked to a disk file with the ASSIGN statement

Example:

 RESET (handle);

REWRITE statement:

Purpose: The REWRITE statement opens a file for writing.

Syntax: REWRITE (file-handle) ; , where

file-handle is a file handle previously linked to a disk file with the ASSIGN statement

Example:

 REWRITE (handle);

PASCAL has two ways of reading or writing to a file. The READLN (or WRITELN) statement reads (or writes) a sequence of characters terminated by a newline character or a carriage return, whereas the READ (or WRITE) statement treats the file as a stream of characters. Since we have already covered the READLN and WRITELN statements in class, which we used to obtain input from the computer keyboard and send output to the monitor, we herein discuss the READ and WRITE statements only. The syntax of READLN and WRITELN parallels that of the READ and WRITE statements.
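The difference between the two forms can be seen in a small experiment. The sketch below is illustrative only: the scratch file NUMBERS.TXT and the variable names are assumptions. The program first writes two lines of numbers to the file, then reads them back with a mixture of READLN and READ.

```pascal
program ReadVersusReadln;
{ Sketch: READLN discards the rest of the line; READ does not. }
var
  f       : text;
  a, b, c : integer;
begin
  assign(f, 'NUMBERS.TXT');   { hypothetical scratch file }
  rewrite(f);
  writeln(f, '10 20');        { line 1 }
  writeln(f, '30 40');        { line 2 }
  close(f);

  reset(f);
  readln(f, a);   { reads 10, then skips the rest of line 1 }
  read(f, b);     { reads 30 from line 2 }
  read(f, c);     { reads 40, continuing on the same line }
  close(f);

  writeln(a, ' ', b, ' ', c);   { prints 10 30 40 }
end.
```

The value 20 is never read, because READLN moved past the end of the first line before the next READ was issued.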

READ statement:

Purpose: The READ statement inputs data from the keyboard or a file as a stream of characters.

Syntax: READ ([file-handle], [I/O-list]) ; , where

file-handle is the optional file handle of a file that was opened for reading with the RESET statement

I/O-list is a list of variables that receive the data read from the file referenced by file-handle.

Example:

 READ (handle, a[1], letter, x, y);

Notes: If the file handle is omitted, then the Turbo PASCAL runtime module understands that the input is being taken from the computer keyboard. For those who may be C programmers or have worked with UNIX systems (versus DOS), this is called "stdin", which is an abbreviation for "standard input".

WRITE statement:

Purpose: The WRITE statement outputs data to the monitor or a file as a stream of characters.

Syntax: WRITE ([file-handle], [I/O-list]) ; , where

file-handle is the optional file handle of a file that was opened for writing with the REWRITE statement

I/O-list is a list of variables or expressions, with optional format specifiers, whose values are written to the file referenced by file-handle.

Example:

 WRITE (handle, a[1], letter, x, y);

Notes: If the file handle is omitted, then the Turbo PASCAL runtime module understands that the output is to be directed toward the computer monitor. For those who may be C programmers or have worked with UNIX systems (versus DOS), this is called "stdout", which is an abbreviation for "standard output".
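The optional format specifiers in the I/O-list take the form value:width for any item and value:width:decimals for real numbers. The following minimal sketch omits the file handle, so the output goes to the monitor; the program and variable names are assumptions made for the example.

```pascal
program WriteFormats;
{ Sketch: field-width and decimal-place specifiers in WRITE. }
var
  n : integer;
  x : real;
begin
  n := 42;
  x := 3.14159;
  write('[', n:5, ']');       { integer right-justified in 5 columns }
  write('[', x:8:2, ']');     { real in 8 columns with 2 decimal places }
  writeln;                    { prints [   42][    3.14] }
end.
```

The same specifiers can be used when a file handle is supplied as the first argument.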

CLOSE statement:

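Purpose: The CLOSE statement closes the file on disk, writing out any buffered output, and releases the system resources tied to the file handle that was linked with the ASSIGN statement.

Syntax: CLOSE (file-handle) ; , where

file-handle is the file handle of a file that was opened with the RESET or REWRITE statement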

Example:

 CLOSE (handle);

Notes: The CLOSE statement has a delete option that we do not recommend for introductory users. Unfortunately, this option does not have any "safety checks", so a filename entry error can easily delete data or programs on which you have spent much time.
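The commands summarized in this section can be combined into one small program. The sketch below is one possible arrangement, not a prescribed one: the file name DEMO.DAT and all variable names are assumptions. It writes three array elements to a text file, closes the file, then reopens it and reads the values back.

```pascal
program FileRoundTrip;
{ Sketch: ASSIGN, REWRITE, WRITE, CLOSE, RESET, READ, CLOSE in sequence. }
var
  handle : text;
  a      : array [1..3] of integer;
  i, v   : integer;
begin
  { fill the array with sample values 10, 20, 30 }
  for i := 1 to 3 do
    a[i] := 10 * i;

  assign(handle, 'DEMO.DAT');   { hypothetical file in the current directory }

  rewrite(handle);              { open for writing }
  for i := 1 to 3 do
    write(handle, a[i], ' ');   { a blank separates the numbers }
  close(handle);

  reset(handle);                { reopen the same file for reading }
  for i := 1 to 3 do
  begin
    read(handle, v);            { READ skips the separating blanks }
    write(v, ' ');              { no handle: output goes to the monitor }
  end;
  writeln;
  close(handle);
end.
```

Note that the blank written after each number matters: without a separator, the digits would run together in the file and could not be read back as three distinct integers.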

This concludes our summary of PASCAL file commands. Details of command usage, options, formatting, etc. will be given in class.