DATA

1. DATA ATTRIBUTES

DATA - "that which is given". A data item has a number of attributes:

An Address.
A Value which in turn has:
- An Internal Representation comprising one or more bits and
- An interpretation of that representation.
A set of Operations that may be performed on it.
A Name to tie the above together (also referred to as a label or identifier).

Address

The address or reference of a data item is the physical location in memory (computer store) of that data item.
The amount of memory required by a particular data item is dictated by its type, e.g. integer, character, etc.

Value

The binary number stored at an address associated with a data item is its value.
The required interpretation is dictated by the type of the data item.
Consider 01011010, this can be interpreted as:
- The Decimal integer 90
- The Hexadecimal integer 5A
- The Octal integer 132
- The ASCII character (capital) Z
- Etc.
The set of symbols used to express the value of a given data item is sometimes referred to as a literal.

Operations

The type of a data item also dictates the operations that can be performed on it. For example numeric data types can have the standard mathematical operations performed on them while string data types cannot.

Names

To do anything useful with a data item all the above must be tied together by giving each data item an identifying name (sometimes referred to as a label or an identifier).

Names are usually made up of alpha-numeric characters.
White space is usually not allowed (an exception is Algol 60).
Some languages (Ada and C included) allow underscores.
Some languages restrict the number of characters (first 31 in ANSI C).
Some allow any length but treat the first N characters as significant.
Case may be significant: in Algol 60, C and Modula 2 case matters; in Pascal and Ada it does not.
Key/reserved words cannot be used as names.

It is also useful to adopt some sort of naming convention, for example ending all constant data item names with "const".

Summary

Data Items comprise: (1) a name, (2) an address and (3) a value (Figure 1). Knowledge of the data item name allows access to the address which in turn allows access to the value. It is important to distinguish between the different components of a data item.

Figure 1: Components of a data item (notation after Baron 1977)

The type of a data item dictates:

The range of possible values.
The amount of storage required for its internal representation.
The interpretation of the internal representation.
The operations that can be performed on the item.

Data items can be classified as being either global or local data items, and as being either variables or constants. The nature of a data item is defined through a data declaration.

Bal and Grune (1994) summarise the inter-relationship between the different data item attributes using a diagram of the form presented in Figure 2. The components of the diagram are self-explanatory.

Figure 2: The inter-relationship between data item attributes (Bal and Grune 1994)

2. DECLARATION, ASSIGNMENT AND INITIALISATION

Data declaration

A data declaration introduces a data item. This is typically achieved by associating the item (identified by a name) with a data type. Examples in Ada, Pascal and C:

NUMBER: integer;     number: integer;     int number;

Here we have declared a data item called number as an integer. Note that by convention, in Ada and Pascal (which are not case sensitive), we distinguish between reserved words and user defined labels by writing one or the other using upper-case characters.

In some imperative languages we can specify the possible values for a data item, in which case the compiler may deduce the type of a data item according to the nature of these values. Examples in Pascal:

number: 1..10;               letter: 'A'..'C';

Here the first item, number, will be considered to be an integer (because the possible values for the item are integers); and the second item, letter, will be considered to be a character (because the possible values for the item are characters).

Assignment

Assignment is the process of associating a value with a data item. This is usually achieved using an assignment operator. Examples in Ada, Pascal and C:

NUMBER := 2;          number := 2;          number = 2;

Here we have assigned the value 2 to a data item called "number". Note that the assignment operator in Ada and Pascal is := and in C is =. This is a common distinction between the Algol and C styles of programming language.

Initialisation

Initialisation is the process of assigning an initial value to a data item. Some imperative languages allow this to be done on declaration (e.g. C and Ada); others do not (e.g. Pascal). Examples in Ada and C:

NUMBER: integer:= 2;                         int number = 2;

(In Pascal you must first declare a data item and then assign a value to it, consequently the concept of initialisation does not exist in Pascal).

Ordering of declarations

In some imperative languages (notably Pascal) declarations must be presented in a certain order, e.g. constants before variables. In Pascal constants (see below) are grouped together in a CONST section, followed by variables grouped together in a VAR section.

Multiple Assignment

Many imperative languages (Ada and Pascal, but not C) support the concept of multiple assignment where a number of variables can be assigned a value (or values) using a single assignment statement. Examples in Ada:

NUMBER1, NUMBER2 := 2;

In Pascal we do not have to assign the same value to each variable when using a multiple assignment statement:

number1, number2 := 2, 4;

In Pascal we can also write:

number1, number2 := number2, number1;

which has the effect of "doing a swap".

Positioning of declaration statements

In languages such as C, Ada and Pascal data declarations are expected to be found at the start of a procedure or function definition (with the exception of C global data items --- see below). Some languages allow declarations to occur anywhere within a procedure or function definition (for example the object oriented languages Java and C++). Whatever the case remember that:

A data item cannot be used until it is declared.
Good software engineering practice dictates that declarations should be arranged in some sensible and systematic manner, for example immediately before the first time they are used.

3. GLOBAL AND LOCAL DATA

Data items have associated with them:

A life time - the period during the running of a program when they can be said to exist.
A visibility - the parts of the program from where they can be accessed ("seen").

The nature of the life time and visibility of a data item is dictated by what are referred to as scoping rules. In this respect there are two types of data:

Global data, which has a life time equivalent to an entire program and is visible from anywhere within that program.
Local data, whose life time and visibility is in some way limited.

4. VARIABLES

Given that we can always replace a particular bit pattern with another, the value of a data item can always be changed.
Imperative languages support such change.
Data items that are intended to be used in this way are referred to as variables.
The value associated with a variable can be changed using an assignment operation.
In most imperative languages (including Ada and C) data items are assumed to be variables by default (in Pascal we must include the key word VAR in the declaration section --- example given below).

Uninitialised Variables

It is possible to declare a data item that has no value (other than an arbitrary bit pattern).
Such a data item is referred to as an uninitialised variable.
Uninitialised variables are dangerous!
Individual imperative languages treat the phenomenon of uninitialised variables differently. Common approaches include:
1. Ignore the problem and leave it up to the programmer not to use them.
2. Consider their usage to represent an error.
3. Allocate an appropriate value to such variables.
Most imperative languages either ignore the problem completely (Modula 2), or at least to a large extent (C and Ada).
- Ada initialises pointers to NULL and otherwise ignores the problem.
- C initialises global variables to some appropriate "zero" value and otherwise ignores the problem.

5. CONSTANTS

It is sometimes desirable to define a data item whose value cannot be changed. Such data items are referred to as constants and are usually indicated by incorporating a predefined key word into the declaration. Examples in Ada and C:

LABEL: constant:= 2;                const int label = 2;

Note that in Ada we do not declare the data type of the constant because this can be deduced from its value. In Pascal we declare constants by grouping them together in a const section:

const
	labelConst = 2;        {label itself is a reserved word in Pascal}
	pi         = 3.14159;
var
	number : integer;

We can think of a constant as a data item comprising only a name and a value (Figure 3), i.e. no address.

Figure 3: A constant data item

However it should be appreciated that what we are doing here is telling the compiler to "flag" the data item as a constant (i.e. we are only instigating software protection). Theoretically we can still change the bit pattern representing the value.

6. EXAMPLE PROGRAMS

At this stage in the discussion it is appropriate to present a number of example programs that feature different kinds of data. The first, Table 1, is a C program that has two global data items, a variable GLOBAL_VAR and a constant GLOBAL_CONST. Note that:

By convention, in C, global data item names are usually presented using upper case characters.
stdio.h is a library (header) file that declares the IO functions.
Global data items are declared at the top of a program outside of any functions or procedures.
Like Java, a C program must include a function main from which execution commences.

The program outputs the four data items, performs some arithmetic, and then outputs the same data items.

#include <stdio.h>

int       GLOBAL_VAR = 0;
const int GLOBAL_CONST = 1;

/* Main function */

int main(void) {
    int       localVar = 2;
    const int localConst = 3;

    /* All four items are integers, so each is printed with %d */
    printf("%d, %d, %d, %d\n",GLOBAL_VAR,GLOBAL_CONST,localVar,localConst);
    GLOBAL_VAR = GLOBAL_CONST + (localVar*localConst);
    printf("%d, %d, %d, %d\n",GLOBAL_VAR,GLOBAL_CONST,localVar,localConst);
    return 0;
}

Table 1: C program

The code in Table 2 does exactly the same thing as that presented in Table 1 except that it is written in Ada. Note that:

The code, in common with all Algol style languages, is block nested, i.e. functions/procedures are nested within one another to any level such that they are all encased in one "top-level" function/procedure.
The reference to TEXT_IO is a reference to the standard library package that performs Ada text IO; integer IO additionally requires an instance of its generic package INTEGER_IO (like Java, input/output is not straightforward in Ada).

with TEXT_IO;
use  TEXT_IO;

procedure TOP_LEVEL is
    package INT_INOUT is new INTEGER_IO(integer);
    use     INT_INOUT;
    GLOBAL_VAR   : integer  := 0;
    GLOBAL_CONST : constant := 1;
    
    --------- SECOND LEVEL PROCEDURE ---------
    procedure PROC_1 is
        LOCAL_VAR   : integer  := 2;
        LOCAL_CONST : constant := 3;
    begin
        put(GLOBAL_VAR);
        put(", ");
        put(GLOBAL_CONST);
        put(", ");
        put(LOCAL_VAR);
        put(", ");
        put(LOCAL_CONST);
        new_line;

        -- Sum

        GLOBAL_VAR := GLOBAL_CONST + (LOCAL_VAR*LOCAL_CONST);

        -- Output

        put(GLOBAL_VAR);
        put(", ");
        put(GLOBAL_CONST);
        put(", ");
        put(LOCAL_VAR);
        put(", ");
        put(LOCAL_CONST);
        new_line;
    end PROC_1;

---------- TOP LEVEL --------
begin
    PROC_1;
end TOP_LEVEL;

Table 2: Ada program

Finally Table 3 gives a Pascal program with the same functionality as that presented in Tables 1 and 2. Note here:

The reference to output is again a reference to library files.
Data items are ordered in a declaration section.
Initialisation is not permitted.
Pascal, like Ada, is a block nested language where global data items are defined in the "outer most" level.

program myProg(Output);
{Example program}
const
    globalConst = 1;
var
    globalVar : integer;

    procedure proc1;
    {second level procedure}
    const
        localConst = 3;
    var
        localVar   : integer;

    begin {proc1}
        localVar := 2;
        write(globalVar);
        write(', ',globalConst);
        write(', ',localVar);
        writeLn(', ',localConst);

        {Sum}

        globalVar := globalConst + (localVar*localConst);

        {Output}

        write(globalVar);
        write(', ',globalConst);
        write(', ',localVar);
        writeLn(', ',localConst);
    end; {proc1}

begin {myProg}
    globalVar := 0;
    proc1;
end. {myProg}

Table 3: Pascal program

7. ANONYMOUS DATA ITEMS

Often we use data items in a program without giving them a name; such data items are referred to as anonymous data items. Examples include (in C):

4 + (5*6)                       number + (5*6)

Here the arithmetic sub-expression, (5*6), is processed first and the result stored somewhere as a data item which has a value and an address. However it has no name, and consequently such a data item is known as an anonymous data item. The significance is that anonymous data items cannot be changed or used again in other parts of a program. Conceptually we can think of an anonymous data item as a data item without a name (Figure 4); consequently we cannot access its address (or its value).

Figure 4: An anonymous data item

To summarise the above:

Data items can be variables or constants.
Variables or constants can be global or local.
Data items are "introduced" using a declaration statement.
Data items have an "initial" value associated with them through a process known as initialisation.
The value associated with a variable can be changed using an assignment operation.

8. RENAMING (ALIASING)

It is sometimes useful, given a particular application, to rename (alias) particular data items, i.e. to provide a second access path to them. This is supported by some imperative languages such as Ada (but not C or Pascal). Ada Example:

ITEM: integer := 1;
ONE: integer renames ITEM;

Here we have declared a data item ITEM and then allocated a second name (ONE) to the item. Thus, conceptually, renaming creates a data item with more than one name (but only one address and consequently only one value), i.e. we have more than one access route to the address (Figure 5).

Figure 5: Aliasing

It is dangerous to rename variables. Consider the following Ada program:

with CS_IO; use CS_IO;

procedure RENAME is
        ITEM : integer;
        ONE  : integer renames ITEM;
begin
        get(ITEM);

        put("ITEM = "); put(ITEM); new_line;
        put("ONE = "); put(ONE); new_line; new_line;

        ITEM:= ITEM*10;
        put("ITEM = "); put(ITEM); new_line;
        put("ONE = "); put(ONE); new_line; new_line;

        ONE:= ONE*10;
        put("ITEM = "); put(ITEM); new_line;
        put("ONE = "); put(ONE); new_line;
end RENAME;

The result, for an input value of 1, will be as shown below:

ITEM = 1
ONE = 1
ITEM = 10
ONE = 10
ITEM = 100
ONE = 100

Note that any change made to data item ITEM results in an identical change to data item ONE (and vice versa). This is because both names represent the same data item.

9. OVERLOADING

Whereas renaming (aliasing) binds more than one name to one data item, overloading binds two or more data items (usually of different types) to one name (Figure 6).

Figure 6: Overloading

Care must be taken not to cause ambiguity!
Neither Ada nor C explicitly supports overloading of data items, however:
1. In C overloading can be implemented by declaring a compound type known as a union.
2. In Ada the same effect can be contrived using an appropriately defined record.
More generally overloading can best be illustrated by considering the '+' and '-' arithmetic operators available in all imperative languages.
Traditionally these operators are overloaded to allow both integer and floating point addition and subtraction:

 5 + 3                   5 - 3
5.6 + 3.2               5.6 - 3.2

We say that the '+' and '-' operators are overloaded.
C also allows mixing of operands (Ada does not) - so called mixed mode arithmetic:

 5 + 3.2               5.6 - 3
5.6 + 3                   5 - 3.2

REFERENCES

Bal, H.E. and Grune, D. (1994). Programming Language Essentials. Addison-Wesley.


Created and maintained by Frans Coenen. Last updated 03 July 2001