z-logo
open-access-imgOpen Access
Analysis and Optimization of Set Expressions
Author(s) -
R. Nigel Horspool,
L. W. Dunkelman
Publication year - 1982
Publication title -
the computer journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.319
H-Index - 64
eISSN - 1460-2067
pISSN - 0010-4620
DOI - 10.1093/comjnl/25.3.340
Subject(s) - pascal (unit) , computer science , set (abstract data type) , algorithm , mathematical optimization , mathematics , programming language
The Pascal programming language, of all the commonly used languages, is unusual in providing sets. A program may manipulate sets of integers (or sets of any scalar type) and perform the standard set operations of union, intersection and difference. In addition, Pascal defines a set membership test, set equality tests, set inclusion tests and a function for calculating the cardinality of a set. Some less widely available languages that provide sets and set operations are SETL, MODULA, EUCLID and SUE. Of these, all but SETL owe much of their character to Pascal. Many different data structures are suitable for the implementation of sets. Possibilities include data structures based on linked lists, on binary trees,on 2-3 trees, on trie memory and on hash tables.' However, the most convenient data structure from the compiler writer's point of view is a bit vector. With such a representation, a one bit in index position i indicates the presence of / in the set (and a zero indicates its absence). Of course, it is not mandatory for the first bit to correspond to index value 1 (or 0); nor is it necessary for set elements to be integers (we can use the internal binary representation of almost any datatype to index the bit vector). Bit vectors provide the representation that is most economical in storage if the range of permissible element values is not great. Furthermore, set operations can be directly compiled into logical instructions that are almost always available on the target computer. For example, a set union may be compiled into an 'OR' instruction. The Pascal Report (Ref. 9, report section 14) advocates the bit vector representation for sets and it explicitly permits implementors to limit the allowed ranges of element values for this purpose. If bit vectors are used to implement sets, how large should these vectors be? In several Pascal implementations, including the original Zurich compiler for the CDC 6000, all sets occupy exactly one word of memory. For the Zurich compiler, this leads to the restriction that the elements of sets must have (internal) values that lie in the range 0-58 [Ref. 9, user manual section 13.C]. Giving all sets the same storage size and picking some convenient unit, such as a word or a doubleword, for this size is unnecessarily restrictive. On some computers this may have unfortunate consequences. A typical problem

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom