names, and part numbers are. We shall call the set of
values represented at some instant the active domain at that
instant.
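
As a minimal sketch of the active domain idea, assume a relation is
held as a Python set of tuples and a domain is addressed by its
position. The sample rows and the helper name active_domain are
hypothetical, chosen only for illustration.

```python
# Hypothetical rows of a part relation: (part number, part name, part color).
PART = {
    (1, "bolt", "red"),
    (2, "nut", "green"),
    (3, "washer", "red"),
}


def active_domain(relation, position):
    """Return the set of values of one domain that are represented right now."""
    return {row[position] for row in relation}


# The pool of possible colors may be larger; only these are currently present.
current_colors = active_domain(PART, 2)
```

The active domain changes as tuples are inserted and deleted, while the
underlying domain (the pool of admissible values) does not.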

Normally, one domain (or combination of domains) of a
given relation has values which uniquely identify each element
(n-tuple) of that relation. Such a domain (or combination)
is called a primary key. In the example above,
part number would be a primary key, while part color
would not be. A primary key is nonredundant if it is either
a simple domain (not a combination) or a combination
such that none of the participating simple domains is
superfluous in uniquely identifying each element. A relation
may possess more than one nonredundant primary
key. This would be the case in the example if different parts
were always given distinct names. Whenever a relation
has two or more nonredundant primary keys, one of them
is arbitrarily selected and called the primary key of that
relation.
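
The two definitions above can be sketched as checks on a single
snapshot of a relation. This is an illustration under stated
assumptions: the relation is a list of tuples, domains are addressed by
position, and the helper names and sample rows are hypothetical. A
check on one snapshot cannot show that a domain is a key for every
state the relation may take over time.

```python
from itertools import combinations

# Hypothetical snapshot: (part number, part name, part color).
PART = [
    (1, "bolt", "red"),
    (2, "nut", "red"),
    (3, "washer", "green"),
]


def is_key(relation, positions):
    """True if the values at `positions` uniquely identify every tuple."""
    projected = [tuple(row[p] for p in positions) for row in relation]
    return len(projected) == len(set(projected))


def is_nonredundant_key(relation, positions):
    """True if `positions` is a key and no proper nonempty subset is a key."""
    if not is_key(relation, positions):
        return False
    return not any(
        is_key(relation, subset)
        for size in range(1, len(positions))
        for subset in combinations(positions, size)
    )
```

In this snapshot part number alone is a key, part color is not, and the
combination (part number, part name) is a key but a redundant one,
since part number already suffices.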

A common requirement is for elements of a relation to
cross-reference other elements of the same relation or
elements of a different relation. Keys provide a user-oriented
means (but not the only means) of expressing such
cross-references. We shall call a domain (or domain combination)
of relation R a foreign key if it is not the primary key
of R but its elements are values of the primary key of some
relation S (the possibility that S and R are identical is not
excluded). In the relation supply of Figure 1, the combination
of supplier, part, project is the primary key, while each
of these three domains taken separately is a foreign key.
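
The value condition in the foreign key definition can be sketched as a
subset test between projections. The rows below are hypothetical (they
are not the contents of Figure 1), and the helper names project and
references are assumptions made for this illustration. The sketch
checks only that the values of R's domain appear among the primary key
values of S; it does not check the other half of the definition, that
the domain is not itself the primary key of R.

```python
# Hypothetical snapshots.
# PART rows:   (part number, part name)
# SUPPLY rows: (supplier, part, project, quantity)
PART = {(1, "bolt"), (2, "nut"), (3, "washer")}
SUPPLY = {(10, 1, 100, 17), (10, 2, 100, 23), (20, 2, 200, 9)}


def project(relation, positions):
    """Project a relation onto the domains at `positions`."""
    return {tuple(row[p] for p in positions) for row in relation}


def references(r, r_positions, s, s_key_positions):
    """True if every value of R at `r_positions` is a primary key value of S."""
    return project(r, r_positions) <= project(s, s_key_positions)


# The part domain of SUPPLY cross-references the primary key of PART.
ok = references(SUPPLY, (1,), PART, (0,))
```

Adding a supply tuple that names a part number absent from PART would
make the same check fail.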

In previous work there has been a strong tendency to
treat the data in a data bank as consisting of two parts, one
part consisting of entity descriptions (for example,
descriptions of suppliers) and the other part consisting of
relations between the various entities or types of entities (for
example, the supply relation). This distinction is difficult
to maintain when one may have foreign keys in any
relation whatsoever. In the user's relational model there
appears to be no advantage to making such a distinction
(there may be some advantage, however, when one applies
relational concepts to machine representations of the user's
set of relationships).

So far, we have discussed examples of relations which are
defined on simple domains, that is, domains whose elements are
atomic (nondecomposable) values. Nonatomic values can
be discussed within the relational framework. Thus, some
domains may have relations as elements. These relations
may, in turn, be defined on nonsimple domains, and so on.
For example, one of the domains on which the relation
employee is defined might be salary history. An element of the
salary history domain is a binary relation defined on the
domain date and the domain salary. The salary history domain
is the set of all such binary relations. At any instant of time
there are as many instances of the salary history relation
in the data bank as there are employees. In contrast, there
is only one instance of the employee relation.
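
A nonsimple domain can be sketched by letting one component of each
employee tuple be a relation in its own right. The employee numbers,
names, dates, and salaries below are hypothetical, and representing
each salary history as a frozenset of (date, salary) pairs is an
assumption made so that it can sit inside a tuple.

```python
from datetime import date

# One instance of the employee relation: (employee number, name, salary history).
# Each salary history is itself a binary relation on (date, salary).
EMPLOYEE = {
    ("E1", "Jones", frozenset({(date(1968, 1, 1), 9000),
                               (date(1969, 1, 1), 9500)})),
    ("E2", "Blake", frozenset({(date(1969, 6, 1), 8000)})),
}


def salary_history_instances(employee):
    """Collect the salary history relation held by each employee tuple."""
    return [history for (_number, _name, history) in employee]


histories = salary_history_instances(EMPLOYEE)
```

There is one EMPLOYEE set, but as many salary history relations as
there are employee tuples.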

The terms attribute and repeating group in present data
base terminology are roughly analogous to simple domain

Volume 13 / Number 6 / June, 1970

In many commercial, governmental, and scientific data
banks, however, some of the relations are of quite high
degree (a degree of 30 is not at all uncommon). Users should
not normally be burdened with remembering the domain
ordering of any relation (for example, the ordering supplier,
then part, then project, then quantity in the relation supply).
Accordingly, we propose that users deal, not with relations
which are domain-ordered, but with relationships which are
their domain-unordered counterparts.2 To accomplish this,
domains must be uniquely identifiable at least within any
given relation, without using position. Thus, where there
are two or more identical domains, we require in each case
that the domain name be qualified by a distinctive role
name, which serves to identify the role played by that
domain in the given relation. For example, in the relation
component of Figure 2, the first domain part might be
qualified by the role name sub, and the second by super, so
that users could deal with the relationship component and
its domains (sub.part, super.part, quantity) without regard
to any ordering between these domains.
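
The step from a domain-ordered relation to its domain-unordered
counterpart can be sketched by pairing every value with its
role-qualified domain name. The rows are hypothetical (they are not
claimed to be the contents of Figure 2), and the helper name
to_relationship is an assumption made for this illustration.

```python
def to_relationship(relation, domain_names):
    """Pair each value with its (role-qualified) domain name, dropping position."""
    return {frozenset(zip(domain_names, row)) for row in relation}


# Hypothetical component rows, ordered as (sub.part, super.part, quantity).
COMPONENT = {(1, 5, 9), (2, 5, 7)}
by_original_order = to_relationship(
    COMPONENT, ("sub.part", "super.part", "quantity")
)

# The same data with its domains permuted to (quantity, sub.part, super.part).
PERMUTED = {(qty, sub, sup) for (sub, sup, qty) in COMPONENT}
by_permuted_order = to_relationship(
    PERMUTED, ("quantity", "sub.part", "super.part")
)
```

Both orderings produce the same relationship, which is why the role
names sub and super are needed: without them the two part domains could
only be told apart by position.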

To sum up, it is proposed that most users should interact
with a relational model of the data consisting of a collection
of time-varying relationships (rather than relations). Each
user need not know more about any relationship than its
name together with the names of its domains (role
qualified whenever necessary).3 Even this information might be
offered in menu style by the system (subject to security
and privacy constraints) upon request by the user.

There are usually many alternative ways in which a
relational model may be established for a data bank. In
order to discuss a preferred way (or normal form), we
must first introduce a few additional concepts (active
domain, primary key, foreign key, nonsimple domain)
and establish some links with terminology currently in use
in information systems programming. In the remainder of
this paper, we shall not bother to distinguish between
relations and relationships except where it appears
advantageous to be explicit.

Consider an example of a data bank which includes
relations concerning parts, projects, and suppliers. One
relation called part is defined on the following domains:
(1) part number
(2) part name
(3) part color
(4) part weight
(5) quantity on hand
(6) quantity on order
and possibly other domains as well. Each of these domains
is, in effect, a pool of values, some or all of which may be
represented in the data bank at any instant. While it is
conceivable that, at some instant, all part colors are
present, it is unlikely that all possible part weights, part

2 In mathematical terms, a relationship is an equivalence class of
those relations that are equivalent under permutation of domains
(see Section 2.1.1).
3 Naturally, as with any data put into and retrieved from a
computer system, the user will normally make far more effective use
of the data if he is aware of its meaning.

330

Communications of the ACM