>>> On Mon, 23 Aug 1999 20:24:50 GMT, k...@orbital1.demon.co.uk (Karl
>>> Harbour) said:
karl> On 23 Aug 1999 00:02:08 +0100, pierc...@Dial.PIPEX.com (Piercarlo
karl> Grandi) wrote:
piercarl> All Date is doing is repeating the fairly ancient requirement
piercarl> that a DBMS to be called relational must have an extensible
piercarl> type system, i.e. the DBMS must allow the DBA to define new
piercarl> types (domains) other than numeric, text, currency, time,
piercarl> memo, ... (and other builtin ones), and new associated
piercarl> operators.
karl> He was trying to do a bit more than that. To quote Date from the
karl> article: "The question is how to integrate the good ideas of
karl> object-oriented database with relational ideas." (*)
But that's in effect the same thing I said. The "good ideas of OO
databases", as he has argued many times, amount to having an extensible
(OO, of course) type system. An extensible type system (and this in
practice means an OO one) is an essential requirement of the relational
model. So, we rejoice.
Most existing relational databases have a good idea, relation based data
modelling, and a bad one, an inflexible domain type system. Most
existing OO databases have a good idea, an extensible type system for
data domains, and a bad idea, network-style data modelling. Date simply
would like for the OO type system to appear in the context of relation
based data modeling systems; this is getting the good idea of OO
databases into relational databases.
It is also making relational databases become more fully relational,
because an extensible type system is an essential requirement of the
relational model. Most databases that are styled as "relational" out
there, including most popular commercial ones, should really be called
``quasi-relational'', for they do not fulfil some of the most important
requirements of the relational model, among them the ability to define
new domain types.
piercarl> There is no suggestion that a relation like 'employee' should
piercarl> be transformed into one such atomic type; indeed it would be
piercarl> abhorrent, because a relation
piercarl> e.g. 'employee(name*,dept*,salary) conveys semantic
piercarl> information that a type such as 'struct employee { string
piercarl> name, string dept, unsigned salary; }' does not.
karl> Did you read the article to which I was referring? Date says:
Date> Now, you have two ways to represent employees. You can have them
Date> represented by rows and tables, as we typically do in a relational
Date> system. Or you can have a domain of employees.
I actually read it, but I was commenting on what _you_ have written,
which does not make much sense, not what Date has written, which
does. What you had written is:
karl> I'll stick with my favourite example of employees in
karl> departments. How would you relate ("join") an employee to the
karl> department the employee works in?
karl> If I understand correctly, you can't, because the database views
karl> employees and departments as indivisible, atomic objects [ ... ]
Here it is not Date, who instead writes "How do you choose? That's
not my problem", but _you_ who are assuming that in some particular case
you want to do joins involving an "employee" relation having a single
field of domain type "employee".
This is abhorrent, as I have explained; a tuple in a relation has data
modeling semantics that are completely absent in a class type. If you
want to do a join you do it between relations, not between a relation
and a domain via a field of the relation and a subfield of a domain.
On rereading your lines above, let me try to ascribe perhaps too much
importance to a ``freudian'' slip in what you write:
karl> How would you relate ("join") an employee to the
                    ======
karl> department the employee works in?
The slip here is to use "relate" when describing a join. Well, in the
relational model the _only_ way to relate two data items is to put them
in the same relation; there is _no other way_. In a schema (in some
imaginary, simplified, DDL, like the other examples below) like
  domain unit: string
  domain person: string
  domain amount: currency
  relation employee(empname*:person,worksin:unit,salary:amount)
  relation department(deptname*:unit,managedby:person)
The only thing you can say is that 'empname', 'worksin', and 'salary' are
related, and 'deptname' and 'managedby' are related; that's it. There is no
_explicit_ relationship between 'worksin' and 'deptname' (you could then
add integrity constraints that define _implicit_ relationships, but not
a _relation_).
Now in order to relate data elements in the relational model one puts
them as fields of a relation; if they are independent fields in the same
relation, they are related; if not, they aren't.
In other words, this is the relational model's core feature: that
inter-relation access paths are not part of the schema; any application
may materialize any such path that is valid, such as the one involved in
the [equi]join between 'employee' and 'department' on their fields
'employee.worksin' and 'department.deptname', which is valid as the
domains of the two fields are the same, and there is a suitable
'operator ='.
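To make the point concrete, here is a minimal Python sketch (not anything
from Date's article) of an application materializing that access path
itself. The relations are plain sets of tuples; the sample rows and the
helper name 'equijoin' are made up for illustration, and salaries are
assumed to be held in cents.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    empname: str   # domain person
    worksin: str   # domain unit
    salary: int    # domain amount (assumed to be in cents)


class Department(NamedTuple):
    deptname: str   # domain unit
    managedby: str  # domain person


# Illustrative sample data; nothing in the schema links the two relations.
employee = {
    Employee("alice", "sales", 300000),
    Employee("bob", "research", 350000),
}
department = {
    Department("sales", "carol"),
    Department("research", "dave"),
}


def equijoin(left, right, left_field, right_field):
    """Pair each tuple of `left` with each tuple of `right` whose named fields are equal."""
    return {
        (l, r)
        for l in left
        for r in right
        if getattr(l, left_field) == getattr(r, right_field)
    }


# The access path exists only in this query, not in the schema.
works_for = {
    (e.empname, d.managedby)
    for e, d in equijoin(employee, department, "worksin", "deptname")
}
```

The join is legitimate only because 'worksin' and 'deptname' share a
domain (both plain strings here) for which equality is defined.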
Now what one puts inside a domain type or as a relation field is a
schema design decision. For example the above schema may well be
rewritten as:
  domain unit: string
  domain amount: currency
  domain person: class name:string, worksin:unit,salary:amount end
  employee(emp*:person)
  department(deptname*:unit,managedby:person)
In some applications this might well be appropriate. However, in such a
schema one cannot, _by design_, join 'employee' and 'department' on
'employee.emp' and 'department.deptname', because their domain types are
different; unless of course one extends the above by defining an
'operator =' for 'person' that takes a 'unit' parameter.
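A rough Python sketch of this situation, under the assumption that
domain types map onto classes and 'operator =' onto '__eq__'. The class
'PersonComparableToUnit', the 'join' helper and the sample values are
all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str


@dataclass(frozen=True)
class Person:
    """Atomic 'person' domain: equality is only defined against another Person."""
    name: str
    worksin: Unit
    salary: int


@dataclass(eq=False)
class PersonComparableToUnit:
    """Same domain, extended with an equality operator that accepts a Unit."""
    name: str
    worksin: Unit
    salary: int

    def __eq__(self, other):
        if isinstance(other, Unit):
            return self.worksin == other
        if isinstance(other, PersonComparableToUnit):
            return (self.name, self.worksin, self.salary) == (
                other.name, other.worksin, other.salary)
        return NotImplemented


def join(employee, department):
    """Equijoin of employee.emp with department.deptname."""
    return [(emp, dept) for emp in employee for dept in department
            if emp == dept[0]]


boss = Person("carol", Unit("board"), 500000)
department = [(Unit("sales"), boss)]  # department(deptname*, managedby)

# Different domain types, no common equality: the join finds nothing.
plain_join = join([Person("alice", Unit("sales"), 300000)], department)

# With the extra operator defined, the same join finds the match.
extended_join = join(
    [PersonComparableToUnit("alice", Unit("sales"), 300000)], department)
```

Whether defining such a cross-domain operator is a good idea is exactly
the kind of schema design decision discussed here; the sketch only shows
that the join is impossible until somebody defines it.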
Date> How do you choose? That's not my problem.
karl> I find the last sentence an amazing cop out.
Why? He is perfectly correct: that's not _his_ problem. Whether in a
schema "employee" is a relation, which can be joined to another
relation, or a domain, which cannot, depends strictly on context.
It is easy to imagine situations in which "employee" needs to be a
relation, and others in which it can be handled as a domain.
His problem is to point out (using rather imprecise language -- rows and
tables!) that the _third_ alternative, confusing domains with ``tables'',
is abhorrent.
It is the DBA's problem instead, depending on the circumstances, to come
up with a schema that supports appropriate data semantics; this may
involve choices.
A value may be a subfield of a domain, or a field of a relation; this is
a fairly crucial schema design decision, in part similar to the decision
of whether to relate two fields (put them in the same relation) or not
(put them in separate relations).
For example, instead of
  domain unit:
    class
      name:string
      established:date
      operator =(b:unit) return name = b.name
    end
  department(deptname*:unit,managedby:person)
one may want to have:
  domain unit: string
  domain since: date
  department(deptname*:unit,established:since)
  managed(dept*:unit,managedby:person)
in which one has made the 'established' field visible at the data
modeling level, thus for example allowing one to check that HR records
are correct by verifying that employees have been transferred to a
department only after it has been established. The 'managedby' field is
no longer related to 'deptname' within 'department' (and thus not to
'established' either), but within 'managed', which for example allows
modeling that a department has no manager (or multiple managers, if we
make 'managedby' part of the primary key).
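As an illustration only, the second schema can be sketched in Python
with relations as sets of tuples. The sample rows are invented, the
'transferred' relation is hypothetical (the schema above does not define
one), and 'managed' is assumed to have 'managedby' as part of its key so
that a department may have several managers.

```python
from datetime import date

# department(deptname*:unit, established:since)
department = {
    ("sales", date(1990, 1, 1)),
    ("research", date(1995, 6, 1)),
    ("archive", date(1998, 3, 1)),
}

# managed(dept*:unit, managedby*:person), assuming a composite key
managed = {
    ("sales", "carol"),
    ("research", "dave"),
    ("research", "erin"),
}

# Hypothetical relation, not in the schema: transferred(empname, dept, on)
transferred = {
    ("alice", "sales", date(1992, 5, 1)),
    ("bob", "research", date(1994, 1, 1)),
}

# Departments with no manager: simply absent from 'managed'.
unmanaged = {dept for dept, _ in department} - {dept for dept, _ in managed}

# Departments may have several managers.
research_managers = {mgr for dept, mgr in managed if dept == "research"}

# HR sanity check: transfers dated before the department was established.
suspect = {
    (emp, dept)
    for emp, dept, on in transferred
    for deptname, established in department
    if dept == deptname and on < established
}
```

None of these queries would be expressible as joins if 'established'
were hidden inside an atomic 'unit' value.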
karl> Clearly, if you store employees by rows and tables, there's
karl> nothing OO about that!
Really? On what kind of bizarre argument is this assumption based?
Consider the following schema:
  domain person: string
  domain amount:
    class
      magnitude:integer
      operator =(b:amount) return magnitude = b.magnitude
      operator <(b:amount) return magnitude < b.magnitude
      operator Dollars() return magnitude/100
      operator Cents() return magnitude%100
    end
  domain unit:
    class
      name:string
      established:date
      operator =(b:unit) return name = b.name
    end
  employee(empname*:person,worksin:unit,salary:amount)
  department(deptname*:unit,managedby:person)
This looks pretty OO to me. At the same time it is relational.
The OO bit is that we have clearly described what an 'amount' or a
'unit' are and what kind of operations we can perform on those domain
types; the relational bit is that we then state relations among
'empname', 'worksin' and 'salary' on one hand and 'deptname' and
'managedby' on the other.
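For those who prefer something runnable, a minimal Python rendering of
the same split, assuming domain types map onto classes and relations
onto lists of tuples. The method names 'dollars' and 'cents', the sample
rows, and the choice of cents as the unit of 'magnitude' are assumptions
of the sketch.

```python
from datetime import date


class Amount:
    """Domain type 'amount': a magnitude (assumed in cents) plus its operators."""

    def __init__(self, magnitude: int):
        self.magnitude = magnitude

    def __eq__(self, other):
        if not isinstance(other, Amount):
            return NotImplemented
        return self.magnitude == other.magnitude

    def __lt__(self, other):
        if not isinstance(other, Amount):
            return NotImplemented
        return self.magnitude < other.magnitude

    def dollars(self) -> int:
        return self.magnitude // 100

    def cents(self) -> int:
        return self.magnitude % 100


class Unit:
    """Domain type 'unit': equality looks at the name only."""

    def __init__(self, name: str, established: date):
        self.name = name
        self.established = established

    def __eq__(self, other):
        if not isinstance(other, Unit):
            return NotImplemented
        return self.name == other.name


# The relational part: relations are still just collections of tuples.
sales = Unit("sales", date(1990, 1, 1))
research = Unit("research", date(1995, 6, 1))
employee = [("alice", sales, Amount(300050)), ("bob", research, Amount(350000))]
department = [(sales, "carol"), (research, "dave")]

# Restriction using the domain's own 'operator <'.
low_paid = [name for name, _, salary in employee if salary < Amount(320000)]

# Equijoin using the domain's own 'operator ='.
joined = [(e[0], d[1]) for e in employee for d in department if e[1] == d[0]]
```

The classes know nothing about 'employee' or 'department', and the
relations know nothing about how an 'amount' is represented.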
Again I'll say: if one wants to represent employees as an atomic domain,
fine; if one does, and then marvels that one cannot join a relation and
a domain or two relations using as key a part of an atomic value, then
one has understood _nothing_ about the relational model.
karl> Date answers his own question (*)
Date> The relational model is so solid and so robust; to quote the
Date> manifesto, 'The relational model needs no extension, no
Date> correction, no subsumption, and, above all, no perversion' in
Date> order for it to accommodate the good ideas of OO. Another way to
Date> say the same thing is that the good ideas of object-oriented are
Date> completely orthogonal to the relational model.
karl> Well I don't think this has been successfully demonstrated at all,
karl> especially when you also consider the way inheritance is (not!)
karl> dealt with.
And it should not be dealt with. There is no necessity for a concept like
inheritance in a data model, and it just happens that the relational
model is a data model that indeed does not need it.
Inheritance is a concept pertaining to code/type
...