A Pattern Language for Relational Databases and Smalltalk
By Kyle Brown and Bruce Whitenack
Early in 1995 we (two experienced Smalltalk programmers) began
a project in analysis and design that would tax our abstraction
abilities to their limits. The result of this ongoing exercise is
a pattern language we call Crossing Chasms. This article describes
Crossing Chasms and explores the thought processes that
led us to write it, what we discovered in its writing, and how we
have used the document since its creation.
What motivated us to write a pattern language?
The business of companies like Knowledge Systems Corporation (KSC)
and The Object People is to transfer information about the process
of building object systems from consultants to clients. One of the
most common themes running through many of the object systems our
two companies have built over the past five years is the need to
integrate Smalltalk with relational database technology. We have
found that the clients of our training and consulting businesses
are extremely interested in this area, and often need guidance to
understand how these two technologies combine.
In early 1995 we were both involved in creating new material for
client-centered mentoring and classroom education. We felt the need
to include some information about relational databases, but were
uncertain as to how to organize that information.
Each of the Smalltalk vendors (Digitalk, ParcPlace, and IBM) had
their own, unique class libraries for handling relational database
queries. On the surface, there did not appear to be much commonality
among the three.
Over the past several years we had built many systems using Smalltalk
and relational databases with major corporate clients. KSC's first
such effort had been with a government organization in early 1992,
followed by projects for a national bank, a major telecommunications
company, a telecommunications equipment manufacturer, and a pharmaceutical
company. We had learned many lessons about building this kind of
system, and had found out what worked and what didn't. Although
each system was unique, we felt that there were some commonalities
among all of them. In fact, the design for each usually incorporated
the best ideas from all the previous ones, even though none of the
systems shared any code.
It was this desire to record our lessons learned, to be better
equipped for future projects, and to find unity among the disparate
vendor implementations, that led us to explore pattern languages
as an avenue for recording this design and implementation information.
A pattern language is a set of related patterns that guides a reader
through a set of closely linked problems and their solutions.
The pattern is a literary form invented by the architect Christopher
Alexander to describe the decisions involved in designing and building
communities and buildings. The shortest way to describe the essence
of a pattern is "A solution to a problem in a context".
It records how the interplay of different "forces" on
a particular problem can lead to their resolution in a template
solution. The pattern form was introduced into the software community
by Ward Cunningham and Kent Beck in the late 1980s. It has become
popular in recent years due in large part to the work of Gamma,
Coad, and others.
We chose to begin writing a pattern language because the pattern
form seemed to best capture the spirit of the notions that we had.
We felt that a pattern language that could lead readers in a non-linear
fashion from one topic to the next could bring together the interconnected
threads of thought that we had. It also provides a structure in
which to study the issues and their solutions by naming and isolating
the essence of each problem. We were also interested in exploring
the issues involved in writing patterns -- in this sense Crossing
Chasms was an experiment in writing a large pattern language.
How did we find our patterns?
We first wanted to identify all the issues and problems that arise
in designing and building a framework marrying relational databases
and Smalltalk.
In reviewing the process of building such a system it became apparent
that we could split the set of problems roughly in two. The problems
of defining the tables and object models we categorized as "static"
patterns. Those involved in resolving the runtime problems of object-table
mapping we put in a category called "dynamic" patterns.
We then realized that a number of the problems we were identifying
were not so much directly related to the object-table mismatch but
were really client-server issues. These problem-solution pairs were
generic enough to be applicable to any client-server architecture,
object-oriented or not, so we developed a third category ("client-server"
patterns) for them.
Lastly we saw that the decision to go with a client-server model
was just one fundamental architectural decision out of many. Many
other architectural issues must also be resolved, including the
modularization of functionality into application layers and the
choice of the number of tiers that the system would include. These
patterns we termed "architectural" patterns.
Crossing Chasms grew in size and complexity as new problems were
identified. To discover the patterns we first immersed ourselves
in the literature and subject area. We found our patterns in numerous
places. Our own experience in building systems led us to identify
most of the major ones. Studying the documentation of existing frameworks,
both commercial and proprietary, added to the list as well. Reading
the OO literature that addressed the subject (Rumbaugh, Jacobson,
Gamma, and others) also contributed some patterns to the list, particularly
in the static category.
Eventually after defining the basic patterns and formulating them
as a pattern language we came up with some new ones based on feedback
from our colleagues. This whole process followed the 3 I Paradigm
of mastering a subject area. First you Immerse yourself in a field.
This leads you to Imitate the solutions of others, until finally
you can Innovate and come up with your own solutions.
As mentioned above, Crossing Chasms' patterns are categorized into
four groups: architectural, static, dynamic and client-server. In
the following sections we will introduce a few of the most important
patterns in the language in their respective categories. Unfortunately,
we can only present a taste of our language as a whole. Our current
version of the language is over 90 pages long and very dense in
text and diagrams. We have discovered almost forty patterns, of
which we introduce eleven here. The presentations of the individual
patterns here are by necessity very brief; the pattern language
goes into much more depth in each pattern.
The Patterns of Crossing Chasms
Architectural Patterns
When a project needs to use both Smalltalk and relational technology
there is a group of issues at a very high architectural level that
need to be addressed. Surprisingly, we did not recognize many of
these issues until well after we had written the rest of the patterns
in Crossing Chasms. These issues so pervaded our thinking that it
took a second look at the problem to even recognize their existence.
One of the most important decisions to make about the design of
a system is its overall software architecture. This decision determines
the direction that development will take.
Pattern: Four-Layer Architecture
Problem: What is the appropriate structure
and grouping of classes in a Smalltalk client-server system? What
architecture is most appropriate?
Figure 1: Four Layer Architecture
Solution : Employ a four-layer architecture consisting
of a view layer, an application model layer, a domain layer, and
a supporting infrastructure layer (see Figure 1: Four Layer Architecture).
Determine the interfaces between the layers well ahead of time and
keep the communications paths well defined. Enforce the layering
through design and code reviews.
Layered architectures are a well-known idea in Computer Science,
but it is rare that new Smalltalk programmers see their designs
in terms of well-defined layers. Nevertheless, proper layering is
important for reusability and maintainability. Brown [95] deals
with this issue at length.
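To make the layering concrete, here is a minimal sketch of the four layers. It is written in Python rather than Smalltalk so that it can be run anywhere, and every class name in it (InMemoryStore, Ledger, DepositModel, DepositView) is a hypothetical stand-in rather than part of Crossing Chasms. The only point it tries to show is that each layer holds a reference to the layer directly below it and to nothing else.

```python
# Hypothetical classes; each layer only talks to the layer directly below it.

class InMemoryStore:
    """Infrastructure layer: stands in for database access code."""
    def __init__(self):
        self._rows = {}

    def save(self, key, row):
        self._rows[key] = dict(row)

    def load(self, key):
        return dict(self._rows[key])


class Ledger:
    """Domain layer: business rules, no knowledge of windows or widgets."""
    def __init__(self, store):
        self._store = store

    def deposit(self, account, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        balance = self._store.load(account)["balance"] + amount
        self._store.save(account, {"balance": balance})
        return balance


class DepositModel:
    """Application model layer: adapts the domain for one window."""
    def __init__(self, ledger, account):
        self._ledger = ledger
        self._account = account

    def deposit_text(self, text):
        # Convert raw view input into domain values and back into display text.
        balance = self._ledger.deposit(self._account, int(text))
        return f"Balance: {balance}"


class DepositView:
    """View layer: presentation only, forwards input to its model."""
    def __init__(self, model):
        self._model = model

    def user_typed(self, text):
        return self._model.deposit_text(text)


store = InMemoryStore()
store.save("A1", {"balance": 10})
view = DepositView(DepositModel(Ledger(store), "A1"))
label = view.user_typed("5")
```

Because the view never sees the store, the storage code could be swapped for a real database layer without touching the upper two layers.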
Another key decision that has to be made is the order in which
development events must occur. It is especially difficult for first-time
users of Object Technology to develop an ordered development process.
After seeing several bad decisions made in projects we had observed,
we recognized this pattern in retrospect.
Pattern: Table Design Time
Problem : When is the best time to develop
your relational database schema? In what order do object design
and schema design occur?
Solution : Design the relational database
schema based upon a first-pass object model done using a behavioral
modeling technique. It may be more prudent to wait until after an
architectural prototype has been built before designing the schema
(see Figure 2: Development Lifecycle). Remember that an OO design
is in reality a first-pass database design. Doing things in the
reverse order (schema first) often leads to a poorly factored OO
design with separate "function" and "data" objects.
Figure 2: Development Lifecycle
Static Patterns
One of the fundamental problems in developing a total enterprise
solution using Object Technology is the development of relational
database schemas from object models. We were lucky to find that
this area of research had been covered well over the past several
years. Our job in developing the static
patterns was to pick the "best of breed" of the available
approaches and integrate them into a complete, self-consistent method.
Pattern: Representing objects as tables
Problem : How do you map a set of objects
into a relational database schema? Considering that complex objects
do not map neatly into tables, objects do not have keys, tables
do not have identity, and the datatypes do not match between worlds,
how do you perform a mapping?
Solution : Start with a table for each
persistent object. Determine the "type" of each instance
variable and create a column for each one that has a "base"
datatypes. Use the Representing Collections pattern to handle collections.
Use the Foreign Key reference pattern to handle other non-base datatype
objects. Finally, use the Object Identifier pattern.
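The mapping steps in Representing Objects as Tables can be sketched as a small schema generator. This is an illustration under our own assumptions, in Python with SQLite rather than Smalltalk: the function table_ddl, the BASE_TYPES table, and the customer and address names are all hypothetical. Base datatypes become columns, a reference to another persistent object becomes an OID column, and collections are skipped because they are left to a relationship table.

```python
import sqlite3

# Assumed mapping from "base" datatypes to SQL column types.
BASE_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT"}

def table_ddl(table, variables):
    """Build a CREATE TABLE statement for one persistent class.

    variables maps each instance variable name to a base type, to the
    marker `list` for a collection, or to the table name of a referenced class.
    """
    columns = ["oid INTEGER PRIMARY KEY"]          # Object Identifier
    for name, kind in variables.items():
        if kind in BASE_TYPES:
            columns.append(f"{name} {BASE_TYPES[kind]}")
        elif kind is list:
            continue                               # Representing Collections
        else:
            columns.append(f"{name}_oid INTEGER REFERENCES {kind}(oid)")
    return f"CREATE TABLE {table} ({', '.join(columns)})"

customer_ddl = table_ddl("customer", {
    "name": str,
    "credit_limit": float,
    "address": "address",   # non-base object: stored as a foreign key
    "orders": list,         # collection: no column in this table
})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (oid INTEGER PRIMARY KEY, city TEXT)")
conn.execute(customer_ddl)
```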
Pattern: Object Identifier
Problem : How do you preserve an object's
identity in a relational database? Each individual object's identity
must be preserved in the database and there should be no spurious
duplicates.
Solution : Assign an independent identifier
(called an Object Identifier, or OID) to each persistent object.
An easy way to do this is to use a sequence number generator if
one is available in your particular database. If not, an OID table
can be used. OIDs are usually simply long integers that are guaranteed
to be unique for a particular class of objects.
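For databases without a sequence generator, the OID table alternative might look like the sketch below. It uses Python and SQLite for convenience; the table name oid_table and the function next_oid are hypothetical, and a production version would need to consider contention between many clients.

```python
import sqlite3

def create_oid_table(conn):
    # One row per persistent class, holding the next unused identifier.
    conn.execute(
        "CREATE TABLE oid_table ("
        "class_name TEXT PRIMARY KEY, next_oid INTEGER NOT NULL)"
    )

def next_oid(conn, class_name):
    """Return a fresh OID for class_name and advance the counter."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "INSERT OR IGNORE INTO oid_table (class_name, next_oid) VALUES (?, 1)",
            (class_name,),
        )
        (oid,) = conn.execute(
            "SELECT next_oid FROM oid_table WHERE class_name = ?",
            (class_name,),
        ).fetchone()
        conn.execute(
            "UPDATE oid_table SET next_oid = next_oid + 1 WHERE class_name = ?",
            (class_name,),
        )
    return oid

conn = sqlite3.connect(":memory:")
create_oid_table(conn)
first_customer = next_oid(conn, "Customer")
second_customer = next_oid(conn, "Customer")
first_order = next_oid(conn, "Order")
```

Each class gets its own counter, so the identifiers are unique per class, as the pattern requires, but not across classes.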
Pattern: Foreign Key Reference
Problem : How do you represent objects
that reference other objects that are not "base datatypes"?
The First Normal Form Rule (1NF) prohibits tuples from containing
other tuples; therefore object relationships must be represented
using only legal column values.
Solution : Assign each object a unique OID. You then add a column
for each instance variable that is not a base datatype or a collection.
In that column store the OID of the referenced object, then declare
the column as a foreign key.
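A minimal illustration of Foreign Key Reference follows, again in Python with SQLite. The customer and address tables and the helper city_of are hypothetical names chosen for the example. The customer row stores only the OID of its address, and the database rejects a reference to an address that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE address (oid INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE customer (
    oid         INTEGER PRIMARY KEY,
    name        TEXT,
    address_oid INTEGER REFERENCES address(oid)  -- OID of the referenced object
);
""")

conn.execute("INSERT INTO address (oid, city) VALUES (10, 'Raleigh')")
conn.execute("INSERT INTO customer (oid, name, address_oid) VALUES (1, 'Acme', 10)")

def city_of(conn, customer_oid):
    """Follow the foreign key from a customer to its address."""
    row = conn.execute(
        "SELECT a.city FROM customer c "
        "JOIN address a ON a.oid = c.address_oid "
        "WHERE c.oid = ?",
        (customer_oid,),
    ).fetchone()
    return row[0] if row else None
```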
Pattern: Representing Collections
Problem : How do you represent Smalltalk
collections in a relational database? The first normal form rule
of relational databases forbids tuples from containing sets of other
elements. Other properties of Smalltalk collections also prove bothersome.
For instance, objects may be contained in many collections (M-N
relationships). Also, collections have special properties (sort
order, duplicates). Finally, Smalltalk collections can be either
heterogeneous or homogeneous.
Solution : Create a relationship table for each collection. A relationship
table maps the primary keys of the containing objects to the primary
keys of the contained objects. The relationship table may store
other information as well, for instance the class of the contained
object or the position of the object (OrderedCollection, SortedCollection).
If a collection is heterogeneous, then the class of each element
is also stored in that table.
Other static patterns in Crossing Chasms
dealt with the issues of representing inheritance in a relational
database and determining to what extent a domain model must be modified
to handle database issues.
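The relationship table described in Representing Collections might be sketched as follows, in Python with SQLite. The order_items table and the two helper functions are hypothetical. The position column preserves the order of an OrderedCollection and allows duplicates, and the item_class column lets the collection be heterogeneous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_items (
    order_oid  INTEGER NOT NULL,   -- key of the containing object
    position   INTEGER NOT NULL,   -- index within the collection
    item_class TEXT    NOT NULL,   -- class of the contained object
    item_oid   INTEGER NOT NULL,   -- key of the contained object
    PRIMARY KEY (order_oid, position)
);
""")

def save_collection(conn, owner_oid, elements):
    """Store elements, a list of (class_name, oid) pairs, in order."""
    with conn:
        conn.execute("DELETE FROM order_items WHERE order_oid = ?", (owner_oid,))
        conn.executemany(
            "INSERT INTO order_items VALUES (?, ?, ?, ?)",
            [(owner_oid, pos, cls, oid) for pos, (cls, oid) in enumerate(elements)],
        )

def load_collection(conn, owner_oid):
    """Return the (class_name, oid) pairs of a collection in stored order."""
    return conn.execute(
        "SELECT item_class, item_oid FROM order_items "
        "WHERE order_oid = ? ORDER BY position",
        (owner_oid,),
    ).fetchall()

items = [("Book", 5), ("Toy", 9), ("Book", 5)]   # heterogeneous, with a duplicate
save_collection(conn, 1, items)
```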
Dynamic Patterns
In addition to the static and architectural parts of Crossing Chasms,
we found it important to record what we had learned about writing
Smalltalk code to handle relational database connection. This section
of Crossing Chasms we referred to as the "dynamic" patterns,
since they deal with the movement of information in and out of the
database, as opposed to the static database schema.
One of the first patterns we recorded was Broker.
Pattern: Broker
Problem : How do you separate the domain-specific
parts of an application from the database-specific parts?
Solution : Connect the database-specific
(vendor) classes and the domain-specific classes together with an
intermediate layer of Broker objects. Brokers mediate between database
objects and domain objects and are ultimately responsible for reading
object information from and writing objects to the database.
The Broker idea is a popular one in OO circles and many papers
have been written about its use. However, it is still not being
used as often as it should. We feel that this is due in part to
the Smalltalk vendors' documentation, whose simplistic examples of
database connectivity mix domain functionality with database
functionality. Developers new to Object Technology, or who come to
Smalltalk from Visual Basic or PowerBuilder backgrounds, often miss
the subtlety of why Brokers are important.
However, they are central to maintaining the integrity of the layers
in a Four-Layer Architecture.
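A minimal Broker might look like the sketch below. It is written in Python with SQLite as a stand-in for the vendor database classes, and Customer and CustomerBroker are hypothetical names. The thing to notice is that the domain class contains no SQL at all, while the Broker is the only place that knows the table layout.

```python
import sqlite3

class Customer:
    """Domain object: no database code of any kind."""
    def __init__(self, oid, name):
        self.oid = oid
        self.name = name

class CustomerBroker:
    """Mediates between Customer objects and the customer table."""
    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS customer (oid INTEGER PRIMARY KEY, name TEXT)"
        )

    def save(self, customer):
        with self._conn:  # one transaction per save
            self._conn.execute(
                "INSERT OR REPLACE INTO customer (oid, name) VALUES (?, ?)",
                (customer.oid, customer.name),
            )

    def find(self, oid):
        row = self._conn.execute(
            "SELECT oid, name FROM customer WHERE oid = ?", (oid,)
        ).fetchone()
        return Customer(*row) if row else None

broker = CustomerBroker(sqlite3.connect(":memory:"))
broker.save(Customer(1, "Acme"))
restored = broker.find(1)
```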
As we looked back on the broker implementations we had built, we
found that two more patterns occurred in the best implementations:
Query Object and Object Metadata.
Pattern: Object Metadata
Problem : How do you define the mapping
between the elements of an object class and the corresponding parts
of a relational schema?
Solution : Reify the mapping into a set
of Map classes that map object relationships into relational equivalents.
Map objects also map column names to instance variable selectors
in domain objects.
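One possible shape for such Map classes is sketched here in Python. ColumnMap and ClassMap are hypothetical names, and the sketch covers only the mapping between column names and instance variables, not the mapping of relationships between objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ColumnMap:
    attribute: str   # instance variable name in the domain object
    column: str      # column name in the relational table

@dataclass(frozen=True)
class ClassMap:
    table: str
    key: ColumnMap
    columns: tuple   # tuple of ColumnMap, excluding the key

    def _all(self):
        return (self.key, *self.columns)

    def row_for(self, obj):
        """Turn a domain object into a {column: value} dictionary."""
        return {m.column: getattr(obj, m.attribute) for m in self._all()}

    def object_from(self, cls, row):
        """Rebuild a domain object from a {column: value} dictionary."""
        obj = cls.__new__(cls)           # skip __init__, fill variables directly
        for m in self._all():
            setattr(obj, m.attribute, row[m.column])
        return obj

class Customer:
    def __init__(self, oid, name):
        self.oid = oid
        self.name = name

CUSTOMER_MAP = ClassMap(
    table="CUSTOMER",
    key=ColumnMap("oid", "CUST_ID"),
    columns=(ColumnMap("name", "CUST_NAME"),),
)

row = CUSTOMER_MAP.row_for(Customer(7, "Acme"))
copy = CUSTOMER_MAP.object_from(Customer, row)
```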
Pattern: Query Object
Problem : How do you handle the generation
and execution of common SQL statements and minimize the amount of
duplicated code between broker classes?
Solution : Write a set of generic classes
that generate SQL statements from common data. A hierarchy of classes
representing SQL statements can generate the appropriate SQL given
a domain object and its Map object metadata representation.
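A small hierarchy of Query Objects could be sketched as follows, in Python. The class names TableMap, Query, InsertQuery and UpdateQuery are hypothetical, and TableMap is a deliberately simplified stand-in for the Map metadata. Each query class produces parameterized SQL text plus the matching parameter values for a given domain object.

```python
from types import SimpleNamespace

class TableMap:
    """Simplified metadata: (attribute, column) pairs for one class."""
    def __init__(self, table, key, columns):
        self.table = table
        self.key = key            # (attribute, column) of the OID
        self.columns = columns    # list of (attribute, column)

class Query:
    def __init__(self, table_map):
        self.map = table_map

    def sql(self):
        raise NotImplementedError

    def parameters(self, obj):
        raise NotImplementedError

class InsertQuery(Query):
    def sql(self):
        pairs = [self.map.key] + self.map.columns
        names = ", ".join(column for _, column in pairs)
        marks = ", ".join("?" for _ in pairs)
        return f"INSERT INTO {self.map.table} ({names}) VALUES ({marks})"

    def parameters(self, obj):
        pairs = [self.map.key] + self.map.columns
        return tuple(getattr(obj, attribute) for attribute, _ in pairs)

class UpdateQuery(Query):
    def sql(self):
        sets = ", ".join(f"{column} = ?" for _, column in self.map.columns)
        return f"UPDATE {self.map.table} SET {sets} WHERE {self.map.key[1]} = ?"

    def parameters(self, obj):
        values = [getattr(obj, attribute) for attribute, _ in self.map.columns]
        return tuple(values + [getattr(obj, self.map.key[0])])

customer_map = TableMap("CUSTOMER", ("oid", "CUST_ID"),
                        [("name", "CUST_NAME"), ("city", "CUST_CITY")])
customer = SimpleNamespace(oid=7, name="Acme", city="Raleigh")
insert_sql = InsertQuery(customer_map).sql()
update_sql = UpdateQuery(customer_map).sql()
```

A broker written against these classes needs no hand-written SQL of its own, which is where the reduction in duplicated code comes from.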
Figure 3: Broker Interactions
The three previous patterns, when combined, make up a powerful
mini-architecture. Each domain object will have a set of Map objects
that represent its object relationships as metadata. The Broker
classes that are responsible for saving and restoring those objects
can use Query Objects to generate the appropriate SQL statements
from the data held in the Maps. In this way, proper layering can
be preserved since the objects in the Domain layer are not directly
knowledgeable about the internals of the SQL generation, while the
Brokers themselves obtain information about their domain classes
only indirectly through the Map objects. A diagram of the interactions
of these classes is shown in Figure 3: Broker Interactions.
While the Broker architecture worked well to allow us to move objects
in and out of the database, the performance of some of our early
attempts was less than adequate. In particular, early versions often
spent too much time reading in data from the database that was never
subsequently used. In trying to resolve this, we found that the
Proxy pattern from Gamma provided us with an effective solution.
We could often use a Proxy as a placeholder for information that
had not yet been read in from the database. When that information
was needed, the Proxy would collaborate with the Broker to read
it in, and then replace itself with the new object.
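The lazy-reading idea can be sketched in Python as shown below. LazyProxy, Address and load_address are hypothetical names, and the loader function stands in for the call to the Broker. One difference from the description above should be stated plainly: a Python object cannot replace itself with another object, so this proxy keeps the loaded object and forwards to it instead.

```python
class Address:
    def __init__(self, city):
        self.city = city

class LazyProxy:
    """Placeholder that reads its target only when first used."""
    def __init__(self, loader):
        self._loader = loader     # callable that performs the database read
        self._target = None

    def _resolve(self):
        if self._target is None:
            self._target = self._loader()
        return self._target

    def __getattr__(self, name):
        # Called only for attributes the proxy itself does not have,
        # so every domain attribute is forwarded to the real object.
        return getattr(self._resolve(), name)

reads = []

def load_address():
    reads.append("read")          # stands in for a Broker read
    return Address("Raleigh")

proxy = LazyProxy(load_address)
reads_before_use = list(reads)    # nothing has been read yet
city = proxy.city                 # first use triggers the read
city_again = proxy.city           # second use reuses the loaded object
```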
Other topics addressed by the dynamic patterns included handling
database transactions and the order in which connected objects must
be written to or restored from the database.
Client-Server Patterns
As we mentioned previously, there were many issues we discovered
that were not specific to Smalltalk, or even OO in general, but
were rather applicable to any client-server system. Two of these
patterns were Client Synchronization and Cache Management.
Pattern: Client Synchronization
Problem : How do you handle resynchronizing
the client image and database when there are errors? What do you
do if you change the value of data held in the client's memory and
the corresponding request to the database fails?
One solution is to just note the error to the user and flush any
cached information. In this case any error is deemed to be catastrophic
and you must start a new session. This is not a very robust solution,
but it is a quickly implementable one.
A second solution is a playback mechanism that has a logging facility.
Each change is logged in a local log. If there is a failure the
cache is flushed and you replay all of the events as needed. This
solution is more robust, but it is not trivial to implement.
Solution : Mark the objects appropriately
as deleted, added or updated during the session. If the update to
the database succeeds then remove the mark. If it fails then retry
the transaction. If it continually fails (e.g., times out) note
the error and flush the cache. With the changed objects marked it
is possible to recover to the original state by filing out the changed
objects to local storage and performing recovery at a more propitious
time.
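The marking scheme might be sketched like this in Python. ChangeTracker, WriteFailed and the write callable are hypothetical, and the retry limit of three attempts is an arbitrary assumption. Objects whose writes keep failing stay marked, so the caller can note the error and file them out to local storage.

```python
class WriteFailed(Exception):
    """Raised by the write callable when the database request fails."""

class ChangeTracker:
    def __init__(self, write, max_attempts=3):
        self._write = write              # callable(mark, obj); raises WriteFailed
        self._max_attempts = max_attempts
        self._marks = {}                 # id(obj) -> (mark, obj)

    def mark(self, obj, mark):
        """Record that obj was 'added', 'updated' or 'deleted'."""
        self._marks[id(obj)] = (mark, obj)

    def flush(self):
        """Write all marked objects; return the ones that are still marked."""
        for key, (mark, obj) in list(self._marks.items()):
            for _ in range(self._max_attempts):
                try:
                    self._write(mark, obj)
                except WriteFailed:
                    continue             # retry the same object
                del self._marks[key]     # success: remove the mark
                break
        return [obj for _, obj in self._marks.values()]
```

A caller would mark objects as the user edits them, call flush at commit time, and treat a non-empty result as the signal to report the error and recover later.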
Pattern: Cache Management
Problem : How do you best manage the
lifetime of persistent objects stored in an RDB? Caches can increase
client performance, but they also increase client memory size. Caches
can become out of date, necessitating frequent updates. Caching
also generally increases application complexity.
Solution : Use a Session object that
has a bounded lifetime and is responsible for identity cache management
of a limited set of objects. Balance speed vs. space by flushing
the cache as appropriate. Use a query before write (timestamp) technique
to keep caches accurate.
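A Session along these lines could be sketched as follows, in Python. The class names, the two reader callables, and the least-recently-used eviction policy are our assumptions for the example; the pattern itself only asks that the cache be bounded and that a timestamp be checked before writing.

```python
from collections import OrderedDict

class StaleObjectError(Exception):
    """The database row changed after it was read into the cache."""

class Session:
    def __init__(self, read_object, read_timestamp, max_objects=100):
        self._read_object = read_object        # oid -> (object, timestamp)
        self._read_timestamp = read_timestamp  # oid -> timestamp now in the database
        self._max_objects = max_objects        # assumed to be at least 1
        self._cache = OrderedDict()            # oid -> (object, timestamp)

    def get(self, oid):
        """Return the single in-memory object for oid, reading it if needed."""
        if oid in self._cache:
            self._cache.move_to_end(oid)             # most recently used
        else:
            self._cache[oid] = self._read_object(oid)
            if len(self._cache) > self._max_objects:
                self._cache.popitem(last=False)      # evict least recently used
        return self._cache[oid][0]

    def check_before_write(self, oid):
        """Query before write: refuse if the row changed since it was read."""
        _, cached_stamp = self._cache[oid]
        if self._read_timestamp(oid) != cached_stamp:
            del self._cache[oid]                     # drop the stale copy
            raise StaleObjectError(oid)

    def close(self):
        """End of the session's bounded lifetime: flush everything."""
        self._cache.clear()
```

Raising max_objects trades memory for fewer database reads, which is the speed versus space balance the pattern describes.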
How have we used Crossing Chasms?
Since writing Crossing Chasms we have successfully applied its
patterns in a number of different instances. It has proven to be
a very useful teaching aid - we subsequently have developed several
lectures for classroom use from the pattern language. The structure
of the pattern language proved to be a useful framework for discussing
the different concepts in object to relational connectivity. The
topics of the lectures we developed from the pattern language paralleled
the organization of the language. In addition, some of the patterns
have been used as a basis for other lecture topics in our classroom
education. We have also found that students like having the pattern
language as a reference after seeing presentations
based on it. In this way, we can present a high-level overview and
then allow the students to investigate the deeper issues at their
own pace.
Several companies have used the patterns in Crossing Chasms as
part of their object-relational architectures as a result of our
presenting them as part of our training. We have found that addressing
the issues covered in Crossing Chasms early in the development process
can preclude many of the missteps that first OO projects often take.
We have also developed a conference tutorial based upon the pattern
language and presented it at Smalltalk Solutions '96 in New York.
Again, we have had feedback that students appreciate using the pattern
language to gain deeper understanding of the issues after the presentation.
The static portion of Crossing Chasms was presented at the PLoP
(Pattern Languages of Programs) '95 conference in September, 1995.
Those patterns have been published in Brown [96].
References
Alexander [77] Christopher Alexander, A Pattern Language, Oxford
University Press, 1977.
Brown [95] Kyle Brown, "Remembrance of Things Past: Layered
Architectures for Smalltalk Applications", The Smalltalk Report
4(9): 4-7, 1995.
Brown [96] Kyle Brown and Bruce Whitenack, "Crossing Chasms:
The Static Patterns", in Pattern Languages of Program Design
Vol. II, Jim Coplien, Douglas Schmidt and Norman Kerth, Editors.
Addison-Wesley, 1996.
Coad [95] Peter Coad and Mark Mayfield, Object Models: Strategies,
Patterns and Applications, Yourdon Press, 1995.
Gamma [95] Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides,
Design Patterns: Elements of Reusable Object-Oriented Software,
Addison-Wesley, 1995.
Jacobson [92] Ivar Jacobson, et al., Object-Oriented Software Engineering:
A Use-Case Driven Approach, Addison-Wesley, 1992.
Rumbaugh [91] James Rumbaugh, Michael Blaha, William Premerlani,
Frederick Eddy, William Lorensen, Object-Oriented Modeling and Design,
Prentice-Hall, 1991.