A Pattern Language for Relational Databases and Smalltalk
By Kyle Brown and Bruce Whitenack
Early in 1995 we (two experienced Smalltalk programmers) began a project in analysis and design that would tax our abstraction abilities to their limits. The result of this ongoing exercise is a pattern language we call Crossing Chasms. This article describes Crossing Chasms and explores the thought processes that led us to write it, what we discovered in writing it, and how we have used the document since its creation.
What motivated us to write a pattern language?
The business of companies like Knowledge Systems Corporation (KSC) and The Object People is to transfer information about the process of building object systems from consultants to clients. One of the most common themes running through many of the object systems our two companies have built over the past five years is the need to integrate Smalltalk with relational database technology. We have found that the clients of our training and consulting businesses are extremely interested in this area, and often need guidance to understand how these two technologies combine.
In early 1995 we were both involved in creating new material for client-centered mentoring and classroom education. We felt the need to include some information about relational databases, but were uncertain as to how to organize that information.
Each of the Smalltalk vendors (Digitalk, ParcPlace, and IBM) had its own unique class library for handling relational database queries. On the surface, there did not appear to be much commonality among the three.
Over the past several years we had built many systems using Smalltalk and relational databases with major corporate clients. KSC's first such effort had been with a government organization in early 1992, followed by projects for a national bank, a major telecommunications company, a telecommunications equipment manufacturer, and a pharmaceutical company. We had learned many lessons about building this kind of system, and had found out what worked and what didn't. Although each system was unique, we felt that there were some commonalities among all of them. In fact, the design for each usually incorporated the best ideas from all the previous ones, even though none of the systems shared any code.
It was this desire to record our lessons learned, to be better equipped for future projects, and to find unity among the disparate vendor implementations, that led us to explore pattern languages as an avenue for recording this design and implementation information. A pattern language is a set of related patterns that guides a reader through a set of closely linked problems and their solutions.
The pattern is a literary form invented by the architect Christopher Alexander to describe the decisions involved in designing and building communities and buildings. The shortest way to describe the essence of a pattern is "a solution to a problem in a context." It records how the interplay of different "forces" on a particular problem can lead to their resolution in a template solution. The pattern form was introduced into the software community by Ward Cunningham and Kent Beck in the late 1980s. It has become popular in recent years due in large part to the work of Gamma, Coad, and others.
We chose to begin writing a pattern language because the pattern form seemed to best capture the spirit of the notions that we had. We felt that a pattern language that could lead readers in a non-linear fashion from one topic to the next could bring together the interconnected threads of thought that we had. It also provides a structure in which to study the issues and their solutions by naming and isolating the essence of each problem. We were also interested in exploring the issues involved in writing patterns -- in this sense Crossing Chasms was an experiment in writing a large pattern language.
How did we find our patterns?
We first wanted to identify all the issues and problems that arise in designing and building a framework marrying relational databases and Smalltalk.
In reviewing the process of building such a system it became apparent that we could split the set of problems roughly in two. The problems of defining the tables and object models we categorized as "static" patterns. Those involved in resolving the runtime problems of object-table mapping we put in a category called "dynamic" patterns. We then realized that a number of the problems we were identifying were not so much directly related to the object-table mismatch but were really client-server issues. These problem-solution pairs were generic enough to be applicable to any client-server architecture, object-oriented or not, so we developed a third category ("client-server" patterns) for them.
Lastly we saw that the decision to go with a client-server model was just one fundamental architectural decision out of many. Many other architectural issues must also be resolved, including the modularization of functionality into application layers and the choice of the number of tiers that the system would include. These patterns we termed "architectural" patterns.
Crossing Chasms grew in size and complexity as new problems were identified. To discover the patterns we first immersed ourselves in the literature and subject area. We found our patterns in numerous places. Our own experience in building systems led us to identify most of the major ones. Studying the documentation of existing frameworks, both commercial and proprietary, added to the list as well. Reading the OO literature that addressed the subject (Rumbaugh, Jacobson, Gamma, and others) also contributed some patterns to the list, particularly in the static category.
Eventually after defining the basic patterns and formulating them as a pattern language we came up with some new ones based on feedback from our colleagues. This whole process followed the 3 I Paradigm of mastering a subject area. First you Immerse yourself in a field. This leads you to Imitate the solutions of others, until finally you can Innovate and come up with your own solutions.
As mentioned above, the patterns of Crossing Chasms are categorized into four groups: architectural, static, dynamic and client-server. In the following sections we introduce a few of the most important patterns in the language in their respective categories. Unfortunately, we can only present a taste of our language as a whole. Our current version of the language is over 90 pages long and very dense in text and diagrams. We have discovered almost forty patterns, of which we introduce eleven here. The presentations of the individual patterns here are by necessity very brief; the pattern language goes into much more depth on each pattern.
The Patterns of Crossing Chasms
When a project needs to use both Smalltalk and relational technology there is a group of issues at a very high architectural level that needs to be addressed. Surprisingly, we did not recognize many of these issues until well after we had written the rest of the patterns in Crossing Chasms. These issues so pervaded our thinking that it took a second look at the problem to even recognize their existence.
One of the most important decisions to make about the design of a system is its overall software architecture. This decision determines the direction that development will take.
Pattern: Four-Layer Architecture
Problem: What is the appropriate structure and grouping of classes in a Smalltalk client-server system? What architecture is most appropriate?
Figure 1: Four Layer Architecture
Solution : Employ a four-layer architecture consisting of a view layer, an application model layer, a domain layer, and a supporting infrastructure layer (see Figure 1: Four Layer Architecture). Determine the interfaces between the layers well ahead of time and keep the communications paths well defined. Enforce the layering through design and code reviews.
Layered architectures are a well-known idea in Computer Science, but it is rare that new Smalltalk programmers see their designs in terms of well-defined layers. Nevertheless, proper layering is important for reusability and maintainability. Brown (1995) deals with this issue at length.
Another key decision that has to be made is the order in which development events must occur. It is especially difficult for first-time users of Object Technology to develop an ordered development process. After seeing several bad decisions made in projects we had observed, we recognized this pattern in retrospect.
Pattern: Table Design Time
Problem : When is the best time to develop your relational database schema? In what order do object design and schema design occur?
Solution : Design the relational database schema based upon a first-pass object model done using a behavioral modeling technique. It may be more prudent to wait until after an architectural prototype has been built before designing the schema (see Figure 2: Development Lifecycle). Remember that an OO design is in reality a first-pass database design. Doing things in the reverse order (schema first) often leads to a poorly factored OO design with separate "function" and "data" objects.
Figure 2: Development Lifecycle
One of the fundamental problems in developing a total enterprise solution using Object Technology is the development of relational database schemas from object models. We were lucky to find that this area of research had been covered well over the past several years. Our job in developing the static patterns was to pick the "best of breed" of the available approaches and integrate them into a complete, self-consistent method.
Pattern: Representing objects as tables
Problem : How do you map a set of objects into a relational database schema? Considering that complex objects do not map neatly into tables, objects do not have keys, tables do not have identity, and the datatypes do not match between worlds, how do you perform a mapping?
Solution : Start with a table for each persistent object. Determine the "type" of each instance variable and create a column for each one that has a "base" datatype. Use the Representing Collections pattern to handle collections. Use the Foreign Key Reference pattern to handle other non-base datatype objects. Finally, use the Object Identifier pattern.
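As a rough illustration of the first steps of this solution, the sketch below derives a table definition from a class by emitting one column per base-typed attribute. It is written in Python rather than Smalltalk purely for brevity, and the `Customer` class, the `BASE_TYPES` table, and the `create_table_sql` helper are all hypothetical names invented for the example; they are not part of Crossing Chasms.

```python
from dataclasses import dataclass, fields

# Assumed mapping from "base" datatypes to SQL column types (illustrative only).
BASE_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Customer:
    """Hypothetical persistent domain class."""
    name: str
    credit_limit: float


def create_table_sql(cls):
    """Build a CREATE TABLE statement: an OID key plus one column per base-typed field."""
    columns = ["oid INTEGER PRIMARY KEY"]  # see the Object Identifier pattern
    for field in fields(cls):
        if field.type in BASE_TYPES:
            columns.append(f"{field.name} {BASE_TYPES[field.type]}")
        # Other fields are left to Foreign Key Reference and Representing Collections.
    return f"CREATE TABLE {cls.__name__.lower()} ({', '.join(columns)})"


print(create_table_sql(Customer))
```

Attributes that are not base datatypes are deliberately skipped here, since the pattern hands them over to the other static patterns.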
Pattern: Object Identifier
Problem : How do you preserve an object's identity in a relational database? Each individual object's identity must be preserved in the database and there should be no spurious duplicates.
Solution : Assign an independent identifier (called an Object Identifier, or OID) to each persistent object. An easy way to do this is to use a sequence number generator if one is available in your particular database. If not, an OID table can be used. OIDs are usually simply long integers that are guaranteed to be unique for a particular class of objects.
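One possible shape for the OID table mentioned above is a table holding one counter row per class. The sketch below uses Python and SQLite only as a convenient stand-in for a real client and database; the table name `oid_table`, its columns, and the `next_oid` function are assumptions made for the example.

```python
import sqlite3


def next_oid(conn, class_name):
    """Return the next OID for class_name, using a per-class counter row."""
    with conn:  # commit the increment as a unit, or roll it back on error
        conn.execute(
            "INSERT OR IGNORE INTO oid_table (class_name, last_oid) VALUES (?, 0)",
            (class_name,),
        )
        conn.execute(
            "UPDATE oid_table SET last_oid = last_oid + 1 WHERE class_name = ?",
            (class_name,),
        )
        row = conn.execute(
            "SELECT last_oid FROM oid_table WHERE class_name = ?", (class_name,)
        ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oid_table (class_name TEXT PRIMARY KEY, last_oid INTEGER NOT NULL)"
)
first = next_oid(conn, "Customer")
second = next_oid(conn, "Customer")
other = next_oid(conn, "Order")
```

Because each class has its own counter, OIDs are unique per class rather than globally, which matches the guarantee described in the solution. A multi-user database would also need appropriate locking around the increment, which this single-connection sketch does not show.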
Pattern: Foreign Key Reference
Problem : How do you represent objects that reference other objects that are not "base datatypes"? The First Normal Form rule (1NF) prohibits tuples from containing other tuples; therefore object relationships must be represented using only legal column values.
Solution : Assign each object a unique OID. You then add a column for each instance variable that is not a base datatype or a collection. In that column store the OID of the referenced object, then declare the column as a foreign key.
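A minimal sketch of this solution, again using Python and SQLite as a stand-in, might look as follows. The `customer` and `address` tables and the `address_oid` column are hypothetical; the point is only that the reference from one object to another becomes a column holding the referenced object's OID, declared as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE address (oid INTEGER PRIMARY KEY, city TEXT)")
conn.execute(
    """CREATE TABLE customer (
           oid INTEGER PRIMARY KEY,
           name TEXT,
           address_oid INTEGER REFERENCES address(oid)  -- OID of the referenced object
       )"""
)
conn.execute("INSERT INTO address VALUES (10, 'Raleigh')")
conn.execute("INSERT INTO customer VALUES (1, 'Acme', 10)")

# Following the reference is a join on the stored OID.
city = conn.execute(
    "SELECT a.city FROM customer c JOIN address a ON a.oid = c.address_oid "
    "WHERE c.oid = 1"
).fetchone()[0]


def dangling_reference_rejected():
    """Try to store a reference to an address that does not exist."""
    try:
        conn.execute("INSERT INTO customer VALUES (2, 'Ghost', 999)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Declaring the column as a foreign key lets the database itself reject references to objects that were never stored.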
Pattern: Representing Collections
Problem : How do you represent Smalltalk collections in a relational database? The first normal form rule of relational databases forbids tuples from containing sets of other elements. Other properties of Smalltalk collections also prove bothersome. For instance, objects may be contained in many collections (M-N relationships). Also, collections have special properties (sort order, duplicates). Finally, Smalltalk collections can be either heterogeneous or homogeneous.
Solution : Create a relationship table for each collection. A relationship table maps the primary keys of the containing objects to the primary keys of the contained objects. The relationship table may store other information as well, for instance the class of the contained object, or the position of the object (OrderedCollection, SortedCollection). If a collection is heterogeneous, then the class of each element is also stored in that table.
Other static patterns in Crossing Chasms dealt with the issues of representing inheritance in a relational database and determining to what extent a domain model must be modified to handle database issues.
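To make the relationship table of the Representing Collections pattern more concrete, here is a small sketch in Python with SQLite. The `order_items` table, its column names, and the `save_items` and `load_items` helpers are invented for the example. The table records the containing object's key, the contained object's key, the element's class (for heterogeneous collections), and a position (to preserve ordering and allow duplicates).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE order_items (
           order_oid  INTEGER NOT NULL,  -- key of the containing object
           item_oid   INTEGER NOT NULL,  -- key of the contained object
           item_class TEXT    NOT NULL,  -- class of the element (heterogeneous collection)
           position   INTEGER NOT NULL,  -- preserves OrderedCollection order
           PRIMARY KEY (order_oid, position)
       )"""
)


def save_items(order_oid, items):
    """Store items, a list of (class_name, oid) pairs in collection order."""
    conn.execute("DELETE FROM order_items WHERE order_oid = ?", (order_oid,))
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?, ?)",
        [(order_oid, oid, cls, pos) for pos, (cls, oid) in enumerate(items)],
    )


def load_items(order_oid):
    """Return the (class_name, oid) pairs of a collection in their original order."""
    rows = conn.execute(
        "SELECT item_class, item_oid FROM order_items "
        "WHERE order_oid = ? ORDER BY position",
        (order_oid,),
    )
    return [(cls, oid) for cls, oid in rows]


original = [("Book", 7), ("Toy", 3), ("Book", 7)]  # duplicates are allowed
save_items(1, original)
restored = load_items(1)
```

Keying the table on the container and position, rather than on the pair of OIDs, is one way to let the same object appear more than once in a collection.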
In addition to the static and architectural parts of Crossing Chasms, we found it important to record what we had learned about writing Smalltalk code to handle relational database connection. This section of Crossing Chasms we referred to as the "dynamic" patterns, since they deal with the movement of information in and out of the database, as opposed to the static database schema.
One of the first patterns we recorded was Broker.
Problem : How do you separate the domain-specific parts of an application from the database-specific parts?
Solution : Connect the database-specific (vendor) classes and the domain-specific classes together with an intermediate layer of Broker objects. Brokers mediate between database objects and domain objects and are ultimately responsible for reading object information from and writing objects to the database.
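The separation this solution describes can be sketched as follows, in Python with SQLite standing in for the vendor database classes. `Customer`, `CustomerBroker`, and the `customer` table are hypothetical names. The domain class contains no SQL at all, and all reading and writing goes through the broker.

```python
import sqlite3


class Customer:
    """Domain object: no SQL and no knowledge of the database."""

    def __init__(self, oid, name):
        self.oid = oid
        self.name = name


class CustomerBroker:
    """Mediates between Customer objects and the customer table."""

    def __init__(self, conn):
        self.conn = conn

    def save(self, customer):
        self.conn.execute(
            "INSERT OR REPLACE INTO customer (oid, name) VALUES (?, ?)",
            (customer.oid, customer.name),
        )

    def find(self, oid):
        row = self.conn.execute(
            "SELECT oid, name FROM customer WHERE oid = ?", (oid,)
        ).fetchone()
        return Customer(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (oid INTEGER PRIMARY KEY, name TEXT)")
broker = CustomerBroker(conn)
broker.save(Customer(1, "Acme"))
restored = broker.find(1)
```

In this simple form each broker still hand-writes its SQL, which is the duplication that the Query Object and Object Metadata patterns are meant to reduce.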
The Broker idea is a popular one in OO circles and many papers have been written about its use. However, it is still not used as often as it should be. We feel that this is due in part to the Smalltalk vendors' documentation, which tends to show simplistic examples of database connectivity that mix domain functionality with database functionality. Developers new to Object Technology, or who come to Smalltalk from Visual Basic or PowerBuilder backgrounds, often miss the subtlety of why Brokers are important. However, they are central to maintaining the integrity of the layers in a Four-Layer Architecture.
As we looked back on the broker implementations we had built, we found that two more patterns occurred in the best implementations: Query Object and Object Metadata.
Pattern: Object Metadata
Problem : How do you define the mapping between the elements of an object class and the corresponding parts of a relational schema?
Solution : Reify the mapping into a set of Map classes that map object relationships into relational equivalents. Map objects also map column names to instance variable selectors in domain objects.
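One way to picture reified mapping metadata is shown below, sketched in Python. The `ColumnMap` and `ClassMap` classes, the `CUSTOMER` table, and its column names are assumptions for the example and are not the Map classes of any particular framework. Only the mapping of simple attributes to columns is shown; mapping of object relationships is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMap:
    """Maps one attribute of a domain object to one column name."""
    attribute: str
    column: str


@dataclass(frozen=True)
class ClassMap:
    """Maps a domain class to a table through a set of ColumnMaps."""
    domain_class: type
    table: str
    columns: tuple

    def row_from(self, obj):
        """Extract a {column: value} row from a domain object."""
        return {m.column: getattr(obj, m.attribute) for m in self.columns}

    def object_from(self, row):
        """Rebuild a domain object from a {column: value} row."""
        obj = self.domain_class.__new__(self.domain_class)
        for m in self.columns:
            setattr(obj, m.attribute, row[m.column])
        return obj


class Customer:
    """Hypothetical domain class."""

    def __init__(self, oid, name):
        self.oid = oid
        self.name = name


customer_map = ClassMap(
    Customer,
    "CUSTOMER",
    (ColumnMap("oid", "CUST_ID"), ColumnMap("name", "CUST_NAME")),
)
row = customer_map.row_from(Customer(1, "Acme"))
copy = customer_map.object_from(row)
```

Because the mapping is data rather than code, a schema change such as a renamed column touches only the map, not the domain class.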
Pattern: Query Object
Problem : How do you handle the generation and execution of common SQL statements and minimize the amount of duplicated code between broker classes?
Solution : Write a set of generic classes that generate SQL statements from common data. A hierarchy of classes representing SQL statements can generate the appropriate SQL given a domain object and its Map object metadata representation.
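A small hierarchy of statement generators along these lines could look like the Python sketch below. The class names and the plain dictionary used as metadata are hypothetical simplifications; a fuller version would take its column information from Map objects.

```python
class SqlStatement:
    """Base of a small hierarchy of statement generators driven by metadata."""

    def __init__(self, table, column_map):
        self.table = table
        self.column_map = column_map  # {attribute name: column name}


class InsertStatement(SqlStatement):
    def build(self, obj):
        """Return (sql, parameters) that insert obj into the table."""
        cols = list(self.column_map.values())
        marks = ", ".join("?" for _ in cols)
        sql = f"INSERT INTO {self.table} ({', '.join(cols)}) VALUES ({marks})"
        return sql, [getattr(obj, attr) for attr in self.column_map]


class SelectByKeyStatement(SqlStatement):
    def __init__(self, table, column_map, key_attribute):
        super().__init__(table, column_map)
        self.key_attribute = key_attribute

    def build(self, key):
        """Return (sql, parameters) that select the row with the given key."""
        cols = ", ".join(self.column_map.values())
        key_column = self.column_map[self.key_attribute]
        return f"SELECT {cols} FROM {self.table} WHERE {key_column} = ?", [key]


class Customer:
    """Hypothetical domain class."""

    def __init__(self, oid, name):
        self.oid = oid
        self.name = name


column_map = {"oid": "CUST_ID", "name": "CUST_NAME"}
insert_sql, insert_params = InsertStatement("CUSTOMER", column_map).build(
    Customer(1, "Acme")
)
select_sql, select_params = SelectByKeyStatement("CUSTOMER", column_map, "oid").build(1)
```

With generators like these, a broker for a new domain class needs only new metadata, not new SQL strings.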
Figure 3: Broker Interactions
The three previous patterns, when combined, make up a powerful mini-architecture. Each domain object will have a set of Map objects that represent its object relationships as metadata. The Broker classes that are responsible for saving and restoring those objects can use Query Objects to generate the appropriate SQL statements from the data held in the Maps. In this way, proper layering can be preserved since the objects in the Domain layer are not directly knowledgeable about the internals of the SQL generation, while the Brokers themselves obtain information about their domain classes only indirectly through the Map objects. A diagram of the interactions of these classes is shown in Figure 3: Broker Interactions.
While the Broker architecture worked well to allow us to move objects in and out of the database, the performance of some of our early attempts was less than adequate. In particular, early versions often spent too much time reading in data from the database that was never subsequently used. In trying to resolve this, we found that the Proxy pattern from Gamma provided us with an effective solution. We could often use a Proxy as a placeholder for information that had not yet been read in from the database. When that information was needed, the Proxy would collaborate with the Broker to read it in, and then replace itself with the new object.
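The lazy-loading idea can be sketched in Python as shown below. Note one deliberate difference from the description above: Python has no direct equivalent of a Smalltalk object replacing itself, so this hypothetical `Proxy` keeps the loaded object and forwards messages to it instead. The `loader` callable stands in for the collaboration with a Broker, and the `loads` list exists only to show when loading happens.

```python
class Proxy:
    """Placeholder that loads the real object the first time it is used."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable, e.g. a Broker lookup
        self._target = None
        self._loaded = False

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself.
        if not self._loaded:
            self._target = self._loader()
            self._loaded = True
        return getattr(self._target, name)


class Address:
    """Hypothetical domain class."""

    def __init__(self, city):
        self.city = city


loads = []  # records each simulated database read


def load_address():
    loads.append("address 10")
    return Address("Raleigh")


address = Proxy(load_address)
loads_before_use = len(loads)  # nothing has been read yet
city = address.city            # first use triggers the load
city_again = address.city      # later uses reuse the loaded object
```

The object holding the proxy never needs to know whether the address has been read yet, which is what keeps unused data out of memory.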
Other topics addressed by the dynamic patterns included handling database transactions and the order in which connected objects must be written to or restored from the database.
As we mentioned previously, there were many issues we discovered that were not specific to Smalltalk, or even OO in general, but were rather applicable to any client-server system. Two of these patterns were Client Synchronization and Cache Management.
Pattern: Client Synchronization
Problem : How do you handle resynchronizing the client image and database when there are errors? What do you do if you change the value of data held in the client's memory and the corresponding request to the database fails?
One solution is to just note the error to the user and flush any cached information. In this case any error is deemed to be catastrophic and you must start a new session. This is not a very robust solution, but it is a quickly implementable one.
A second solution is a playback mechanism that has a logging facility. Each change is logged in a local log. If there is a failure, the cache is flushed and you replay all the events as needed. This solution is more robust, but it is not trivial to implement.
Solution : Mark the objects appropriately as deleted, added or updated during the session. If the update to the database succeeds then remove the mark. If it fails then retry the transaction. If it continually fails (e.g., times out) note the error and flush the cache. With the changed objects marked it is possible to recover to the original state by filing out the changed objects to local storage and performing recovery at a more propitious time.
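The marking and retry logic of this solution might be sketched as follows in Python. The `Mark` enumeration, the `Session` class, the `write` callable, and the retry limit are all assumptions for the example. Filing out the remaining objects to local storage is left to the caller.

```python
from enum import Enum


class Mark(Enum):
    ADDED = "added"
    UPDATED = "updated"
    DELETED = "deleted"


class Session:
    """Tracks marked objects and retries failed writes a bounded number of times."""

    def __init__(self, write, max_attempts=3):
        self.write = write  # hypothetical callable(obj, mark); raises OSError on failure
        self.max_attempts = max_attempts
        self.pending = {}  # oid -> (obj, mark)

    def mark(self, oid, obj, mark):
        self.pending[oid] = (obj, mark)

    def commit(self):
        """Write every marked object; return the marks that could not be written."""
        for oid, (obj, mark) in list(self.pending.items()):
            for _ in range(self.max_attempts):
                try:
                    self.write(obj, mark)
                except OSError:
                    continue           # failed: retry
                del self.pending[oid]  # succeeded: remove the mark
                break
        return dict(self.pending)


attempts = {}


def flaky_write(obj, mark):
    """Simulated database: order-1 fails once, order-2 always fails."""
    attempts[obj] = attempts.get(obj, 0) + 1
    if obj == "order-2" or (obj == "order-1" and attempts[obj] < 2):
        raise OSError("database unavailable")


session = Session(flaky_write)
session.mark(1, "order-1", Mark.UPDATED)
session.mark(2, "order-2", Mark.ADDED)
failed = session.commit()
```

Whatever `commit` returns is exactly the set of changed objects that would be filed out for later recovery.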
Pattern: Cache Management
Problem : How do you best manage the lifetime of persistent objects stored in an RDB? Caches can increase client performance, but they also increase client memory size. Caches can become out of date, necessitating frequent updates. Caching also generally increases application complexity.
Solution : Use a Session object that has a bounded lifetime and is responsible for identity cache management of a limited set of objects. Balance speed vs. space by flushing the cache as appropriate. Use a query before write (timestamp) technique to keep caches accurate.
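A possible reading of this solution is sketched below in Python with SQLite. The `Session` class, the dictionary-based objects, the `max_size` bound, and the use of an integer `version` column in place of a real timestamp are all assumptions made to keep the example short.

```python
import sqlite3
from collections import OrderedDict


class StaleObjectError(Exception):
    """Raised when the database row changed after the object was cached."""


class Session:
    """Identity cache with a size bound and a query-before-write check."""

    def __init__(self, conn, max_size=100):
        self.conn = conn
        self.max_size = max_size
        self.cache = OrderedDict()  # oid -> cached object, oldest first

    def find(self, oid):
        if oid in self.cache:
            self.cache.move_to_end(oid)  # same oid always yields the same object
            return self.cache[oid]
        row = self.conn.execute(
            "SELECT name, version FROM customer WHERE oid = ?", (oid,)
        ).fetchone()
        if row is None:
            return None
        obj = {"oid": oid, "name": row[0], "version": row[1]}
        self.cache[oid] = obj
        if len(self.cache) > self.max_size:
            self.cache.popitem(last=False)  # trade speed for space: evict the oldest
        return obj

    def save(self, obj):
        # Query before write: compare the stored stamp with the cached one.
        current = self.conn.execute(
            "SELECT version FROM customer WHERE oid = ?", (obj["oid"],)
        ).fetchone()[0]
        if current != obj["version"]:
            self.cache.pop(obj["oid"], None)  # cached copy is out of date
            raise StaleObjectError(obj["oid"])
        obj["version"] += 1
        self.conn.execute(
            "UPDATE customer SET name = ?, version = ? WHERE oid = ?",
            (obj["name"], obj["version"], obj["oid"]),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (oid INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO customer VALUES (1, 'Acme', 1)")
session = Session(conn)
first = session.find(1)
same_object = session.find(1) is first
# Simulate another client changing the row behind the cache.
conn.execute("UPDATE customer SET name = 'Other', version = 2 WHERE oid = 1")
first["name"] = "Acme Ltd"
try:
    session.save(first)
    stale_detected = False
except StaleObjectError:
    stale_detected = True
```

Ending the session and discarding its cache bounds the lifetime of the cached objects.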
How have we used Crossing Chasms?
Since writing Crossing Chasms we have successfully applied its patterns in a number of different instances. It has proven to be a very useful teaching aid - we subsequently have developed several lectures for classroom use from the pattern language. The structure of the pattern language proved to be a useful framework for discussing the different concepts in object to relational connectivity. The topics of the lectures we developed from the pattern language paralleled the organization of the language. In addition, some of the patterns have been used as a basis for other lecture topics in our classroom education. We have also found that students like having the pattern language as an after-the-fact reference after seeing presentations based on it. In this way, we can present a high-level overview and then allow the students to investigate the deeper issues at their own pace.
Several companies have used the patterns in Crossing Chasms as part of their object-relational architectures as a result of our presenting them as part of our training. We have found that addressing the issues covered in Crossing Chasms early in the development process can preclude many of the missteps that first OO projects often take.
We have also developed a conference tutorial based upon the pattern language and presented it at Smalltalk Solutions '96 in New York. Again, we have had feedback that students appreciate using the pattern language to gain deeper understanding of the issues after the presentation.
The static portion of Crossing Chasms was presented at the PLoP (Pattern Languages of Programs) '95 conference in September 1995. Those patterns have been published in Brown and Whitenack (1996).
Alexander  Christopher Alexander, A Pattern Language, Oxford University Press, 1977.
Brown  Kyle Brown, "Remembrance of Things Past: Layered Architectures for Smalltalk Applications", The Smalltalk Report 4(9): 4-7, 1995.
Brown  Kyle Brown and Bruce Whitenack, "Crossing Chasms: The Static Patterns", in Pattern Languages of Program Design Vol. II, Jim Coplien, Douglas Schmidt and Norman Kerth, Editors. Addison-Wesley, 1996.
Coad  Peter Coad and Mark Mayfield, Object Models: Strategies, Patterns and Applications, Yourdon Press, 1995.
Gamma  Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.
Jacobson  Ivar Jacobson, et al., Object-Oriented Software Engineering: A Use-Case Driven Approach, Addison-Wesley, 1992.
Rumbaugh  James Rumbaugh, Michael Blaha, William Premerlani, Frederick Eddy, William Lorensen, Object-Oriented Modeling and Design, Prentice-Hall, 1991.