Interviews by Stephen Ibaraki, I.S.P., DF/NPA, CNP, MVP
Paul Bassett, I.S.P. (ret.): Leading Software Engineer and Computer Science Authority
This week, Stephen Ibaraki has an exclusive interview with Paul Bassett, a leading software engineer and Computer Science authority.
Paul Bassett has given keynote addresses around the world, and was a member of the IEEE’s Distinguished Visitor Program from 1998 to 2001. He was General Chair for the Symposium on Software Reuse, held May 18-20, 2001, in conjunction with the International Conference on Software Engineering.
Ed Yourdon called Paul's book, Framing Software Reuse: Lessons from the Real World (Prentice Hall, 1997), "the best book about reuse I've seen in my career." DeMarco and Lister republished his 1987 IEEE paper, Frame-based Software Engineering, in their compilation of the 27 most significant papers of that decade. Paul also co-authored the IEEE’s Standard P1517: Software Reuse Lifecycle Processes.
Paul has over 35 years of academic and industrial software engineering experience. He taught computer science at York University for seven years, co-founded Sigmatics Computer Corporation and Netron Inc. (two ongoing software engineering companies), and for over twenty years he helped governments and businesses (a partial list: the US and Canadian federal governments, The Hudson’s Bay Co., IBM, Fiserv, TD Bank, Ameritech, Union Gas, Teleglobe Insurance, Noma Industries) to improve their software development tools and techniques. He has an M.Sc. (U. of Toronto Computer Science), and is a CIPS information systems professional (retired). He is currently a senior consultant with the Cutter Consortium http://www.cutter.com/index.shtml.
Paul received the Information Technology Innovation Award from the Canadian Information Processing Society (CIPS) for his invention of frame technology. He later co-chaired their Certification Council, was a member of their Accreditation Council, and now helps to accredit honours programs in computer science and software engineering, as well as chairing the CIPS Committee on Software Engineering Issues.
The latest blog on the interview can be found during the week of April 24th in the Canadian IT Managers (CIM) forum, where you can provide your comments in an interactive dialogue.
Q1: Paul, what are software engineering’s chronic concerns and how can you cure them?
Since our industry’s inception over 50 years ago, software systems have been notoriously late, over budget, and of mediocre quality. Already low, programmer productivity has actually worsened since “object oriented” programming techniques went mainstream [according to project auditor Michael Mah of QSM Associates, in a private communication]. In what looks like a desperation move, organizations have been in a race to replace their expensive local software talent with cheaper offshore labour. Yes, this tactic can save money in the short run, but off-shoring does nothing for our industry’s chronic inability to deliver quality results in a timely fashion. Indeed, it could well make matters worse.
Clearly, something is wrong with this picture. We don’t import food from countries that till their fields with water buffaloes. Why? Because modern agribusinesses have made the transition from being inefficient craft industries to producing cornucopias of low cost, high quality food. The software industry can and should do likewise. Why? Because, as we shall see, a key enabler – so-called frame technology (FT) with its associated processes and infrastructure – has been fulfilling this dream in diverse commercial settings for more than twenty years! What is wrong with this picture is our penchant for rejecting ideas that challenge deeply held core values, even when those ideas can cure chronic concerns.
I’ll illustrate by challenging a cherished myth of our craft: only human programmers can modify software intelligently. The fact is, people are poorly suited for making technical code changes, especially when the changes are both numerous and interrelated. This is precisely the situation that pertains to the cascades of program changes that are logically entailed by altered requirements. It should come as no surprise that machines can excel at fussy, detailed symbol manipulation. Several frame technologies now exist that are designed to formalize software adaptability. [For detailed comparisons of several FTs, see http://www.informatik.fh-kl.de/~eisenecker/polite/polite_emtech.pdf and http://www.cs.york.ac.uk/ftpdir/reports/YCS-2003-369.pdf. Two FTs I know well are a free-ware FT called XVCL (XML-based Variant Configuration Language) http://sourceforge.net/projects/fxvcl and Netron-Fusion (www.netron.com), a commercial tool suite that includes an FT.] FTs modify software much faster than human programmers, and they do so tirelessly and without error, not to mention much more cheaply than the off-shore variety. What sounds too good to be true is that software’s entire life cycle benefits, not just construction. But more on this later.
An analogy helps explain how a typical FT works: Imagine a generic fender, one that can be automatically fitted to any make or model of car while it is being assembled. Such adaptability, applied to each type of part, would eliminate combinatorial explosions of part variants; conventional manufacturing would be revolutionized. Alas, such parts cannot exist in the physical world. But such adaptive components, aka frames, make perfect sense in the abstract world. Each frame can arbitrarily adapt any of its sub-component frames, and conversely, be adapted by other frames. FT assembles each software product from its parts in a manner analogous to conventional manufacturing, but because FT exploits adaptability far beyond what physical parts can have, it is a technology with demonstrated revolutionary potential.
Another way to understand FT: Most word processors can convert form letters into personalized variants. Generalize this 2-level approach to handle any number of levels, allow any kind of logic to control each adaptation, and allow any kind of information, including control logic, to be adapted at each level. Now you have a frame technology. Note: a FT allows you to use any language(s) to express yourself, including English.
The universal stratagem for creating frames is called same-as-except: How often have you heard: “Y is like X, except …”? “Binoculars are like telescopes except …” “Programming a VCR is like setting a digital alarm clock, except …” This manner of knowing things is ubiquitous, and drives the design of all frames. Making a frame is easy: take the text of X and treat its details as defaults that variants may override.
X is now a frame such that an unlimited number of Y frames can adapt it by overriding just those defaults that Y needs to change (to suit Y’s context). The idea generalizes: “Z is like Y, except …” where Z’s exceptions can override details of both X, and Y’s exceptions to X. Also possible: “Y is like W + X, except …” And on it goes.
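The same-as-except mechanism can be sketched in a few lines of code. The sketch below is purely illustrative: the Frame class, its names, and the override-with-None convention are assumptions invented for this example, not features of any particular FT product.

```python
# A minimal sketch of "same-as-except", assuming frames hold named details
# and a variant frame overrides only the defaults it needs to change.

class Frame:
    """A frame: named default details that variant frames may adapt."""
    def __init__(self, parent=None, **overrides):
        self.parent = parent        # the frame this one adapts ("Y is like X, except ...")
        self.overrides = overrides  # the exceptions: details changed or added

    def resolve(self):
        """Assemble the finished result: ancestors' defaults, overridden bottom-up."""
        details = self.parent.resolve() if self.parent else {}
        details.update(self.overrides)
        # In this sketch, an override of None means "delete this default".
        return {k: v for k, v in details.items() if v is not None}

# X supplies the defaults; Y adapts X; Z adapts Y (and, through Y, X).
X = Frame(greeting="Dear Customer", body="Thank you for your order.", footer="Acme Inc.")
Y = Frame(parent=X, greeting="Dear Valued Customer")  # Y is like X, except the greeting
Z = Frame(parent=Y, footer=None, ps="See you soon!")  # Z is like Y, except no footer, plus a P.S.

print(Z.resolve())
# {'greeting': 'Dear Valued Customer', 'body': 'Thank you for your order.', 'ps': 'See you soon!'}
```

The point of the sketch is that each level may add, modify, or delete details of the levels below it, and the adaptation happens mechanically at assembly time.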
A small group of frame engineers design, build, and evolve frames. They do these tasks in concert with application developers who design, assemble, and evolve systems out of frames. Developers, in turn, work in concert with users and other system stakeholders who debug their requirements by operating said systems. And the whole process iterates. Indefinitely.
Over time, a library of mature frames evolves that captures more and more of the organization’s intellectual assets, leaving less and less to be done from scratch. Most user requests become small deltas from frameworks that are already on the shelf. At that point all you need is a small core of frame engineers, together with appropriate IT professionals, who understand business needs well enough to resist frivolous or counterproductive requests.
Even better, only the most expert solutions are captured in standard frames, thereby maximizing the quality of every application. Software development is transformed from a craft to an exercise in asset investment and management. And those assets continue to produce ROIs for the organization long after the originators have been promoted or hired away.
Other life-cycle benefits include: requirements-gathering becomes a process of test-driving industrial-strength prototypes, settling on what features need to be added, modified, or deleted, and iterating on this two-step cycle until the results are accepted by all stakeholders. This approach allows users and technical professionals to understand each other with a speed and a precision that conventional requirements gathering cannot match.
Testing: While you must test thoroughly and at every level just as you always have, the likelihood of errors is dramatically lowered. Moreover, most errors that do occur can be diagnosed and repaired much faster because the causes of error are almost always localized in the system’s novel elements – typically 5% - 15% of the total code-base that is permanently segregated from the other 85% - 95% that comes from robust frames, tested and debugged in earlier systems. The combined effect of fewer errors and faster repairs is that far less time and money is spent in the testing phase.
Last, but certainly not least, let me mention maintenance, the phase of the life cycle that historically accounts for 80% of system costs. The frame paradigm eliminates application maintenance as a distinct methodology and mind-set because the system is engineered to evolve from the outset. Remember that you only maintain the novel parts of the system (5% - 15%). Whenever a frame needs to evolve, the changes can always be expressed invisibly to existing systems. Hence, you are never forced to retrofit. On the other hand, should you want to retrofit, the “where used” list tells you exactly where to rebuild and retest. Even better, a change that is needed by all systems is often localized in a single frame – make the change once, then rebuild, recompile and retest all the instances of use.
Because the anecdotal evidence sounded too good to be true, I asked software project auditors, QSM Associates, to conduct an objective study. It was fully funded by nine corporate participants (most were named in the report). QSM compared 15 FT projects to industry norms, finding that on average project schedules shrank by 70% and costs were cut by 84%. QSM has seen nothing like these statistics for any other approach, before or since, not to mention improvements in quality, customizability, maintainability, reliability, portability, etc.
As I mentioned above, what’s wrong with this picture is the challenge to our core values. FT not only flies in the face of our conceit (that we are the only agents smart enough to modify programs), it also threatens jobs – both managerial and technical. The number of programmers will continue to shrink, just as we need far fewer farmers now than in decades past even though demand for food is higher than ever. The good news is that those willing and able to thrive with automated adaptability can repatriate the wherewithal to resume our place as cost-effective IT professionals.
Q2: Paul that was an intriguing discussion about Frame Technology, and about why, in spite of independent evidence that it is highly effective, the approach has not spread virally through the software industry. You also mentioned OO a couple of times. I would really like to know how the Frame approach compares to the Object approach. Are they collaborators or competitors?
Stephen, you’ve asked a very interesting question. When OO started becoming popular I wondered the same thing. My examination of the ideas underlying the two approaches shed light on a number of important issues that I would like to share with you.
The short answer to your question is that frames and objects are in “coopetition”; that is they are both co-operators and competitors. Frames and objects work together successfully, using languages such as Java, C++ and C#. Unfortunately, the OO paradigm creates enormous amounts of what I call gratuitous complexity. So much so that you can climb considerably higher mountains with just FT applied to any decent 3GL in a runtime environment that supports dynamically linked libraries (DLLs). In particular, FT offers a simpler, more robust way to define the architectures of complex systems, such as enterprise systems, and web-based interoperable services requiring high security and high performance.
Q3: Those are very provocative claims, Paul. I would like to see you defend them.
In order to do that, I need to review some of OO’s fundamental principles. Class hierarchies partition information domains using the "isa" relationship – a retail customer is a kind of customer is a kind of agent. “X isa Y” really means X is a proper subset of Y. That is, classes are analogues of sets in set theory, which is why a subclass must inherit all the properties of its super-classes, leading to Scott Guthrie’s famous quote in Dr. Dobb’s Journal: "You get the whole gorilla even if you only needed a banana." Of course, this propensity to inherit too much begat the OO design principle that classes should be kept small.
Q4: I agree, but everyone familiar with OO knows this. Why do you bring it up?
It has two serious consequences. First, any domain has what I call a “natural graininess” – these grains are the agents, data structures, states, and state transitions that people use to understand and design systems. For example, customers, suppliers, products, and services reflect the natural graininess of business applications. Such systems never have more than a few dozen such grains. But the small-class design principle can easily partition domains into many thousands of classes, far below their natural graininess level. This atomization is a major source of gratuitous complexity, limiting the size and sophistication of the systems we can aspire to.
A second consequence of basing classes on set theory is that one can easily drown in a sea of polymorphic look-alikes, a pernicious source of gratuitous redundancy.
Q5: Okay, I’ll bite. What is a polymorphic look-alike?
Suppose you need Paul, an object that is just like an existing object Peter, except that Peter inherits a property that is inappropriate for Paul. To create Paul requires what’s called a polymorph: a new class in which the offending property can be created afresh to suit Paul. Paul’s class may have to be attached considerably higher in the inheritance tree than Peter’s was, causing many properties to be redundantly re-inherited. Peter and Paul may end up being 95% identical, but now you have two different lineages that must be maintained as if they had nothing in common. Often it is very hard to figure out if a given class has precisely the properties you need, or how similar polymorphs actually differ from each other. Multiply this by the number of object types and you drown in a sea of polymorphic look-alikes. This, by the way, is a reincarnation of the fatal problem with '70s-era subroutine libraries. Not coincidentally, OO originated soon after!
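To make the problem concrete, here is a deliberately contrived sketch; all class names are invented for this illustration. It shows how set-style inheritance, which cannot subtract a property, forces Paul into a near-duplicate lineage:

```python
# Illustrative only: a subclass must inherit ALL of its superclasses'
# properties, so a variant that needs one property removed must branch
# off higher up and redundantly re-create everything else it still needs.

class Singer:                        # common ancestor
    def sing(self): return "la la"

class FolkSinger(Singer):            # adds harmonizing and strumming
    def harmonize(self): return "harmonizes"
    def strum(self): return "strums a guitar"

class Peter(FolkSinger):             # Peter fits FolkSinger exactly
    pass

# Paul is 95% like Peter, but strum() is inappropriate for him.  His class
# must attach higher up (at Singer) and re-build the rest from scratch:

class NonStrummingFolkSinger(Singer):            # the polymorphic look-alike
    def harmonize(self): return "harmonizes"     # duplicated, maintained separately

class Paul(NonStrummingFolkSinger):
    pass

# Two nearly identical lineages now evolve as if they had nothing in common.
```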
Q6: Yes, but is it even possible to avoid a problem that has plagued us since the beginning?
Well, let’s look at how FT handles polymorphs. Suppose we have a base frame, Singer, and 3 variants: Peter, Paul, and Mary. Singer is as large as necessary to express the typical properties of singers. Peter expresses deltas to Singer that differentiate Peter from typical singers: subtractions and modifications as well as additions; Paul and Mary do likewise. Suppose each differs from Singer by 5% on average; then you end up maintaining 115% of Singer rather than up to 300%. In OO, gratuitous redundancy grows in proportion to the number of polymorphs; in FT, gratuitous redundancy is easy to avoid from the get-go. Even better, there is no confusion about exactly what properties a polymorph has, and no need to re-factor class hierarchies in order to reduce redundancies.
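The bookkeeping behind those percentages can be pictured with a toy sketch. The delta format and the assemble function below are inventions for this illustration, not the mechanism of any real FT; only the Singer/Peter/Paul/Mary example comes from the discussion above.

```python
# Illustrative only: Peter, Paul, and Mary held as small deltas to one
# Singer base frame, rather than as three full near-copies.

SINGER = {                         # base frame: typical properties of a singer
    "voice": "tenor",
    "repertoire": ["folk"],
    "instrument": "guitar",
    "takes_bows": True,
}

# Each variant expresses only its exceptions: additions, modifications, deletions.
PETER = {"modify": {"voice": "baritone"}}
PAUL  = {"add": {"songwriter": True}}
MARY  = {"modify": {"voice": "soprano"}, "delete": ["instrument"]}

def assemble(base, delta):
    """Apply a variant's delta to the base frame at construction time."""
    obj = dict(base)
    obj.update(delta.get("modify", {}))
    obj.update(delta.get("add", {}))
    for key in delta.get("delete", []):
        obj.pop(key, None)
    return obj

mary = assemble(SINGER, MARY)
# What gets maintained is Singer plus three small deltas (~115% of Singer),
# not three full near-copies (~300%).
```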
Q7: Hmm. I see what you mean and I like what I’m hearing. Are there more fundamental aspects of OO that cause modelling problems?
Yes, at least two more aspects. The first is aggregation, OO’s principal technique for composing objects from objects. These are component relationships, also known as “hasa” relationships. Aggregation spawns two modelling problems: (1) having partitioned a domain with "isa" relationships, we now must further partition it with hasa relationships, thus further atomizing the domain. (2) Aggregation works at run-time, yet typical hasa relationships are static, hence ought to be modelled at construction-time. For example, our arms, legs, and necks are not dynamically linked to our torsos. Linking such components together at run-time is an inaccurate and inappropriate way to model body plans. While I can easily contrive counter-examples, components in general should be seamlessly integrated at construction time, something that goes against the grain of OO.
Q8: So, how does FT handle hasa relationships?
FT makes the hasa relationship its primary partitioning strategy. That is, frames are components of run-time objects, intended for seamless integration with other frames at construction time, where almost all component relationships belong.
Q9: But what about the isa relationships? How do frames handle abstractions?
Abstraction is achieved via generic frame parameters. All this happens at construction time, thereby reducing run-time overheads, and allowing the natural graininess of the domain to be modelled directly. Dynamic linkages can be handled by DLLs, though of course each DLL object or executable can itself be assembled from frames.
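Construction-time parameter binding can be pictured as a toy text substitution. The ?KEY? placeholder syntax and the instantiate function below are invented for this sketch; real FTs offer much richer adaptation commands.

```python
# Illustrative only: a frame's text contains generic parameters that are
# bound at construction time, so the finished component carries no
# run-time abstraction overhead.

import re

ACCOUNT_FRAME = """\
struct ?NAME?Account {
    ?CURRENCY? balance;
};
"""

def instantiate(frame_text, **params):
    """Substitute each ?KEY? placeholder with its construction-time value."""
    return re.sub(r"\?(\w+)\?", lambda m: params[m.group(1)], frame_text)

print(instantiate(ACCOUNT_FRAME, NAME="Savings", CURRENCY="double"))
# struct SavingsAccount {
#     double balance;
# };
```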
Q10: This is interesting. If I heard you right, you are saying there is no double partitioning of information domains, as happens in OO, and that hasa relationships dominate the partitioning. This indeed is very different from the way OO models domains. In addition to avoiding double partitioning, does the hasa-dominated modelling approach offer other advantages?
I claim that hasa is more stable, hence a superior place to begin modelling objects. This is because starting with "isa" requires us prematurely to decide which abstractions to model. Abstractions exist primarily in the minds of modellers, who often disagree as to which abstractions constitute the "essential" or defining attributes of objects. Combine such disputes with the fact that there are an infinite number of ways to generalize any given instance or example, and you can see how easily OO modelling can float off into la-la land. On the other hand, there are only a finite number of ways to cleave a specific object into pieces, and it's much easier to agree on how an object decomposes into parts because it can be cleaved before being abstracted. Moreover, because component relationships are static – typically invariant over the life of an object – they provide a stable skeleton onto which to hang the functional flesh, and provide the places where variants can arise (i.e., abstractions) as a result of demonstrated need.
Q11: Earlier you said there were two more fundamental aspects of OO that cause modelling problems, but you’ve only mentioned one of them: aggregation. What is the other one?
The other one is multiple-inheritance. MI is widely acknowledged to be a problem, but not many people understand why. The reason MI is so problematic in OO comes right back to treating classes as sets. Set theory defines sets in a brittle manner; for an element (object) to be a member (instance) of a set (class), it cannot violate any defining property by even one iota. The analogue of MI in set theory is set intersection. If x is to belong to sets A and B, it must satisfy (inherit) all A's defining rules and all B's defining rules. If A’s and B’s defining rules clash in even the most minor way, the intersection is the null set. In real systems, parent classes come from different lineages, thus can easily have clashing properties. The analogue of a null intersection is a class of incompatible properties. Crossing horses with donkeys produces sterile mules because there are too many incompatible genetic traits.
Q12: Alright, so how do frames avoid this rather universal problem?
Rather than set theory, frames are based on semi-lattices, using the “adapts” partial ordering. The FT analogue of multiple inheritance is frame assembly. Whereas MI is problematic for OO, automated adaptation of frames with clashing properties is the bread-and-butter of FT, exactly what frames are designed to do. Frames embody a very different epistemology – way of knowing – from set theory.
Q13: That’s a real mouthful! Please expand on what you mean using simpler terms.
A frame is an archetype – the dictionary definition of archetype is simply “a model or example from which all similar things are derived.” Archetypes, unlike sets, have undefined boundaries and can even overlap in unplanned ways. A frame sits “at the center” of its fuzzy set, in the sense that all instantiations of a frame represent members of its set. Thus when partitioning domains we don't need to worry about getting the grain boundaries right; that is, the frame may not anticipate all the properties it needs, and the properties it does have may not be "correct" for any particular instance. A rule of thumb is that if you have to adapt (i.e., modify and/or delete) more than 15% of the frame's properties in order to get the instance you want, then there should be a better frame to use as a starting point. Thus, modelling becomes a matter of determining the archetypes that characterize the architecture and connectivity of the domain, using the domain’s natural graininess to model the objects. If you don't know where all the variation points (i.e., abstractions) are, no problem: new ones can be added later “for free,” in the sense that the framing process guarantees that a frame remains compatible with all its prior instantiations. This kind of evolvability allows for a great deal of human fallibility, which is what SE sorely needs.
Final Comment: Thank you, Paul. This interview sheds new light on some very old problems. I can see where the frame approach scales up to levels of complexity that would overwhelm the OO approach. It’s clear that many systems could benefit from frames, and I look forward to hearing more about its successes.