Friday, August 25, 2006

I Got Voted off the Software Engineering Island?

It is very easy to be negative about interfaces. 

First, an interface demands that the implementing class provide logic for each and every method in the interface, but in return it gives the implementing class no mechanism for creating that logic.  Each time that you set out to create a class that implements the interface, it is a brand-new ballgame.  The only inheritance that you have is the "cut and paste" variety.  Implementing an interface is not for the faint of heart.

Second, even when you have implemented all of the methods, if the interface changes, everything that touches the interface has to change right along with it.  Compliance with an interface is a binary thing: it is either completely implemented or not.  There is no partial compliance with an interface; no "the check's in the mail" promise to buy yourself out of trouble temporarily.  This means that if you do change an interface that has gotten out into the wild, every program that is involved with this interface must change immediately.  That is often very hard to do. 
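To make the all-or-nothing rule concrete, here is a minimal sketch in Python, with an abstract base class standing in for an interface.  The names `Document`, `get_title`, `get_author` and `MemoDocument` are hypothetical and chosen only for illustration.  In Python the failure shows up when the class is instantiated rather than at compile time, but the effect is the same: adding one method to the interface breaks an implementation that used to be complete.

```python
from abc import ABC, abstractmethod


class Document(ABC):
    """Hypothetical interface; get_author was added after the first release."""

    @abstractmethod
    def get_title(self) -> str: ...

    @abstractmethod
    def get_author(self) -> str: ...  # the newly added method


class MemoDocument(Document):
    """Written against the old interface: it only provides get_title."""

    def get_title(self) -> str:
        return "Memo"


def try_create() -> str:
    """Report whether the out-of-date implementation can still be used."""
    try:
        MemoDocument()
        return "created"
    except TypeError:
        # Python refuses to instantiate a class that leaves
        # an abstract method unimplemented.
        return "rejected"
```

There is no way for `MemoDocument` to be "mostly" compliant; until it supplies `get_author`, it cannot be used at all.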

All of this, coupled with the caveats listed in "So You Think You Can Dance", might lead you to think that interfaces are more trouble than they are worth: it is time to vote them off the island.  However, interfaces are like almost anything else: used correctly and in the proper proportions, they can add a great deal to the application.

My purpose in writing this entry is to cover some of the positive aspects of interfaces and offer some advice about how to use interfaces in a way that reduces the number of negative effects.

One of the major benefits of using interfaces is that they can appear anywhere.  A given class can implement dozens of interfaces (although that is reason for suspicion about that class).  Most importantly, because many programming languages do not support multiple inheritance (for many good reasons), the only way to get common behavior defined in different object hierarchies is to introduce interfaces.  This is one of the ways by which one can get polymorphic behavior across multiple object hierarchies.
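A small sketch of what that looks like, assuming two unrelated hierarchies (`Shape` and `Employee`) and a hypothetical `Printable` interface; none of these names come from the document management system.  Python does permit multiple inheritance, so the abstract base class here is only a stand-in for an interface in a single-inheritance language.

```python
from abc import ABC, abstractmethod


class Printable(ABC):
    """Hypothetical interface: anything that can render itself as text."""

    @abstractmethod
    def render(self) -> str: ...


class Shape:
    """Root of the first hierarchy."""


class Employee:
    """Root of a second, unrelated hierarchy."""

    def __init__(self, name: str) -> None:
        self.name = name


class Circle(Shape, Printable):
    def render(self) -> str:
        return "circle"


class Manager(Employee, Printable):
    def render(self) -> str:
        return f"manager {self.name}"


def render_all(items: list[Printable]) -> list[str]:
    # The caller depends only on the interface, not on either hierarchy.
    return [item.render() for item in items]
```

`render_all([Circle(), Manager("Ann")])` treats both objects the same way even though they share no common ancestor other than the interface.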

Ideally, an interface is compact and sharply focused.  One of the major reasons why some developers end up putting in dummy implementations for some of the methods required by an interface is that the interface is too comprehensive; it tries to be too many things to too many people.  A great interface does one and only one thing, and every method in the interface is essential to support that one thing.  If you find that developers are creating dummy methods to satisfy the requirements of an interface, it is time to rethink the interface to sharpen the focus and make it more compact.  Because it is possible for a given class to implement multiple interfaces, there is very little reason to keep an interface that addresses two or more separate "topics" as a single interface.
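As a rough illustration, assume a storage interface that originally bundled reading and writing together.  Split into two hypothetical interfaces, `Readable` and `Writable`, a read-only class no longer needs a dummy `write` method, and a class that supports both topics simply implements both.

```python
from abc import ABC, abstractmethod


class Readable(ABC):
    """Hypothetical interface covering one topic: reading."""

    @abstractmethod
    def read(self) -> str: ...


class Writable(ABC):
    """Hypothetical interface covering one topic: writing."""

    @abstractmethod
    def write(self, text: str) -> None: ...


class ReadOnlyFile(Readable):
    """Supports reading only, so it needs no dummy write() method."""

    def __init__(self, content: str) -> None:
        self._content = content

    def read(self) -> str:
        return self._content


class ScratchBuffer(Readable, Writable):
    """Supports both topics by implementing both small interfaces."""

    def __init__(self) -> None:
        self._parts: list[str] = []

    def read(self) -> str:
        return "".join(self._parts)

    def write(self, text: str) -> None:
        self._parts.append(text)
```

Code that only needs to read can ask for a `Readable` and will accept either class; code that needs to write asks for a `Writable` and never sees an object that merely pretends to support it.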

Although I think that this next recommendation carries a little less weight than the "compact and sharply focused" recommendation, I think that it is very useful to build interfaces that don't require the implementing object to maintain state between calls.  Because an interface cannot specify the order in which its methods should be called, there is a real danger that different consumers of the interface and different implementers of the interface will come to different conclusions about how state is maintained.  Again, because there is no reference implementation for the interface, any possibility for misinterpretation of the intent of the interface should be avoided.

Consider the situation where the interface provides you with a method to perform some operation which may possibly fail.  The operation could return a Boolean value to indicate success or failure.  If the method failed, the consuming class could then call another method on the interface to get back the error message and possibly extensive debugging information.  This is dangerous for two reasons: first, it requires the object implementing the interface to hold state values that may never be used (there is no way to force the consuming class to call the "get error data" methods); second, there may be confusion about how long that state value is to be held.  Other calls to other methods could overwrite that state.  It is much better to create a "results object" that can hold all of the results from the function, including multiple error messages, if they exist, and an indicator of the success or failure of the operation.  Each operation would return an instance of this "return status" object, thus releasing the implementing object from having to maintain that state data.
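A minimal sketch of such a "results object", assuming a hypothetical `DocumentStore` interface with a single `save` operation; the class and field names are illustrative only.  Everything the caller needs to know about the outcome travels back in the return value, so the implementing object keeps no "last error" state.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationResult:
    """Hypothetical results object: success flag plus any error messages."""

    succeeded: bool
    errors: tuple[str, ...] = ()


class DocumentStore(ABC):
    """Hypothetical interface whose operation reports its own outcome."""

    @abstractmethod
    def save(self, name: str) -> OperationResult: ...


class InMemoryStore(DocumentStore):
    def __init__(self) -> None:
        self._names: set[str] = set()

    def save(self, name: str) -> OperationResult:
        errors = []
        if not name:
            errors.append("name must not be empty")
        if name in self._names:
            errors.append(f"'{name}' already exists")
        if errors:
            # All error details go back to the caller; nothing is kept here.
            return OperationResult(False, tuple(errors))
        self._names.add(name)
        return OperationResult(True)
```

A consumer that ignores the errors costs the implementer nothing, and a later call to `save` cannot overwrite the errors from an earlier one, because each call hands back its own immutable result.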

In the case of our document management system, the need to maintain state across multiple methods in the interface can be quite complex.  The sample interface that we provided in "A Not so Simple Example" allowed the client logic to walk through the tree structure that held the table of contents.  While this might on the surface appear to be something akin to a simple enumeration of an array, I have personal knowledge of how complex this can be.  Consider the case where the table of contents is scattered throughout the document in many different files.  Each time that the implementing object returns from a "get next table of contents entry" call, the implementing object has to capture a lot of data about where it was in the document.  It may even be the case that the implementing object has to retrace its steps each time that the "get next table of contents entry" method is called.  It may turn out that it is far better for both the implementing and consuming objects to allow the implementing object to loop through the data structure and call a delegate within the consuming object for each entry.  This does not complicate the consuming object logic very much, if at all, and can vastly simplify the logic within the implementing object.
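Here is a rough sketch of the delegate approach, with a plain Python callable playing the role of the delegate.  The `TableOfContents` interface, the `for_each_entry` method and the nested-tuple tree are assumptions made for this example; they are not the interface from the earlier entry.

```python
from abc import ABC, abstractmethod
from typing import Callable

# The "delegate": called once per entry with its depth and title.
EntryVisitor = Callable[[int, str], None]


class TableOfContents(ABC):
    """Hypothetical interface: the implementer drives the traversal."""

    @abstractmethod
    def for_each_entry(self, visit: EntryVisitor) -> None: ...


class NestedContents(TableOfContents):
    """Holds the entries as a tree of (title, children) pairs."""

    def __init__(self, tree: list) -> None:
        self._tree = tree

    def for_each_entry(self, visit: EntryVisitor) -> None:
        self._walk(self._tree, 0, visit)

    def _walk(self, nodes: list, depth: int, visit: EntryVisitor) -> None:
        for title, children in nodes:
            visit(depth, title)
            # The current position lives on the call stack, so no
            # "where was I?" state survives between entries.
            self._walk(children, depth + 1, visit)
```

The consuming side only has to supply a function, for example `toc.for_each_entry(lambda depth, title: print("  " * depth + title))`, while the implementing side is free to use ordinary recursion instead of saving and restoring its position on every call.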

I believe that "truth is only understood through argument."  What I mean by this is that one has to work through all of the issues and considerations around a particular truth before one can understand it.  The "truth" cannot be merely handed down and accepted blindly.  The development of an interface is very much the same type of thing.  High-quality interfaces are typically the result of a negotiation between the consuming objects and the implementing objects.  Before the definition of an interface is released out into "the wild", the development team should have created two or three objects that implement the interface.  Doing a single implementation is probably not sufficient.  The line of demarcation between the consuming object and the implementing object is somewhat arbitrary.  One could easily add more functionality to the implementing object or move some of that functionality to the consuming object.  It is not until you have created two or three objects that implement the interface that you begin to understand where the commonalities and variabilities are.  In my experience, each new class that implements the interface brings new insights as to how that interface should really be defined.  Even if these are somewhat simple-minded test implementations, they provide valuable insight, particularly if they are done by different people with different points of view.  There is an essential creative tension between the needs of the consuming object and the needs of the implementing object.

Designing interfaces is very much like writing one of these blog entries.  It doesn't just happen; one has to think about the issues, create some form of outline, write the document, review the contents, edit and rearrange the material so that it makes more sense, and only then can one publish the entry.  It is the responsibility of the author of an interface specification to understand the fundamental needs of the consumer class as well as the realities of trying to implement an interface for different situations.  A well-defined interface balances the needs of the consuming object against the difficulties of implementation for the implementing object.  Again, that balance can only be achieved by considering multiple implementations.

A final topic is unit testing of interfaces.  Those of you who have not been lulled into a coma by this time might be holding up your hands, saying, "But wait, there's nothing behind an interface, there is no implementation.  How can you test an interface when there is no implementation?"  While that is true, unit testing is too valuable to give up just because there is no implementation.  How are we going to solve this?

First, we can get around the lack of an implementation by creating an implementation.  I often create "mock" objects to assist in the unit testing.  The mock objects keep track of what parts of the interface have been called.  The unit test sets up the processing object that is to be tested with a reference to my mock object, invokes the methods on the processing object, and queries the mock object to see if things came out the way that I expected.  Writing a mock implementation of the interface gives one more experience in implementing the interface and can yield additional insights.   
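A minimal hand-written mock along these lines, assuming a hypothetical `Notifier` interface and an `OrderProcessor` class as the processing object under test; both names are invented for the example.

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """Hypothetical interface that the processing object depends on."""

    @abstractmethod
    def send(self, recipient: str, message: str) -> None: ...


class MockNotifier(Notifier):
    """Mock implementation: records every call instead of sending anything."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        self.calls.append((recipient, message))


class OrderProcessor:
    """The processing object under test; it knows only the interface."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def complete(self, customer: str) -> None:
        self._notifier.send(customer, "order complete")
```

A test hands the processor a `MockNotifier`, calls `complete("ann")`, and then inspects `mock.calls` to confirm that exactly one message went to the expected recipient.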

Second, we can write a set of unit tests against the interface.  Each test would invoke a method to get an instance of an object that implements the interface.  The logic of the test would be oriented toward the abstract interface rather than a specific class that implements the interface.  For example, one could write a unit test to verify that the "get title" method always returns a non-empty string.  This is a very valid unit test.  Indeed, there are strong advocates of unit testing who would suggest that these unit tests are the only true specification of what the interface should do in its totality.  I certainly wouldn't want to disagree with that sentiment.  What we want to do is to be able to apply these abstract unit tests to a variety of classes that implement the interface.  If we have five classes that implement the interface, we would want to run these same tests against each of these five classes.  We could make copies of the test class, each with a different implementation of the method that returns the object implementing the interface, but that would be wrong: every copy would then have to be kept in sync by hand.

Third, we can convert the class with the abstract unit tests into an abstract base class.  Each of these abstract test cases is marked with the [Test()] attribute.  The base class as a whole is NOT marked with the [TestFixture()] attribute.  We convert the "get instance of an object that implements the interface" method into an abstract method that must be implemented in a derived child class.  The abstract method returns an instance of an object that implements the interface.  Each of the abstract test methods in the base class would apply its tests to the return value from the abstract method.  Obviously these tests would have to be somewhat generic and general in their approach, but they would still serve as a fairly definitive statement of what the minimum requirements are for each object that implements the interface.

Fourth, we create a derived class for each different class that implements the interface.  This class would be marked as a [TestFixture()].  Each derived child class would implement the "get instance of interface" method to return an instance of the object that implements the interface.  Each child class could also include tests that are more specific to the particular implementation.
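The third and fourth steps can be sketched with Python's standard `unittest` module rather than the [Test()] and [TestFixture()] attributes mentioned above.  This is an analogue under stated assumptions, not the original code: the base class of shared tests does not derive from `unittest.TestCase`, which plays the role of leaving off [TestFixture()], and the names `Titled`, `Report`, `Letter` and `create_instance` are hypothetical.

```python
import unittest
from abc import ABC, abstractmethod


class Titled(ABC):
    """Hypothetical interface with a 'get title' method."""

    @abstractmethod
    def get_title(self) -> str: ...


class Report(Titled):
    def get_title(self) -> str:
        return "Quarterly report"


class Letter(Titled):
    def get_title(self) -> str:
        return "Dear reader"


class TitledContract:
    """Interface-level tests. Not a TestCase, so it never runs on its own."""

    def create_instance(self) -> Titled:
        raise NotImplementedError  # each derived fixture must supply this

    def test_title_is_non_empty_string(self) -> None:
        title = self.create_instance().get_title()
        self.assertIsInstance(title, str)
        self.assertTrue(title)


class ReportTests(TitledContract, unittest.TestCase):
    def create_instance(self) -> Titled:
        return Report()


class LetterTests(TitledContract, unittest.TestCase):
    def create_instance(self) -> Titled:
        return Letter()


def run_contract_tests() -> unittest.TestResult:
    """Run the shared tests once per implementation and return the result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(case) for case in (ReportTests, LetterTests)
    )
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Adding a third implementation means adding one small derived class; the shared tests are written once, and implementation-specific tests can sit alongside `create_instance` in the derived class.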

This form of unit testing contributes to the negotiation process that tightens up the interface definition.

In our next couple of entries, I'm going to present a rewritten interface that reflects some of these recommendations.

