Thursday, August 03, 2006

A journey through the stages

I want to continue on the theme of my last “serious” entry in this blog, concerning the stages that data might go through on their journey.

Everyone who puts material like this together comes to the issue of selecting an example to highlight the concepts involved.  If one takes an example which is very simple, the reader might get bored very quickly.  On the other hand, if one takes an example which is very complex, the reader might spend more time trying to understand what the example is all about and possibly miss the concepts that the author is trying to present.  I’m going to choose a very simple example in the hope that my readers will be able to apply this set of concepts to more complex examples.

Let’s suppose that we are entering an address.  Because the example is not as important as the concepts behind it, I’m assuming this is a United States address.  This address might be part of a much larger document that the end-user is trying to assemble.  For our purposes, I would assume the data in this address is going to go through three stages:

  • Under assembly: the user is gathering the bits and pieces of the address.  Some of the fields might not be filled in yet, and some of the filled-in fields may hold data of the wrong type.
  • Complete: the user has entered data in each of the required fields and that data is of the correct data type.  That is, for example, the ZIP code contains numbers rather than letters.
  • Validated: the application has verified that all the data in the fields is valid.  For example, the ZIP code is an existing ZIP code and it matches the state that has been entered.
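As a rough sketch of how these three stages might be told apart in code, here is a small Python function.  The field names, the `Stage` enum, the `stage_of` function and the `is_verified` callback are all made up for illustration; the callback stands in for whatever real lookup the application would use to verify an address.

```python
import re
from enum import Enum
from typing import Callable


class Stage(Enum):
    UNDER_ASSEMBLY = 1
    COMPLETE = 2
    VALIDATED = 3


# Assumed set of required fields for a United States address.
REQUIRED_FIELDS = ("street", "city", "state", "zip_code")

# Five digits, optionally followed by a dash and four more digits.
ZIP_FORMAT = re.compile(r"\d{5}(-\d{4})?")


def stage_of(address: dict, is_verified: Callable[[dict], bool]) -> Stage:
    """Classify an address into one of the three stages."""
    # Any missing required field keeps the address under assembly.
    for name in REQUIRED_FIELDS:
        if not address.get(name, "").strip():
            return Stage.UNDER_ASSEMBLY
    # A filled-in field of the wrong type also keeps it under assembly.
    if not ZIP_FORMAT.fullmatch(address["zip_code"].strip()):
        return Stage.UNDER_ASSEMBLY
    # Complete; whether it is also validated is up to the external check.
    return Stage.VALIDATED if is_verified(address) else Stage.COMPLETE
```

The verification step is passed in rather than hard-coded because the stages only say that verification happens, not how.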

Let me explore these three stages in more detail.

During the “under assembly” stage, we expect that the user can do pretty much anything he or she wants to do.  The application presents a form (Web or Windows based) and the user can enter any kind of data into its fields.  If the user saves the data, the data is saved and later retrieved without any kind of check on formats or lengths.

During the “complete” stage, we expect that the user will enter only data that matches the format requirements of the application. The ZIP code will contain only numbers and possibly a dash.  The application might impose some maximum lengths on the input strings.

During the “validated” stage, the user has probably completed the data entry and is moving on to other parts of the overall document.

One of the issues at this point is whether or not we’re going to allow the user to regress from one stage to another.  If the address is in the “validated” stage, we could allow the user to enter “strange data” that would force us to retreat to an earlier stage such as “complete” or even “under assembly”.  I think it’s more likely that we would put in some kind of latching mechanism that would prevent the user from such a retreat.  This is a point, though, for the people gathering and documenting requirements.
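One way such a latch might look, as a minimal sketch in Python: an object remembers the highest stage reached and rejects any change that would lower it.  The names `StageLatch` and `propose` are invented here, and whether a rejected edit should be discarded, flagged or escalated is exactly the kind of question the requirements would have to answer.

```python
from enum import IntEnum


class Stage(IntEnum):
    UNDER_ASSEMBLY = 1
    COMPLETE = 2
    VALIDATED = 3


class StageLatch:
    """Remembers the highest stage reached and refuses to go back."""

    def __init__(self) -> None:
        self.stage = Stage.UNDER_ASSEMBLY

    def propose(self, resulting_stage: Stage) -> bool:
        """Accept an edit only if the stage it leads to is not a retreat.

        Returns True and records the stage when the edit is accepted,
        False (leaving the recorded stage alone) when it would regress.
        """
        if resulting_stage < self.stage:
            return False
        self.stage = resulting_stage
        return True
```

The caller would work out what stage an edit would leave the address in, ask the latch, and apply the edit only if the latch agrees.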

If we step back far enough, the application in each of these stages is dealing with the same abstract object.  It has the properties for the various piece parts of the address.  It probably has methods for determining if the address is complete and if the address is valid.  Let’s assume that it also has a method for saving itself to the database.  I’m not sure I would do this in a real-world application, but for the purposes of this discussion let’s assume that that’s how it’s being done.  The question I want to think about out loud is how we create a design that satisfies this set of requirements.

Let me take a moment to look at the user experience.  Although there are a number of possible different stages for the data, I would be surprised if we didn’t present the same interface to the user for data entry and modification at every stage of the way.  If I were a user, I would start out entering what data I had.  I might save the data at various points along the way.  It would be very nice if I got some kind of visual cue about the quality of the data, ideally on a field-by-field basis.  For example, I’ve built applications in which the background color of the individual fields indicated the quality of the data within the field.  Missing but required data would have a background color of pink, supplied but invalid data would have a background color of yellow, and valid or optional data would have a background color of white.  The user could tell at a glance what the state of the data was.  A particular text box would start out as pink, progress through yellow, and ultimately end up with a white background.  The change in background color might happen dynamically as I typed data in, or it might happen only once I submit the data for some form of processing.  I also assume that there would be some way for me to understand what the problem with a particular field was.  For example, it might be a tool tip that is displayed when the mouse hovers over the text box; there are other ways that are more or less intrusive on the user experience.  As an aside, I much prefer this style of feedback over pop-up message boxes.
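The color rule described above is small enough to sketch in Python.  The function and parameter names are hypothetical, and one detail is an assumption on my part: an optional field that has been filled in with bad data is shown as yellow, the same as a required one.

```python
import re
from typing import Callable

PINK, YELLOW, WHITE = "pink", "yellow", "white"


def background_color(value: str, required: bool,
                     is_valid: Callable[[str], bool]) -> str:
    """Pick a background color from the quality of one field's data."""
    if not value.strip():
        # Nothing entered: only a required field needs attention.
        return PINK if required else WHITE
    # Something entered: white if it passes the field's check, else yellow.
    return WHITE if is_valid(value) else YELLOW


def zip_looks_valid(value: str) -> bool:
    """Format check only: 12345 or 12345-6789."""
    return re.fullmatch(r"\d{5}(-\d{4})?", value.strip()) is not None
```

A required ZIP code box would then go from pink (empty) through yellow (say, `1234`) to white (`12345`) as the user types, with the presentation layer doing nothing more than applying the returned color.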

I make the assumption here that we have separated the presentation logic from the business logic.  Regardless of whether we use a model-view-controller approach or simply have the presentation layer interact directly with a domain layer object, I assume the presentation layer is very thin and has only the responsibility for moving data between the business layer and the presentation controls.

Now let’s take a look at the viewpoint of the part of the application that wants to validate the contents of the address.  The question to think about here is whether the validation logic should be interested in an address that is incomplete.  In other words, the issue is what we are validating: the individual fields of the address or the address as a whole.  I think both are valid viewpoints, depending upon the needs of the application.  We would have to go back and look at the use cases and requirements in general to understand what the requirement was.  If the data is coming in from a programmatic source such as a Web service or a file, we might well treat the address as a whole.  If the data is coming in interactively from a webpage or Windows form, we would probably want to provide as much feedback to the user as possible each time that we interacted with the user, and therefore we would treat individual items of the address independently of each other, holding back on cross-field validation such as matching the state to the ZIP code until the address as a whole was complete.
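For the interactive case, the idea of checking fields independently and holding back the cross-field check can be sketched in Python as follows.  Everything here is illustrative: the function names, the field names, the wording of the messages, and above all `STATE_BY_ZIP`, a two-entry stand-in for a real postal lookup.

```python
import re

REQUIRED_FIELDS = ("street", "city", "state", "zip_code")
ZIP_FORMAT = re.compile(r"\d{5}(-\d{4})?")

# Tiny stand-in for a real ZIP code database (an assumption of this sketch).
STATE_BY_ZIP = {"10001": "NY", "94105": "CA"}


def field_problems(address: dict) -> dict:
    """Check each field on its own; returns {field name: message}."""
    problems = {}
    for name in REQUIRED_FIELDS:
        if not address.get(name, "").strip():
            problems[name] = "required"
    zip_code = address.get("zip_code", "").strip()
    if zip_code and not ZIP_FORMAT.fullmatch(zip_code):
        problems["zip_code"] = "expected 12345 or 12345-6789"
    return problems


def validate(address: dict) -> dict:
    """Field checks first; the cross-field check runs only when they pass."""
    problems = field_problems(address)
    if problems:
        return problems  # address not complete yet, hold back the rest
    zip_code = address["zip_code"].strip()
    state = address["state"].strip().upper()
    expected_state = STATE_BY_ZIP.get(zip_code[:5])
    if expected_state is None:
        return {"zip_code": "unknown ZIP code"}
    if expected_state != state:
        return {"zip_code": "does not match the state"}
    return {}
```

A programmatic source such as a Web service could call `validate` once and treat any non-empty result as a rejection of the whole address, while a form could call `field_problems` on every interaction to drive per-field feedback.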

Here is where the design starts to get a little bit interesting.  Conceptually we have a single entity that is the address.  There are multiple viewpoints of this single entity and there are behaviors (“methods”) that are only relevant to certain viewpoints.  The question in my mind is do we create a single object that exhibits all of this behavior or do we create multiple related objects including some mechanisms by which the data can be moved back and forth appropriately.  Let me add to the complexity of the problem by suggesting that the validation logic might adjust the data in an address.  For our particular example the validation logic might look up the street address and determine the full “five plus four” ZIP code as part of the validation process.  I’m going to assume that we want to show this adjusted data to the user once the input address has been validated.  In other words dataflow is not just one way; there are multiple sources of input. 
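To make the “validation adjusts the data” point concrete, here is one possible shape for it in Python, leaning toward the multiple-objects option: validation never changes the address the user entered but hands back an adjusted copy.  This is only one of the designs under discussion, not a recommendation.  The `Address` class and `validate` function are hypothetical, and `ZIP_PLUS_FOUR` holds made-up data standing in for a real postal lookup service.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Address:
    street: str
    city: str
    state: str
    zip_code: str


# Made-up sample data; a real application would query a postal service.
ZIP_PLUS_FOUR = {("1 EXAMPLE ST", "12345"): "12345-6789"}


def validate(entered: Address) -> Optional[Address]:
    """Return an adjusted copy with the full ZIP code, or None if unknown."""
    key = (entered.street.strip().upper(), entered.zip_code.strip()[:5])
    full_zip = ZIP_PLUS_FOUR.get(key)
    if full_zip is None:
        return None
    # The entered address is frozen; replace() builds a new object.
    return replace(entered, zip_code=full_zip)
```

Because the entered and the validated address are separate objects, the presentation layer can show the adjusted ZIP code to the user and still know what was originally typed.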

I am going to take a break at this point because I have to wander off and get some breakfast and go to work in the “real” world.  In my next entry I will take a look at the tools in our design “toolbox” and see how each of these tools might be applied to this problem.

