Minimal IT - Minimal integration 1: what is integration?

4 October 2005

System integration is not just about connecting systems, and has nothing to do with networks. It is about balancing the need to get systems to work together with the need to change them independently.

Over the years I have accidentally specialised in systems integration, the art of getting computer systems to work with each other. I have collected a set of approaches and principles which I call "Minimal integration". Over the next few weeks I want to share these with you.

Firstly I want to tackle the seemingly obvious question, "what is integration?"

This is the most critical question in integration. Most of the problems with integration are conceptual, not technical. If we approach integration with the correct mindset, it becomes much simpler and much cheaper. We must understand what we are trying to achieve.

The obvious answer to the question "what is integration?" is that it involves connecting computer systems so that they can work together.

When you connect two systems, it becomes difficult to change them independently. For example, if one system uses the database of another, then any change to that database's data definitions, technology or availability has a knock-on effect on the system that uses it. In large interconnected systems you can get to the stage where you cannot change anything without changing everything.
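To make the shared-database problem concrete, here is a minimal sketch in Python using the standard `sqlite3` module. The "orders" and "billing" systems, the table and the column names are all hypothetical, invented for illustration. Billing reads the orders table directly, so a change to the orders data definition breaks billing even though billing's own code has not changed.

```python
import sqlite3

# Hypothetical example: an "orders" system owns this database,
# and a "billing" system reads its table directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10.0,), (5.5,)])


def billing_total(connection):
    """Billing's query, written against the orders system's internal schema."""
    row = connection.execute("SELECT SUM(amount) FROM orders").fetchone()
    return row[0]


total_before = billing_total(conn)

# The orders system now changes its own data definition:
# amounts are stored as whole pence in a renamed column.
conn.execute("DROP TABLE orders")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_pence INTEGER)")
conn.execute("INSERT INTO orders (amount_pence) VALUES (1550)")

try:
    billing_total(conn)
    billing_broke = False
except sqlite3.OperationalError:
    # SQLite rejects the query because the "amount" column no longer exists.
    billing_broke = True
```

The two systems are connected, but neither can change its data without the other knowing about it.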

To me, integration has to be not merely connection, but connection that allows the systems to continue to be changed independently. This is what we mean by "decoupled" integration. There are four dimensions to decoupling:

Technology. Systems should be unaware of the type and location of technology that supports other systems, and any system should be free to change its technology without affecting others.
Time. Systems should be unaware of the timing of processing on other systems, and any system should be free to change when it runs without affecting others. Sometimes two systems have to work at the same time because they co-operate in a single activity, but this is the exception rather than the rule.
Data. Systems should be unaware of the internal data representations used by other systems, and any system should be free to change its internal data without affecting others.
Process. Systems should be unaware of the processing within other systems, and any system should be free to change how it processes without affecting others.

By this definition, integration is the connection of systems such that differences in technology, time, internal data and processing are hidden. This guarantees that the connected systems can be changed independently.
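As a rough illustration of this definition, the sketch below connects two hypothetical systems, "orders" and "billing", through a queue and an agreed message format. It uses only the Python standard library. The in-process `queue.Queue` stands in for whatever connection technology is really used, and every name and the message layout are assumptions made for the example, not a recommended design.

```python
import json
import queue
from dataclasses import dataclass
from decimal import Decimal

# Stand-in for the connection between the systems. Billing sees only
# this channel, never the technology behind the orders system.
channel = queue.Queue()


@dataclass
class InternalOrder:
    """Private to the orders system; free to change at any time."""
    order_id: int
    amount_pence: int


def publish_order(order, out):
    """Orders side: translate internal data into the agreed public message."""
    message = {
        "order_id": order.order_id,
        # Public format: a decimal string in pounds, whatever is stored internally.
        "amount": f"{order.amount_pence // 100}.{order.amount_pence % 100:02d}",
    }
    out.put(json.dumps(message))


def billing_run(inbox):
    """Billing side: runs whenever it likes and drains whatever has queued up."""
    total = Decimal("0")
    while True:
        try:
            raw = inbox.get_nowait()
        except queue.Empty:
            return total
        total += Decimal(json.loads(raw)["amount"])


# Orders publishes during the day; billing runs later, on its own schedule.
publish_order(InternalOrder(1, 1000), channel)
publish_order(InternalOrder(2, 550), channel)
total = billing_run(channel)
```

Here the queue hides technology and time, and the public message hides internal data and processing. The orders system could rename `amount_pence`, move to another database or publish at a different time of day, and billing would not need to change as long as the message format stays the same.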

This definition of integration has nothing to do with networking. Sometimes integrated systems run on the same physical computers, and sometimes on different ones. Physical location matters to integration only in two ways: the connection method must work in either arrangement, and moving one system should not affect the other.

There are many technologies for connecting systems which do not provide this level of decoupling. These are appropriate when we do not need to change systems independently. The example above of the shared database may be perfectly acceptable, or may cause problems, depending on the context.

Next week I will describe how we need to define system boundaries so that we can decide whether to use proper integration techniques, or whether mere connection is appropriate.