1 May 2007

Long-lived systems: monitoring

By Andrew Clifford

If you do not monitor your systems, they will die.

Imagine you have the perfect system. It uses all the latest, viable technology. It is beautifully structured and thoroughly documented. It is decoupled from all other systems. Every time it is changed, the code is refactored to keep it clean, and the documentation updated.

Good though this would be, it is not enough. The system could die quickly.

Although everything within the system is being managed, you also have to look at the external forces on the system. How do you know your perfect system is not being undermined by technology shifts, changes in the law, new business strategies, and so on? You might carry out change perfectly, but how do you know when and what to change?

And of course, most systems are far from perfect. You have had to make tactical technology decisions, the systems are not well structured or decoupled, and the documentation is skimpy and out of date. Everything gradually breaks down as a system grows older. When should you do something about it, and rework the system?

You should change the system as soon as change is needed. But how do you know? In the real world, how do you set priorities for your limited maintenance budget?

You need to find out when and where change is needed, and to prioritise spend. This should be based on what you are trying to achieve: your IT objectives. You can measure your IT against your objectives to find the gaps. This shows where gradual decline is becoming critical. You can change the measures to reflect changing priorities, such as new technologies and new business strategies. This shows the impact of external forces.
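As a rough illustration of measuring against objectives, here is a minimal sketch in Python. Everything specific in it is an assumption made for the example: the 0 to 10 scale, the target values, the system names, and the functions gaps and rank_by_gap. It is one possible way to find the gaps, not a prescribed method.

```python
# Hypothetical objectives, scored 0 (poor) to 10 (ideal). The objective names
# echo the measures discussed in this article; the scale and the targets are
# assumptions made for this sketch.
TARGETS = {"ownership": 7, "decoupling": 6, "technology": 7, "design": 6}


def gaps(scores, targets):
    """Return the shortfall against each objective (0 where the target is met).

    An objective with no score is treated as scoring 0, so an unmeasured
    system shows the largest possible gap.
    """
    return {name: max(0, target - scores.get(name, 0))
            for name, target in targets.items()}


def rank_by_gap(systems, targets, weights=None):
    """Order systems by total weighted shortfall, worst first."""
    weights = weights or {}
    totals = {}
    for system, scores in systems.items():
        shortfall = gaps(scores, targets)
        totals[system] = sum(weights.get(name, 1) * gap
                             for name, gap in shortfall.items())
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three made-up systems.
systems = {
    "billing": {"ownership": 8, "decoupling": 3, "technology": 4, "design": 6},
    "payroll": {"ownership": 7, "decoupling": 6, "technology": 8, "design": 7},
    "crm": {"ownership": 5, "decoupling": 5, "technology": 7, "design": 4},
}

# With equal weights, billing shows the largest gap.
ranking = rank_by_gap(systems, TARGETS)

# If priorities change (here, design matters three times as much),
# the same scores give a different order, with crm now at the top.
reweighted = rank_by_gap(systems, TARGETS, weights={"design": 3})
```

Changing the targets or the weights, rather than the scores, is how a sketch like this would reflect a new technology or a new business strategy: the systems have not changed, but the gaps have.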

As well as showing you when and where to intervene, measurement directly encourages long-lived systems. Measurement makes you split your IT into separate measurable chunks, which makes it easier to manage. You can measure systems for the effectiveness of ownership, for decoupling, for technology viability, and for design. Measurement does not just say, "This needs fixing"; it can also say, "Do this and it won't need fixing again soon".

Measurement helps in many ways. It can make sure that you get outsourced systems back in better shape than when you handed them over. It lets you manage cross-system topics, such as security, so that you can follow through on investments and enforce policy. With a little care, you can even use measurement to estimate the return on your maintenance spend. You can show how spending money today will stop you spending more tomorrow.

Measurement lets you start from where you are. Few systems are perfect, but they are not all dreadful. Measurement shows which systems are OK, which need significant rework, and which are so bad they need to be abandoned. If you want to get organised, measurement is where to start.
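The three-way split described above can be sketched in a few lines of Python. The thresholds, the 0 to 10 scale, the system names and the classify function are all hypothetical, chosen only to show the idea; real cut-offs would depend on your own objectives.

```python
def classify(scores, rework_below=6.0, abandon_below=3.0):
    """Sort a system into 'ok', 'rework' or 'abandon' by its average score.

    scores maps measure names to values on an assumed 0 to 10 scale.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    if not scores:
        raise ValueError("a system needs at least one measure")
    average = sum(scores.values()) / len(scores)
    if average < abandon_below:
        return "abandon"
    if average < rework_below:
        return "rework"
    return "ok"


# Made-up measurements for three made-up systems.
portfolio = {
    "payroll": {"ownership": 7, "decoupling": 6, "technology": 8, "design": 7},
    "billing": {"ownership": 8, "decoupling": 3, "technology": 4, "design": 6},
    "ledger": {"ownership": 2, "decoupling": 1, "technology": 3, "design": 2},
}

triage = {name: classify(scores) for name, scores in portfolio.items()}
```

A simple average hides detail (one very weak measure can be masked by strong ones), so a fuller version might also flag any single measure that falls below a floor.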

System governance provides the measurement and monitoring that you need. It shows you when and where you need to change your IT to keep it forever young.