17 May 2011

The curse of metadata

By Andrew Clifford

Our search for a truly general-purpose application is frustrated by inherent problems with metadata.

Last week I covered how different types of software meet requirements for use as general-purpose applications. Some, such as Excel, cannot be used on a large scale. Others that can be used on a large scale, such as content management systems or reporting tools, are too specialised. Applications that are more flexible and can scale, such as Lotus Domino, are complicated internally and require specialist skills and tools to develop new solutions.

I think the reason it is hard to find a truly effective general-purpose application is metadata.

Metadata is data about data. It is useful to categorise how applications use metadata into four levels.

At level 1, applications use no metadata. The data that the application processes is hard coded into the programs.
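
As a minimal sketch of level 1, assume a fixed-width customer record. The layout, the field names and the `parse_customer` function are all invented for illustration; the point is only that the description of the data exists nowhere except in the program.

```python
def parse_customer(line):
    """Parse one fixed-width record; the layout is hard coded here and nowhere else."""
    # Assumed layout: characters 0-4 hold the id, characters 5-24 hold the name.
    return {"id": int(line[0:5]), "name": line[5:25].strip()}


record = parse_customer("00042Ada Lovelace        ")
```

If the record layout changes, every program that reads it has to change too.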

At level 2, the system software that supports the application holds metadata. For example, databases maintain a description of the data independently of the programs that access it. This makes the data more flexible. The data definitions can be modified without changing the application. It is much easier to use the data for multiple purposes, such as using it for reporting.
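
One way to picture level 2 is a program that asks the database for its description of the data instead of carrying its own. The sketch below uses SQLite from the Python standard library. The `customer` table and the `describe` and `report` helpers are hypothetical names chosen for the example.

```python
import sqlite3


def describe(conn, table):
    """Return the column names the database itself records for a table."""
    # PRAGMA table_info yields one row per column; the name is the second field.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def report(conn, table):
    """Generic report that relies on the database's metadata, not hard-coded fields."""
    columns = describe(conn, table)
    return [dict(zip(columns, row)) for row in conn.execute(f"SELECT * FROM {table}")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")
before = report(conn, "customer")

# The data definition changes, but the reporting code does not.
conn.execute("ALTER TABLE customer ADD COLUMN country TEXT")
after = report(conn, "customer")
```

The same `report` function picks up the new `country` column without modification, which is the kind of flexibility and reuse described above.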

At level 3, the application itself holds metadata for some data. For example, a web content management system would not hard code that a web page is made up of a header, content, left column and footer, but would allow this to be defined in a template or page definition. This is then metadata, allowing the content management system to manage content for different types of web page.
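
The content management example could be sketched roughly as follows. This does not reflect how any particular product works; `PAGE_TYPES` and `render` are made-up names, and the output format is deliberately crude.

```python
# Hypothetical page definitions: this dictionary is the application's own metadata.
PAGE_TYPES = {
    "article": ["header", "left column", "content", "footer"],
    "landing": ["header", "content", "footer"],
}


def render(page_type, content):
    """Assemble a page from whichever regions its definition lists."""
    regions = PAGE_TYPES[page_type]
    return "\n".join(f"[{region}] {content.get(region, '')}" for region in regions)


# A new type of page is a change to the metadata, not to the program.
PAGE_TYPES["popup"] = ["content"]
page = render("popup", {"content": "Hello"})
```

Note that only the list of regions is flexible here. How a region is rendered is still fixed in the code.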

At level 4, the application can query and access all the objects from which it is built, effectively exposing the entire application as metadata. This means that you can modify what the application does, either manually or programmatically.
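
A toy illustration of level 4, under the assumption that the whole application is held in a single registry of objects that can be listed and replaced while it runs. The `App` class and its `objects` registry are invented for this sketch, and a real application of this kind would be far more elaborate.

```python
class App:
    """Toy application whose every part lives in one queryable registry."""

    def __init__(self):
        self.objects = {
            "greeting": "Hello",
            "format": lambda app, name: f"{app.objects['greeting']}, {name}",
        }

    def describe(self):
        # The application can list everything it is built from.
        return {name: type(obj).__name__ for name, obj in self.objects.items()}

    def run(self, name):
        return self.objects["format"](self, name)


app = App()
first = app.run("world")

# Behaviour, not just data, is changed by editing the registry.
app.objects["format"] = lambda app, name: f"{name.upper()}!"
second = app.run("world")
```

Here `first` is "Hello, world" and `second` is "WORLD!": the same application does something different because one of its own objects was replaced.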

It is easy to see how level 2 is an improvement on level 1. It makes the application much easier to maintain and develop.

Level 3 is a natural progression from level 2, adding some metadata flexibility to the application. But there are drawbacks. The only bits that are flexible are the bits that use metadata: you are still constrained by what the designer has built into the application. The extra layers of indirection can also affect performance.

Level 4 seems to be the way to go. It is, for example, the way that Excel is structured. You can build any solution within the application. But this flexibility comes at a cost. The designer needs to have thought through every type of thing that could be done in the application. The underlying model of the application will be very complicated, which makes it hard to implement on a complicated architecture such as a multi-user web application. The in-depth use of the application is very different from normal use, and requires different skills and tools, which undermines its suitability as a general-purpose application.

Attempts to make applications general-purpose inevitably hit limitations and complexity that stop them being truly general-purpose. But I think there might be another way, which would allow a simple, effective and truly general-purpose application that can be configured for multiple purposes. I will cover this next week.