
Open source and Public Funds
(A different reason for the benefits of open source:
How to promote public, collaborative, mutually informative problem solving.)
Aaron Sloman
Last updated: 23 Jul 2009; 4 Feb 2010; 19 Jun 2010
Installed: 24 Jan 2009

Note Added 26 Sep 2009:
I have just discovered that the UK Tory party has been reported as having recognised the importance of the idea proposed below, and also in my discussion of the flaws in the National Health Service IT project and how to avoid them, posted on this web site in 2006.

Their policy is reported here:

http://www.channelregister.co.uk/2009/01/27/tory_linux_push/
Posted in Enterprise, 27th January 2009 15:46 GMT
The Tory party will if elected end government over-spending on IT projects by simply choosing open source alternatives and splitting projects up, it believes.
Did they read my notes, or independently reach the same conclusion?
Hmmm. Does that make the Tories the only intelligent party in the UK???
Amazing thought!

Background

Some time ago I reacted to news reports and email discussions among Heads and Professors of Computer Science concerning the disastrous national IT project for the National Health Service, whose problems some computer scientists thought (and probably still think) could be significantly reduced by using more powerful tools, including rigorous testing methodologies.

Without disputing the claim that some improvement could be produced by using better tools and engineering design methodologies, I argued that the problems were much deeper and were concerned with the near impossibility of determining requirements in advance for such a large and complex project.

I produced a large web site presenting a variety of theoretical and empirical arguments, emphasising the difficulty of finding out the requirements for complex systems prior to building them. That is not a problem that can be solved by starting with formal methods and tools.

A related discussion occurred on a computing mailing list early in 2009, and I tried to make the point again, but much more briefly (though it is still too long). I present the summary argument below. Later I'll add some qualifications and clarifications.

New Presentation of the Key Ideas

Microsoft has improved a lot in recent years

As someone who has always disliked MSWindows (I'll refrain from
listing my gripes here but see this) but who has to interact with it
because my wife uses a wonderful orienteering map-making package (OCAD)
that runs only on Windows, I have noticed a huge improvement in
stability/reliability between the years when she was using
Windows 2000 and her use of XP over the last couple of years. So
whoever brought about those improvements has to be congratulated.

Microsoft certainly took note when customers complained about
security problems as the number of PCs connected to the Internet
grew.

Actually some critics predicted the problems in advance, but at that
stage it seems Microsoft paid no attention. They have now fortunately
changed significantly for the better.

E.g. when my wife's machine became infected with some nasty malware
a few weeks ago, I was able to use a free Microsoft service here

    http://onecare.live.com/

as part of the disinfection process. (I think it required the use of
two other tools as well, though I was struggling with no real
knowledge of how MSWindows works so I don't really know what finally
removed all the symptoms.)

She certainly finds that XP is much more robust than the Windows
2000 system she had previously used.

Technical quality is not enough
But it's one thing to improve the technical quality of software,
including security and dependability: the question of 'fitness for
purpose' is a separate issue.

This is terribly important because fitness has many levels and many
aspects and some of the most important aspects cannot be formalised,
and many of those that can are simply unknown at the beginning of a
big project because of the complexity of the problems, the novelty
of what is being attempted, the variety and number of users, the
features of the social systems, equipment and institutional cultures
that already exist, and the diversity of opinions and preferences.

So even if a government agency came up with a very detailed and
rigorously defined set of requirements and even if a large computing
company produced a system that provably conformed to that
requirements specification, that would not establish fitness for
purpose, because the requirements specification may have failed to
capture the intended, more abstract, requirements (which may even have
been inconsistent, or unattainable for other reasons).

This is a terribly important point that I think academic computer
scientists prefer not to think about.

The point can be made in many ways, the starkest being that
rigorously proving that some system meets its requirements
specification says absolutely nothing about the quality of the
requirements specification (apart from the laudable precision
that makes a formal proof possible).

The specification may be seriously unfit for the ultimate purpose,
or collection of purposes of the project.
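
To make the point concrete, here is a deliberately trivial sketch of my
own (in Python, with invented names, not taken from any real system):
the function below provably satisfies the specification stated in its
comment, yet if the real, unstated, purpose was to order patients by
clinical urgency, the proof is worth nothing.

    # Hypothetical illustration: provably 'correct to spec', yet possibly
    # unfit for purpose.

    def waiting_list(patients):
        # Specification: return the patients sorted alphabetically by
        # surname. Easy to state precisely, and easy to verify -- but
        # whether alphabetical order is what a hospital actually needs
        # is a completely separate question.
        return sorted(patients, key=lambda p: p["surname"])

    patients = [
        {"surname": "Young", "urgency": "critical"},
        {"surname": "Adams", "urgency": "minor"},
    ]
    # Meets the specification exactly: Adams is listed before Young,
    # whatever their clinical urgency.
    print(waiting_list(patients))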

I suspect that that is the main flaw in the UK NHS IT project, and
the tools offered by computer scientists will not, as far as I know,
help to remedy that particular flaw. They may help to remove other
flaws, e.g. frequency of crashing, or failure to do what a
government agency specified. But that does not mean the system will
therefore serve the interests of all the relevant subsections of the
community: patients, doctors, nurses, patients' relatives, ambulance
drivers, paramedics, hospital managers, etc.

When computing systems have to interact in depth with other bits of
the universe, whether they are chemical plants, weather systems,
complex flying machines, a national air traffic system, a hospital,
a school, a social service, an epidemic, a railway system, or human
brains, it is totally impossible to come up with any demonstrably
correct requirements analysis for a system that may only come into
operation several years later and is expected to go on being used
for years after that.

The only way to deal with the unattainable requirements specification
issue for large projects, especially projects of national
importance, is to accept from the beginning that
    the process of design and implementation is a significant part of
    the process of finding out what the requirements are.
The implications of that are very deep.

The need for open, collaborative research and problem-solving
In general, that kind of research can no more be done effectively by a
single research team than can research on hard problems in the sciences
and social sciences.

That requires not one large monolithic project with a strict
specification worked out in advance, but a lot of experiments done
in parallel, to find out what the needs are, what the unexpected
behaviours of various bits of the universe are like, and which sorts
of designs (if any) actually work well in which sorts of contexts,
where working well may itself be something for which standards
change as users learn about what is possible, and discover through
experience what they like and dislike, and as highly creative
criminals and mischief-makers discover new opportunities for their
activities.

The development of the internet (warts and all) is an existence
proof that this sort of anarchic process can produce an amazingly
complex, powerful, useful, albeit flawed, system. Many of the flaws
are not flaws in the system, but in a subset of humans who use it.

People don't know in advance what they will want
When the early processes of development of the technology that made
the internet possible started nearly 40 years ago, it would have
been impossible to devise a set of requirements for the internet as
it actually developed. People don't know in advance what they will
want when it becomes available.

As someone who has developed user software on a small scale I
sometimes found that asking people in advance what they would like
produced incorrect information about what they really would like
when faced with it. Sometimes my guesses about their preferences
proved better than theirs! Mockups can help, but static mockups are
no substitute for experience with a prototype.

Knowledge distribution and intellectual property
If the recommendation to replace large monolithic product
development processes, wherever possible, with many smaller scale
exploratory problem-solving processes running in parallel is
accepted, that raises another problem: knowledge distribution.
Things discovered in one experiment need to be made available for
use by other experimenters -- and their end users, to maximise the
benefits of new knowledge for everyone.

This development process requires mechanisms for knowledge transfer
so that what is learnt in different places can spread to where it is
needed, including lessons about what doesn't work, and what some of
the consequences of failure can be.

The rapid growth of the internet after the basic technology had been
developed would have been impossible but for very public sharing of
ideas and solutions and rapid testing by people who were not the
original developers, including testing of modifications of
modifications, etc., and spinning off rival systems that may be far
superior to heroically produced early versions. (Remember the first
browsers?)

Compare the growth and spread of the programming language PHP.

That means, I think, that whatever corporations do in their own internally
funded projects, all public funding for systems development, as for
scientific research, must impose an obligation to make the results
freely available to all, including other developers, and including testers
in non-computing disciplines who are interested in the applications -- not
just testers recruited and paid by the original developers or the original
funders.

Every system will have some flaws
Among the many unforeseen problems of development of the internet
are the growth of spam, and the new opportunities for criminal
activities of many kinds.

Dealing with the problems that arise from deployment of a new system
need not always involve changing the system.

It is not primarily the fault of a river if someone uses it to poison
the fish caught downstream by a rival. But that only means that the
design of the system needs to be linked to adjunct services as new
problems, starting outside the service, turn up (e.g. river patrols,
and water testing may have to be added).

NOTE:
    The BBC Radio 4 Analysis Programme on Inspiring Green Innovation
    On Monday 6 Jul and Sunday 12 Jul 2009, the BBC Radio 4 Analysis
    programme broadcast an investigation into alternative ways of
    stimulating new development in energy-producing and energy-using
    technologies required to avert the looming environmental crisis.
    http://www.bbc.co.uk/programmes/b00lg8hg
    "Inspiring Green Innovation"

    I was interested to notice that they reached conclusions very
    similar to the conclusions I had reached regarding major IT
    developments, namely that there should not be monolithic centrally
    funded projects but many different shorter projects run in parallel,
    with the possibility of learning from them and terminating the
    unsuccessful ones.
    They did not mention two points I had stressed:

        (a) The need to ensure that contracts do not allow the companies
        employed to retain intellectual property developed with government
        funding: the results of both good and bad experiments must be
        available to all, in order that maximum benefit can be extracted from
        them.
        (b) The development of the internet between the early 1970s and the
        end of the century illustrates all my points.

How will this affect costs?

This may mean taxpayers paying more for what is actually developed
in order to make the intellectual property public, but paying for it
in much smaller chunks, so that early results are freely available
for others to try to use and improve on, including others who may
wish to invest in developing improved versions without public
funding, in order to provide commercial products or services on the
basis of those improvements. (The form of licence should permit
this.) So the wide availability of early and intermediate results
will be of enormous public value (assuming this is not a project
that absolutely has to be kept secret, e.g. because of national
security issues). The value gained will include:

(a) other developers not having to reinvent the good ideas in order
to provide a useful starting point for some new good ideas that
did not occur to the original contractor;

(b) errors and failures resulting from mistaken assumptions in the
original requirements specification, and other errors and failures,
can feed valuable information into future investigations, so that
they avoid those mistakes (both design mistakes and the adoption of
mistaken requirements/targets);

(c) perhaps most importantly, if the contracts are relatively short
term and results are open, this gives governments the option to
switch future contracts to developers with excellent new ideas,
instead of being stuck with the original developer whose
impoverished ideas have not been adequately exposed.

In short, the higher expenditure on early prototypes, in order to
keep intellectual property in the public domain, may be more than
offset by both lower costs in later developments (because errors are
not repeated) and much higher benefits achieved because of the
regular opportunities to switch attention, and funding, to new
developers with new promising ideas.

Someone may like to prove a conjectured theorem: the costs of
particular pieces of publicly-funded technology developed on this
model will usually be higher than the costs of the same technology
developed in a conventional monolithic project (because the IP will
need to be paid for), but the total value for money will be much
greater in the long run, because there will be far less expenditure
on very large and inadequate systems, and the social benefit of the
good small open products will be far greater for the whole community
than the benefit of similarly good items forming part of a closed
monolithic product.
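
One way of stating the conjecture slightly more formally (the notation
is purely illustrative and is not part of the original argument): let
c_i and b_i be the cost and the realised social benefit of the i-th
publicly funded component, under the open and closed models
respectively. The conjecture is then:

    \[
       c_i^{\mathrm{open}} \;\ge\; c_i^{\mathrm{closed}} \quad\text{for each } i,
       \qquad\text{yet}\qquad
       \frac{\sum_i b_i^{\mathrm{open}}}{\sum_i c_i^{\mathrm{open}}}
       \;>\;
       \frac{\sum_i b_i^{\mathrm{closed}}}{\sum_i c_i^{\mathrm{closed}}}
    \]

because the open model avoids repeated spending on huge systems that
fail, and allows every good component to be reused across the whole
community.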

Moreover, if production of usable freely available documentation is
part of the contract, suppliers will not be able to save money by
skimping on documentation in a way that may go undetected
internally, but can lead to serious problems later on, e.g.
difficulties of maintenance.

In part the problem is management of expectations

If the IT companies are bidding for large long term contracts they
are tempted to make promises that nobody could possibly keep because
nobody knows enough about the problems and the requirements at that
early stage.

If all they are contracted to do during early phases of a project is
to do exploratory work and produce public code and documentation, and
reports on testing, they will not need to raise false long term
hopes, and any promises they fail to keep will become visible at an
early stage.

Computer scientists don't like to think about all these issues: it
is much more intellectually exciting to be able to represent a
problem formally, solve it, and prove formally that it has been
solved. But acknowledging, as some experienced software engineers
do, that that is not a complete fix, and in some cases may be only a
small part of the real problem, requires changes in the way claims
are presented by the research community about how to deal with the
problems of public procurement.

One of the causes of the high quality of the OCAD package mentioned
above was that its main developer (recently deceased, alas) was also
a user of all the different services it provided: help with doing
surveys to get map data, reading in and aligning/undistorting
sketches and aerial photographs as bitmaps to provide background to
the map under development, creating and editing maps of various
sorts, producing different competitive courses based on the same
map, and printing out the information to be given to planners,
controllers, and competitors for each course, printing the maps in
different ways for different sorts of orienteering events, etc., and
finally using a map while running a course. The main designer had
deep 'user' knowledge of all the different uses: I believe he was a
map surveyor, map maker, course planner, orienteering competitor,
etc., as well as being a software engineer.

It is rare that users have the skills and knowledge to be
developers, so alternative ways of incorporating user expertise need
to be developed, and that requires computer scientists interested in
developing complex applied systems to acquire deep knowledge of and
work closely with experts in other fields -- physics, chemical
engineering, mechanical engineering, biology, aerodynamics,
meteorology, various kinds of manufacturing process, medicine, human
psychology, hospital management, primary school teaching, or
whatever. (Even philosophers in some cases, e.g. where ethical
issues are involved or where new uses require old concepts to be
clarified and modified.)

Governments and procurement agencies have to change.

More importantly, it requires major changes in government policies,
the ways publicly funded developers operate and their expectations,
the scale of projects funded at national level, and especially
replacement of an ethos of commercial competition with one of open,
cooperative (while competitive), problem solving for the general
benefit, possibly without ever selecting a single global solution
for any of the major problems, since that can stifle further
learning.

[Although standards and deep integration can be a good thing,
diversity and the freedom to assemble components from different
sources can also make things harder for hackers and criminals, as
well as providing more seeds for future development. Has anyone
analysed the tradeoffs between the benefits of uniformity and the
costs of vulnerability and rigidity? Compare diversity in a gene
pool.]

Occasionally there are public administrators who understand the need
for flexibility. Around 2000 our department had a large grant to
enable us to acquire a multi-CPU linux computer grid to support our
research. The procedures stipulated by EPSRC for selecting a
supplier required us to specify what we wanted and then find out who
could supply it at the cheapest price. We attempted to do this and
then found that the tenders offered were not comparable because the
different suppliers made different guesses as to how much money we
had available, and how we wanted to divide it between different
pieces of technology available, and also tried to second-guess some
of the things that would impress us about their products. I realised
that the procedure was badly broken, and asked EPSRC for permission
to change the process, by telling all bidders exactly how much
money we had available, and the sorts of things we wanted to do, and
then inviting them to specify what they could provide for that sum
of money.

At first there was strong opposition to 'breaking the rules' but
fortunately there was an intelligent person at EPSRC who decided to
take the risk of giving us permission to go ahead. The result was
that we had much clearer offerings from the suppliers, including
suggestions for different ways of spending that sum of money on
different combinations of their products and services. It was then
much easier for us to take a sensible decision about how to proceed,
and we ended up with a grid that provided an excellent service for
several years.

I hope similar intelligence and flexibility exists in other parts of
the civil service and government. Such flexibility could lead to
major, highly beneficial, changes in large scale procurement
procedures.

Even Microsoft ??

I have the impression that even Microsoft is beginning to understand
the importance of open, shared, problem-solving and learning, and is
gingerly(?) moving in that direction:

E.g. see:

    http://www.microsoft.com/opensource/
    http://www.phpclasses.org/blog/post/85-What-is-Microsoft-up-to-with-PHP.html
    http://port25.technet.com/archive/2008/12/17/how-microsoft-will-support-odf.aspx

This seems to me (as an outsider with little specialist knowledge)
to indicate a seismic shift in culture at Microsoft.

[Note Added 23 Jul 2009
    See http://resources.zdnet.co.uk/articles/comment/0,1000002985,39689353,00.htm
    Microsoft's magnificent seven open-source options

    Rupert Goodwins ZDNet.co.uk

    Published: 22 Jul 2009 15:00 BST

    "Now Microsoft has officially decided that the GPL is a good thing
    and is using it to release code for Linux, it's time for the
    software company to take advantage of the many good things that
    being a member of the open-source club brings. It's not quite the
    Berlin Wall coming down - not yet - but reunification may be on the
    cards....."
]

It may be too little too late, but the long term benefits, if
continued, could ultimately exceed all the good done by the Gates
Foundation! It will lead to the intellectual fragmentation of
Microsoft, and that can only be a good thing if the fragmentation
produces multiple sub-communities within Microsoft engaging
fruitfully with multiple external communities, for mutual benefit.

No doubt other big companies that have taken steps in this direction
will continue to do so.

Perhaps the process will be facilitated by shareholders having
learnt the hard way recently that trying always to maximise short
term profits may not be the best long term policy, as that causes
the system to create imaginary profits whose unreality will sooner
or later be exposed.

Can governments and the civil service be educated too?

Appendix: A Conjecture about Microsoft

Some time ago Apple realised that continuing to extend their own operating system
was not sensible, because it was never initially designed for multi-user
computers linked in networks, where security and other concerns had to be built
into the system from the bottom levels.

So they gave up their own operating system, took on an open source version of
Unix (BSD, very like linux), gave it a new name, and tried to help their
customers through the painful process of conversion, in part by emulation.

Microsoft has employed large numbers of highly intelligent people. They are no
longer encumbered by the high priest who would resist major changes analogous to
the Apple strategy. Perhaps they too have learnt the great benefits of open
source, namely allowing multiple experiments to be pursued in parallel with
results being publicly comparable so that rapid learning is possible, and have
also learnt the benefits of an operating system that from the start was designed
for multiple users (we used Unix as a multi-user operating system on a DEC
PDP11/40 from about 1975 at Sussex University -- though in some respects it was
not then, and still is not, as good as some other multi-user operating systems
designed for mainframes and minicomputers, e.g. ICL's George 4 and DEC's VMS).

If they are moving towards a new more functional, more maintainable, more
extendable design with major differences from their legacy Windows operating
systems, then all the above experiments engaging with open source development
communities could be part of a strategic plan to develop a much better new
operating system.

How will they then deal with users who don't want to give up their own software
or wait for new implementations?

Simple: let them do what many linux users do now: they run linux on a powerful
but inexpensive PC with a lot of core memory and disc space, and they run
Windows software in a virtual machine that provides the functionality
required (e.g. VMware), or via a compatibility layer such as Wine, or some
other system.

If linux users can do that now, and Microsoft know it, then I am sure they are
intelligent enough to see the huge potential strategic developments in the long
term if they follow the same strategy.

Will they too develop a variant of Unix? Maybe they would prefer to go back to
one of the more sophisticated multi-user operating systems, e.g. supporting
multiple privileges (unlike Linux/Unix systems). Perhaps VMS, which is now
OpenVMS?

But there is much less know-how among world-wide software development
communities for them to build on if they took that route -- because it is
not nearly as widely used as linux/unix and their variants.

See also: Use Free Open Access Journals

Partial index of my 'misc' pages.


Maintained by Aaron Sloman
School of Computer Science
The University of Birmingham