
The Reality of Computation
(Information Processing)
Computational thinking about how minds work.

(DRAFT Liable to change DRAFT)

Aaron Sloman
School of Computer Science, University of Birmingham.

Installed: 10 Oct 2014
Last updated: XXX
This paper is
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/comp-reality.html
A PDF version may be added later.

A partial index of discussion notes is in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/AREADME.html

The Reality of Computation

I often meet philosophers who try to assess the status of an explanation of the behaviour of something in terms of computational mechanisms (or information processing mechanisms) that it uses. Often this leads to an attempt to define generic criteria for deciding whether such explanations are true.

The attempts I have encountered, of that sort, do not work, and worse, can lead to misleading conclusions. A famous example is Dennett's "intentional stance", which I think was deeply influenced by the ideas of his supervisor, Gilbert Ryle (from whom I also learnt much, though he wasn't my supervisor); but there are other cases with different intellectual histories, e.g. various forms of empiricism. I'll return to Dennett later.

Benefits of software engineering for philosophy
I shall try to show how a philosophical strategy that pays close attention to the processes involved in creating (designing, implementing, testing, debugging, modifying, extending, deploying and explaining) human-made (non-biological) information processing systems can provide an alternative way of explaining the status of information processing (computation), one that gives much deeper insight than any generic philosophical formula for truth or reality could. For example, there is no need to assume that information processing systems with mental contents (percepts, qualia, desires, intentions, beliefs, plans) are rational, as stipulated for the intentional stance (in some formulations).

Although my work is primarily philosophy (including philosophy of mind, philosophy of language, philosophy of mathematics, philosophy of biology, and more generally philosophy of science), I have had the privilege over more than four decades of being involved in and making contributions to software development work in Artificial Intelligence. (Most other forms of intensive software development would have served the same purpose in this context.) As a result, many philosophical questions are transformed, including questions about the nature of computation (or, as I would prefer to call it, the nature of information processing), what sort of existence it has, how it is related to processes in physical mechanisms, and what sorts of causal powers information processing phenomena can have.

From the standpoint of one who is involved in designing, implementing, testing, debugging, extending and using such systems, questions about reality, causality, information contents, and the truth of explanations or descriptions look very different from how they appear to philosophers who raise such questions in seminar presentations or publications without that background of experience.

Reflections of a designer and user of software tools
Perhaps the most illuminating insights come from cases where a complex program does not work as expected, and sophisticated development tools are available to help with debugging. In that context, a very high proportion of questions about what is happening, what causes what, what information has been acquired, and how it is being used take on a kind of life that is invisible in the context of abstract philosophical discussions.

E.g. if I think something is happening because the program has not found a particular edge feature in an image, I may be able to make the program pause at the appropriate stage while running and interrogate the data-structures it has created, to see which edges have been found.
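To make that concrete, here is a minimal sketch in Python (a hypothetical EdgeDetector class, not any particular toolkit) of the kind of interrogation involved: the running program keeps its intermediate results in inspectable data-structures, and the developer pauses it and queries them.

    # Minimal hypothetical sketch: a vision stage that records the edges it has
    # found in a data-structure that can be interrogated while the program runs.
    from dataclasses import dataclass

    @dataclass
    class Edge:
        x0: float
        y0: float
        x1: float
        y1: float
        strength: float

    class EdgeDetector:
        def __init__(self):
            self.found_edges = []              # intermediate results kept for inspection

        def detect(self, image):
            # Real detection omitted; imagine edges being appended here as they are found.
            self.found_edges.append(Edge(10, 12, 40, 12, strength=0.8))
            return self.found_edges

    detector = EdgeDetector()
    detector.detect(image=None)                # placeholder image

    # Pause the running program here and interrogate its data-structures, e.g.
    # "which edges have been found near the top of the image?"
    import pdb; pdb.set_trace()
    # At the debugger prompt: [e for e in detector.found_edges if e.y0 < 20]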

[This can't be done with ancient debugging tools that allow only interrogation of values of programming variables. The debugging task then becomes very different, much more complex, and much more difficult -- or even impossible.

There are different difficulties with programs implementing neural nets and other self-organising systems where the intermediate states are sometimes extremely opaque.]

If I find that the edge feature has been found, I may try to find the cause of the unexpected behaviour in a different way: has it been grouped with the wrong set of edges to form a candidate feature?

In that case the program's grouping criteria may contain a simple arithmetical or logical error, or the design may turn out to be wrong in a deeper way: e.g. perhaps the grouping mechanism does not take account of a large enough context. Other things already discovered in that region of the image may form part of the context that should be used to decide on the grouping, but the program ignores them, either because the programmer provided an inadequate rule, or because the program's training was not effective, whether because it was not given sufficiently varied data or because the learning mechanism is too simple.
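As a toy illustration (with invented criteria, not taken from any real vision system), here are two grouping rules in Python: a purely local one, where a flaw would be a simple numerical slip, and one that also consults nearby edges, where a flaw would be the deeper kind of design error just described.

    # Toy illustration with invented criteria: a purely local grouping rule versus
    # one that also takes account of other edges already found in the region.
    import math
    from dataclasses import dataclass

    @dataclass
    class Edge:
        x0: float
        y0: float
        x1: float
        y1: float
        def start(self): return (self.x0, self.y0)
        def end(self): return (self.x1, self.y1)
        def orientation(self): return math.atan2(self.y1 - self.y0, self.x1 - self.x0)

    def gap(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def group_locally(edge, candidate_end, candidate_orientation, max_gap=2.0):
        """Purely local rule: join the edge if the gap to the candidate is small.
        A 'simple arithmetical or logical error' here might be measuring from the
        wrong endpoint, or using the wrong comparison operator."""
        return gap(edge.start(), candidate_end) <= max_gap

    def group_in_context(edge, candidate_end, candidate_orientation, nearby_edges,
                         max_gap=2.0, max_turn=0.3):
        """Context-sensitive rule: refuse the grouping if some other nearby edge
        continues the candidate's direction better, however small the gap is."""
        if gap(edge.start(), candidate_end) > max_gap:
            return False
        def turn(e):
            return abs(e.orientation() - candidate_orientation)
        better = [e for e in nearby_edges
                  if gap(e.start(), candidate_end) <= max_gap and turn(e) < turn(edge)]
        return not better and turn(edge) <= max_turn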

Those are distinct alternatives, which can be explored by both testing on different images and altering the program design. In some cases it may even be possible to establish what went wrong by explicit reasoning about the design.

That's a bit like discovering a flaw in a mathematical proof and fixing it.

[That's the most satisfying case for a designer, obviously.]

A deeper source of error
Another possibility is that my theory about how the edge should be grouped at that stage turns out to have been wrong, because not enough cases had been anticipated at the design stage. (Something similar could happen during evolution of an information processing mechanism, though 'anticipation' would then be the wrong word.)

I might discover the error by 'fixing the bug' in accordance with my initial hypothesis, and then discovering that something much more important sometimes goes wrong as a consequence, later on.

That could lead to the hypothesis that low-level human visual mechanisms also don't use my original criteria for grouping, but perhaps use some more global context at a later stage in the processing. Then the program architecture has to be revised and tested in a wider variety of contexts.

[Marr at one stage claimed that designers of vision systems could avoid such mixtures of bottom-up and top-down processing, but I think he was seriously mistaken about how natural vision systems work, because he underestimated the challenges. Unfortunately, for a while many researchers believed him, despite what had already been learnt in the AI community before Marr joined it from neuroscience, where he had done outstanding work.]

Evaluating ontologies and theories about an information processing system.
How can a programmer be sure that the theoretical language used is appropriate for formulating questions, hypotheses, proposed solutions, etc.?

Because the language (more precisely the ontology) is part of a theory that has been found to be as successful in this domain of science and engineering as the language of current, voltage, resistance, inductance, etc. is in dealing with electrical circuits.

It's nothing like a theological ontology of angels, devils, the causal power of prayer, etc.

But being appropriate doesn't mean being right: the language available for describing complex information processing systems has developed over the last six decades as researchers and developers discovered new features of the space of problems and solutions.

Early programmers knew nothing about the forms of programming that began to be developed in the late 1960s and were unfortunately named "Object Oriented" programming. Notions like "inheritance", and later "multiple inheritance", became important, along with related notions such as "parametric polymorphism". Inheritance here is not a temporal or causal relation, but a mathematical relation.

For example, the integers (or the reals) with the addition operator form a mathematical group, and so do spatial rotations of 3-D objects with the operation of combining rotations.

Addition of numbers and combination of rotations can be reversed, and in each case there's a special entity called an "identity" in the group, which is 0 for addition of numbers and the 'null' rotation for the group of rotations. I.e. adding 0 to any number produces the same number and combining the null rotation with any other rotation produces the original rotation.

So both numbers and spatial rotations inherit properties of mathematical groups, and any theorems discovered in group theory will apply to both sets of entities. These similarities can be used both in designing programs to perform addition and in designing programs to combine rotations.
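A minimal sketch in Python (invented class names, assuming nothing beyond the standard library) of that kind of mathematical inheritance: code written against an abstract Group interface works unchanged for both addition of numbers and composition of 3-D rotations.

    # Minimal sketch of mathematical inheritance: addition of numbers and
    # composition of 3-D rotations both instantiate the abstract notion of a group.
    import math
    from abc import ABC, abstractmethod

    class Group(ABC):
        @abstractmethod
        def combine(self, a, b): ...
        @abstractmethod
        def identity(self): ...
        @abstractmethod
        def inverse(self, a): ...

        def undo(self, a, b):
            # Generic operation every group inherits: applying a and then its
            # inverse leaves b unchanged.
            return self.combine(self.inverse(a), self.combine(a, b))

    class AdditiveNumbers(Group):
        def combine(self, a, b): return a + b
        def identity(self): return 0
        def inverse(self, a): return -a

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

    class Rotations3D(Group):
        """Rotations represented as 3x3 orthogonal matrices."""
        def combine(self, a, b): return matmul(a, b)
        def identity(self): return [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
        def inverse(self, a): return [[a[j][i] for j in range(3)] for i in range(3)]  # transpose

    # The same generic code applies to both instances:
    print(AdditiveNumbers().undo(5, 7))                      # 7
    c, s = math.cos(0.3), math.sin(0.3)
    rz = [[c, -s, 0], [s, c, 0], [0, 0, 1]]                  # rotation about the z axis
    print(Rotations3D().undo(rz, Rotations3D().identity()))  # identity matrix, up to rounding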

But the two cases also differ from each other insofar as the set of numbers and the set of 3-D rotations have different structures. E.g. numbers form a set with a linear order, whereas rotations don't, unless restricted to rotations about the same axis.

Similarly different operations on various sorts of information structures may share properties inherited from generic classes of structures and operations, while differing in other respects.

Learning to build complex programs in a disciplined way involves learning to combine previously developed programming techniques where possible. (This is an important aspect of "re-use" of software that can be used to speed up development and reduce programming flaws.)

It is at least possible that biological evolution (blindly) "discovered" and used this powerful design technique long before we, its products, did. That's one of the themes being explored as part of the Meta-Morphogenesis project, as illustrated here.

As a result, a running program could simultaneously be an instance of several different programming classes instantiated via parameters that significantly alter what the programs with the same ancestors can do. Two entities with apparently similar structure in the same (computational) environment may react differently in ways that depend on their (mathematical, logical) inheritance, i.e. which classes they instantiate, rather than which objects were their ancestors (which would be temporal inheritance).

In cases like that, explaining what's going on, what works, what doesn't work, and fixing bugs, requires thinking about the types of (mathematical) abstractions instantiated, and how they could be changed. Designers doing this do not necessarily realise that they are thinking mathematically, and they may have to discover/invent new mathematical domains in the process, often unwittingly. It isn't always the case that some piece of text-book mathematics will meet the requirements of a new design problem. That's one of the reasons why good programmers require high levels of creativity (which are not easily produced by teaching).
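A toy sketch of that point, with invented class names: two agents holding identical data react differently in the same environment because of which abstractions they instantiate, and a single class parameter alters what descendants of the same ancestors can do.

    # Toy sketch: behaviour determined by which classes are instantiated (and with
    # what parameters), not by which objects were the entity's ancestors.
    class Agent:
        def __init__(self, position):
            self.position = position

    class Cautious:
        stopping_distance = 5                     # class parameter; subclasses may alter it
        def react(self, obstacle_distance):
            return "stop" if obstacle_distance < self.stopping_distance else "advance"

    class Reckless:
        def react(self, obstacle_distance):
            return "advance"

    class CautiousAgent(Agent, Cautious): pass
    class RecklessAgent(Agent, Reckless): pass
    class VeryCautiousAgent(CautiousAgent):
        stopping_distance = 20                    # same ancestors, different parameter

    a, b, c = CautiousAgent((0, 0)), RecklessAgent((0, 0)), VeryCautiousAgent((0, 0))
    print(a.react(10), b.react(10), c.react(10))  # advance advance stop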

For these reasons, finding and fixing flaws in a faulty computer program can be very different from thinking about how components in a causal chain, or sequences of programming instructions, could be revised. It is also utterly different from thinking about overlaps, subsumption relations, and transitions between "possible worlds", as would be required according to some philosophical theories of causation. [REF] A scientist or engineer can deal with possible variations in a tiny portion of this world, and make progress, rather than considering alternative complete worlds. (A more detailed criticism of possible world semantics may be added later in a separate document.)

Similar complexities occur in other forms of engineering and architectural design, where two objects that are superficially similar and are similarly located in a larger structure may instantiate different design patterns in some aspect of their structure or their mechanisms. (Some invasive organisms damage their hosts because of this.)

So perhaps these ideas about the design of complex information processing systems will turn out to be equally relevant to understanding the products of that great blind mathematician, architect and designer, biological evolution.

It's a great pity so few philosophers, psychologists and neuroscientists learn about the diversity of design methodologies, concepts and tools relevant to understanding information-processing systems in their degree courses. (Who will teach the teachers?)

NOTE:
I have a toy tutorial introduction to a subset of the ideas about "object oriented" programming here, for anyone who wishes to start learning more: http://www.cs.bham.ac.uk/research/projects/poplog/teach/oop

Transferring ideas from artificial to natural systems
Can what we've learnt about human-made systems be relevant to understanding naturally occurring systems? Compare: can what was learnt about batteries, lamps, motors, resistors, capacitors, oscillators, etc. in laboratories be relevant to understanding electrical (and other) phenomena in organisms? I hope the answer is obvious to anyone reading this.

Humans have designed, built, debugged, extended, and combined many working systems using these (and other) concepts, forming an ontology that is distinct from that of the physical sciences. Many of those systems work very well, even when they are far from perfect (like every widely used operating system!). If humans can do something like that, it is possible that biological evolution could also do something like that, only far, far more complex, just as evolution produces the sorts of things mechanical engineers attempt to produce, though many of evolution's products are far superior in respect of strength-to-weight ratios, energy efficiency, costs of production, self-repair, etc.

[But we are closing the gap, slowly!]

If you ask a typical software engineer questions about computation, or how information-processing systems are designed, and why they work, and how they relate to their physical infrastructure, you will not necessarily get my sort of answer: most of them don't ask philosophical questions about what's going on when they do their jobs. But by joining them, asking philosophical questions, and testing the answers against the details of the procedures, including their successes and failures (e.g. success in predicting that certain sorts of bugs will be very difficult to identify and remove!), one can acquire philosophical respect for the ontology and the theories used.

We don't have that kind of access to designers, builders, testers, fixers, of natural information processing systems (although good teachers, counsellors, etc. sometimes produce evidence that they understand the workings of minds of pupils or clients better than most of us do).

When I talk about possible explanations of various sorts of visual functions I think it's reasonable to base the ontology used on the guess that that approach can eventually provide rich and deep explanations of things we don't yet understand, and give us the ability to replicate some of the natural functions in human-made systems that we don't yet know how to build --

OR it could happen that serious flaws will later turn up in the ontology+theory and something will have to change, as has happened often enough in physics, chemistry, biology, ...

[ I suspect neuroscience is currently in a much more primitive state than most neuroscientists realise. That's partly because of the huge gaps between what needs to be explained and current theories about what brain mechanisms can do. Compare: one could be an expert particle physicist and know little about how the internet works. This is not unlike the gaps between AI expertise and what remains unexplained in the competences of humans and many other animals. ]

[Note:
I think claiming at this stage that the forms of computation (information processing) that can be implemented on Turing machines or networks of digital computers will suffice, is premature, though it may be true.

We know that chemical information processing is needed to build brains, and is used in many bodily functions. It may turn out that we have to extend our ideas about computation (e.g. to include a mixture of discrete and continuous, highly parallel, strongly interacting forms of information processing, which could be supported by chemistry, or other things -- Turing hinted at that sort of possibility in his 1950 paper, and his 1952 paper on the Chemical basis of morphogenesis gives an indication of how his thinking was developing).

I am collecting examples of human mathematical reasoning that may be relevant to that issue, e.g. mathematical discoveries about deformable shapes, like curves on a torus, which just don't fit any current AI theorem proving technique that I know of. Examples:

]

Implementation of virtual machinery:
Later on, a well-developed theory about the virtual machinery used in the minds of humans and other animals could be extended to include a theory about how the virtual machines are implemented in the biochemistry etc. of the brain.

But I don't think that can be done without considering causal chains that go beyond the limits of the body, since part of the evidence supporting or challenging the theory will be concerned with how the organism copes with various environmental challenges. [Add reference to "loop-closing semantics"]

(Theories about sensory motor loops that don't refer to the environment, but only to patterns in sensory and motor signals, currently fashionable in some circles, will not cope with some of the rich and varied ways in which animals relate to things they are not sensing, and in some cases have not sensed, but enter into their theories about the world -- particle physicists and astrophysicists being extreme cases of such animals.)

In a way I am deploying some of the ideas of Imre Lakatos: we can't directly test complex theories (and the ontologies they use), but we can, over time, tell whether they support a progressive or a degenerating research programme. However, reversal is always possible if something new turns up, as happened to Newtonian mechanics. (I don't agree with everything Lakatos wrote about this, including his [and Popper's] failure to discuss the central importance of discoveries about what is possible, theories explaining what is possible, and attempts to model what is possible. This was discussed in Chapter 2 of Sloman 1978.)

Philosophers often try to get a more "direct" justification for a theory of information content/processing -- in a way that I do not believe can work, any more than a direct justification for a theory about the mechanisms of genetic information transfer can work. When they can't find such a justification for a theory about mental mechanisms (or brain function) some of them assume that it's all physics (and chemistry) anyway, and try to show how to 'reduce' mental, and computational, phenomena to physical phenomena. However, when discussing information processing systems it is important to distinguish implementation from reduction. X can be fully implemented in Y without X being identical with Y.
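A minimal sketch of that distinction: the same abstract last-in-first-out stack can be fully implemented in two quite different underlying structures, and everything true of the stack is true of both implementations, yet the stack is not identical with either of them.

    # Minimal sketch of implementation without identity: one abstract stack,
    # two quite different underlying implementations.
    class ListStack:
        """Implemented in a contiguous Python list."""
        def __init__(self): self._items = []
        def push(self, x): self._items.append(x)
        def pop(self): return self._items.pop()

    class LinkedStack:
        """Implemented in a chain of (value, rest) pairs."""
        def __init__(self): self._top = None
        def push(self, x): self._top = (x, self._top)
        def pop(self):
            x, self._top = self._top
            return x

    # Every truth about the stack (e.g. 'pop returns the most recently pushed
    # item') holds in both implementations, though the underlying details differ.
    for make in (ListStack, LinkedStack):
        s = make()
        s.push("a"); s.push("b")
        assert s.pop() == "b" and s.pop() == "a"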

In contrast with reductionists, Dennett takes information processing models seriously. He proposes adoption of an "Intentional Stance" (closely related in some ways to Newell's "knowledge level", which came later). This stance treats beliefs, desires, intentions, hopes, and fears as if they had some kind of existence, while regarding them as "useful fictions" in a world that is really just physical. However, the stance works only on the assumption that the individuals to whom the mental states are attributed are rational. (That's a loose paraphrase, and does not reflect everything he writes. I think that in recent years he has sometimes sounded more like a realist about internal information processing in human minds.) This "stance" raises problems for attempts to develop information processing explanations of systems that are neither rational nor irrational, e.g. organisms that have desire-like and belief-like states, and use information in selecting behaviours, and in some cases can learn, but without being rational. They may simply have pre-compiled (inherited) strategies that work well enough, without the organisms having any knowledge of why they do what they do.

More about the toolkit mentioned above.
The toolkit I helped to build has no intelligence itself, because it was intended to support research testing different ideas about intelligence (unlike some AI toolkits, which include planners, or reasoners, or learning mechanisms, or visual subsystems, etc.). It provided multiple sorts of functionality through different subsystems running in parallel (either on the same computer or distributed across different computers) and interacting with one another, in an environment that could be provided either by other mechanisms (e.g. another software package running on the same or another machine) or by a human interacting with the system via mouse and keyboard (or via cameras, microphones and other sensors, though such devices were not available to us at that time). So for some kinds of research the toolkit could be used to design and build a simulation of a (simple) form of physical environment and various kinds of more or less intelligent agents inhabiting that environment and interacting with one another.
For details (which don't really matter here) see:
http://www.cs.bham.ac.uk/research/projects/poplog/packages/simagent.html

The aim was to support an extended programme of research producing more and more complex running systems, testing theories about human and other sorts of cognition, perception, control of action, communication, etc. For various reasons, including my unwillingness to jump through required fund-raising hoops and the extreme difficulty of hiring people with the kind of broad and deep education (including philosophy) that I needed, I switched to working as a theorist, on the periphery of robot projects that developed 'mainstream' ideas. For the last five years I've detached myself from projects that require me to jump through report-writing hoops etc. and continued trying to develop the required theories.

In that context, much philosophical discussion about the nature of computation, what computational models are, what information is, what information processing is, and what the relations between virtual machines and physical machines are, fails to get to grips with the actual complexity and variety of the phenomena.

For example, very different things need to be said about

-- an adder implemented in hardware (or firmware),

-- a parser implemented in software, or a planner, a theorem prover, a chess program,

-- an email system on the internet, able to interact (indirectly) with remote email systems,

-- (deep breath) a system whose architecture includes several perceptual subsystems connected directly, or (more usually) indirectly, with physical sensors, motor control subsystems connected directly or indirectly with physical effectors (wheels, grippers, legs), Gibsonian perceptual systems including both sensory and motor components (e.g. a movable camera, a movable hand with sensor-packed skin), learning mechanisms of various sorts, motive generation and management systems of various sorts, interrupt generators and handlers of various sorts, 'introspective' mechanisms of various sorts, e.g. some that can monitor and modify a planning process, others that can inspect intermediate structures in a multi-layer visual system (the source of visual qualia), mechanisms for extending the ontology used, the forms of representation used and the theories used, the reasoning mechanisms used, mechanisms for modifying and extending the motivational, interrupt, evaluation and control mechanisms e.g. to cope with more and more complex problems involving more and more of the environment, including more and more other agents,... and, in some machines, mechanisms that allow a genome to unravel slowly over time with different processes of instantiation at different stages of development, deeply influenced by what has been learnt and developed so far (as seems to be required for a new born baby to have the potential to learn any one of several thousand different human languages). (So-called altricial species illustrate this, though Jackie Chappell and I have argued that instead of distinguishing precocial and altricial species, we need to distinguish different competences within an individual. We started calling the competences precocial and altricial and switched to pre-configured and meta-configured.)

It is impossible to implement such a system directly in physical hardware (of any sort known to me) for various reasons.

It has taken over half a century of hard-won knowledge and expertise in many sub-fields of computer science to get to the point at which we can think about how such systems can be *indirectly* implemented in physical hardware, using multiple layers of implementation (hardware, firmware, software, memory-management systems, schedulers, interrupt hardware and software, operating systems, language platforms, multi-component packages, networking protocols, distributed architectures...). Some of the important philosophical points arising are summarised in this overview of virtual machine functionalism, and virtual machine supervenience.

I am not trying to 'blind with science'. (a) I think biological evolution somehow blindly but effectively managed to produce similar 'discoveries' and 'inventions' and combine them to produce hardware and software systems whose subtlety and complexity still defeats human attempts either to replicate or to understand what's going on, except in a tiny (but steadily growing) subset of cases. But we have very little knowledge of the intermediate cases, and what knowledge there is is mostly about physical form, physical environment and behaviour, not information processing mechanisms: they don't leave fossil records, except very indirectly. Before the advent of computer systems engineering and science we lacked most of the concepts required to formulate good theories to explain what evolution did. We may still have too many gaps in our theory creation abilities.

(b) I don't think it's useful to try to define "information": all such attempts lead to triviality, or circularity, or errors of one sort or another, e.g. leaving out some of the phenomena that should be included. Instead we need a theory about information that explains how energy, matter, and information play different, but interacting, roles in our universe. The theory should cover many different cases and explain their differences and how the different sorts of information interact with other things, e.g. energy, matter, and various kinds of machine involved in controlling something, taking decisions, making something happen, designing, reasoning, discovering, communicating. Not all cases of information processing include all of the above. In particular there are different varieties of IP to be found at different stages in biological evolution. Exploring that variety is the aim of the Meta-Morphogenesis project.


REFERENCES AND LINKS


Maintained by Aaron Sloman
School of Computer Science
The University of Birmingham
