Aaron Sloman
http://www.cs.bham.ac.uk/~axs
School of Computer Science, The University of Birmingham, UK
Installed: 22 Jan 2011
Last updated: 7 Oct 2018; Minor clarifications 26 Feb 2019
24 Jan 2011; Reformatted May 2015; minor changes Apr 2016;
17 Jul 2018 Major revision and change of title: in response to comments from Olivier Marteaux, pointing out that I had misunderstood/misrepresented Bateson below.
http://www.cs.bham.ac.uk/research/projects/cogaff/09.html#905
Aaron Sloman,
What's information, for an organism or intelligent machine?
How can a machine or organism mean?,
in Information and Computation,
Eds. Gordana Dodig-Crnkovic and Mark Burgin,
World Scientific Publishers, 2011, New Jersey, pp 393--438
This document is also closely related to my endorsement of Jane Austen's theory of information, contrasted with Claude Shannon's theory: in Sloman-Austen, discussed briefly below.
These ideas are central to the Turing-inspired Meta-Morphogenesis project, later
sub-titled
"The self-informing universe project".
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
(pdf)
This alleged definition is often quoted with approval by thinkers of different backgrounds, as can be seen by searching for occurrences of the phrase "a difference that makes a difference" in conjunction with "information". Sometimes the definition is attributed to others, presumably because they have quoted or used it.
Obviously the phrase "a difference that makes a difference" resonates powerfully with many people, even a philosopher as clever as Daniel Dennett (Edge, 2017).
Perhaps this is because it is a pointer to a very common and important kind of complexity, in which systems are composed of linked, tightly coupled, sub-systems whose causal relationships have the property that any event in one sub-system (e.g. some property, value or relationship changing, or a part being added or removed) has effects in other subsystems, or possibly ripples of effects spreading out through the whole system.
Examples include a pebble hitting the previously flat surface of a pond, a fly wriggling in a spider-web, the speed of rotation of a cog wheel in some machine changing because of a change in friction, pushing a button causing an electric circuit to be closed triggering a wave of activation through a collection of interacting electronic and mechanical devices, an army being galvanised into battle by a light signal flashed on a hill-top, a news item causing share-prices all round the world to begin to fall, or a rumour spreading quickly through a community and causing a mob to attack a building.
In all those cases it is reasonable to say that some information flows through a more or less complex system, triggered by the initial change (a difference occurring) and that the intermediate stages of such propagation depend on new intermediate changes/differences producing new effects elsewhere (including positive and negative feedback loops in some systems). However, this ignores cases where unchanging information (e.g. information about pressure, temperature, rotational speed, voltage, etc.) is constantly transmitted and displayed or recorded somewhere, e.g. on an operator's control panel. Was Bateson ignorant of such cases, or only temporarily forgetful? Or was he merely making a point about special cases where changing information is important?
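The contrast between change-triggered propagation and constantly displayed, unchanging information can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `change_events` and `panel_display`, and the pressure readings, are assumptions made for the example and do not come from Bateson or from any particular control system.

```python
from typing import Iterable, List, Tuple


def change_events(readings: Iterable[float]) -> List[Tuple[int, float]]:
    """Report only the samples whose value differs from the previous sample."""
    events = []
    previous = None
    for index, value in enumerate(readings):
        if previous is not None and value != previous:
            events.append((index, value))
        previous = value
    return events


def panel_display(readings: Iterable[float]) -> List[float]:
    """Report every sample, changed or not, as a control-panel gauge would."""
    return list(readings)


# Hypothetical steady pressure readings: nothing changes between samples.
steady_pressure = [2.5, 2.5, 2.5, 2.5]

print(change_events(steady_pressure))  # prints []
print(panel_display(steady_pressure))  # prints [2.5, 2.5, 2.5, 2.5]
```

A channel that reports only differences is silent for the steady readings, yet the operator watching the panel is still informed of the pressure at every sample.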
I don't know whether Bateson (or any of his admirers) noticed that it is also possible for a change or difference that is not temporal but spatial to have effects. For example, a geologist surveying some terrain may notice a transition across a boundary, which suggests the possibility of some desirable material or substance being available somewhere underground on one side of the boundary. Alternatively, a farmer who notices a boundary separating two kinds of soil may be caused to sow a certain crop only on one side of the boundary. In these cases, the static spatial change or difference produces temporal changes as a result of the occurrence of detection or observation of the static change: i.e. the trigger may be temporal, even though what is triggered depends on something non-temporal. In such a case, the original difference need not actually make a difference to anything: whether it does or not will depend on something else: a happening triggered by detection of the spatial difference.
Bateson could deal with that quibble by replacing "A difference that makes a difference" with "A difference that can make a difference". I'll return to this below. The potential difference-triggers in a situation may be as important as the actual triggers, like the "No trespassers" sign that has an informing function whether there are readers present or not. (It could be responsible for the absence of readers if it is an old sign!)
Bateson described not "information" but "a bit of information" and later "the elementary unit of information" as "a difference that makes a difference".
He did this in at least two of the essays, namely "The Cybernetics of 'Self': A Theory of Alcoholism" and "Form, Substance and Difference".
I conclude that insofar as Bateson's remark is widely interpreted by uncritical readers as a definition of information, or as a general truth about it, that is a misinterpretation, because Bateson was referring only to the special case of a discrete change. He was talking about a bit or unit of information, not attempting to define information in general, unless he really made the mistake of thinking of all information items as "differences" that are propagated along information channels.
Notice that there is a difference between attempting to define (or say something definitive about) the word "information" and attempting to do it for more complex phrases like "a bit of information" and "the elementary unit of information", which he seems to take as different labels for the same thing, which he describes as "a difference that makes a difference". Similar or equivalent wording, with "information" always qualified as illustrated here, occurs in several places in the book.
In all the contexts that I found, he was NOT talking about, or defining, information in general but about an ITEM or UNIT or PIECE of information as a difference that makes a difference.
So it looks as if he accepted the assumption that information increments (or decrements) must be discontinuous, and that there is a minimal discontinuity -- one of the interpretations suggested above.
Given the widespread application of ideas from cybernetics and control engineering, making use of continuous changes, often expressed using differential equations, as Bateson must have known, it is very unlikely that he assumed that information must always be discrete, as the phrase "a difference that makes a difference" is often taken to imply.
So taking the phrase as referring to the special case of an event involving a step change in information leads to a plausible interpretation of the quotation, not as a definition of information, but an observation about an important subset of cases where some change has effects that are propagated and later interpreted as conveying information.
His slogan certainly does not cover all occurrences and uses of information in human life, in animal brains or in computers. For example, information items can be stored for long periods without being used, i.e. without "making a difference", except perhaps in the far-fetched sense that having information available in case it is needed can "make a difference" to the robustness, or reliability, of some information-using animal or machine. But that way of making a difference is not an event or a process, but a static enduring state.
At best Bateson's slogan seems to be a useful first approximation to a characterisation of the role of a bearer of information, where the information itself could be expressed or carried by alternative structures. Information is usually not essentially linked to a unique mode of expression, since different bearers for the same information content may be preferable in different contexts.
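The point that the same information content can be carried by different bearers can be illustrated with a small Python sketch. The three notations and the decoder names (`decode_decimal`, `decode_binary`, `decode_tally`) are hypothetical choices made for this example only; they stand in for any set of alternative structures that a suitably equipped user can interpret.

```python
def decode_decimal(bearer: str) -> int:
    """Interpret the bearer as a base-10 numeral."""
    return int(bearer, 10)


def decode_binary(bearer: str) -> int:
    """Interpret the bearer as a base-2 numeral."""
    return int(bearer, 2)


def decode_tally(bearer: str) -> int:
    """Interpret the bearer as tally marks, one stroke per unit."""
    return bearer.count("|")


# Three physically different bearers, each paired with a suitable interpreter.
bearers = [
    ("12", decode_decimal),
    ("1100", decode_binary),
    ("|" * 12, decode_tally),
]

# Collect the distinct contents recovered from all the bearers.
contents = {decode(bearer) for bearer, decode in bearers}
print(contents)  # prints {12}
```

Which bearer is preferable depends on the context: tally marks are easy to extend one stroke at a time, whereas the binary form suits a machine. The content recovered is the same in each case.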
However, Bateson's phrase is applicable only to very simple information items. The phrase "a difference that makes a difference", or "a bit", is definitely not appropriate for the information content of something complex, like this sentence, or the information content of Euclid's theorem that there are infinitely many prime numbers. (A more complete discussion would need to compare the information content of a theorem and the information content of a particular proof of the theorem.)
Euclid's theorem is a very important item of information that has had enormous influence in mathematics and its applications, especially in connection with recent privacy and security techniques making use of large prime numbers. Euclid's proof certainly made a difference (a huge difference) to mathematics, science and engineering. But thinking of the proof, or the theorem, or its information content, as merely a difference of some kind is a serious mistake. It has deep mathematical content that is independent of whether it is ever put to any practical use.
It is worth quoting the full sentence, which refers to energy, and the following discussion:
"What we mean by information - the elementary unit of information - is a difference which makes a difference, and it is able to make a difference because the neural pathways along which it travels and is continuously transformed are themselves provided with energy."
and later
"But what is a difference? A difference is a very peculiar and obscure concept. It is certainly not a thing or an event. This piece of paper is different from the wood of this lectern. There are many differences between them--of color, texture, shape, etc. But if we start to ask about the localization of those differences, we get into trouble. Obviously the difference between the paper and the wood is not in the paper; it is obviously not in the wood; it is obviously not in the space between them, and it is obviously not in the time between them. (Difference which occurs across time is what we call 'change.') A difference, then, is an abstract matter."
He then goes on to point out some differences between the subject matter of "hard sciences" such as physics and the study of minds, or information-using systems:
"In the hard sciences, effects are, in general, caused by rather concrete conditions or events--impacts, forces, and so forth. But when you enter the world of communication, organization, etc., you leave behind that whole world in which effects are brought about by forces and impacts and energy exchange. You enter a world in which 'effects'--and I am not sure one should still use the same word--are brought about by differences. That is, they are brought about by the sort of 'thing' that gets onto the map from the territory. This is difference. Note: Difference travels from the wood and paper into my retina. It then gets picked up and worked on by this fancy piece of computing machinery in my head.
The whole energy relation is different. In the world of mind, nothing--that which is not--can be a cause. In the hard sciences, we ask for causes and we expect them to exist and be 'real.' But remember that zero is different from one, and because zero is different from one, zero can be a cause in the psychological world, the world of communication. The letter which you do not write can get an angry reply;"
Of course, if the non-receipt of a promised or legally required letter is noticed, that can trigger an angry letter, which is a response to inaction, not a response to an unwritten and therefore non-existent letter. A similar comment applies to the income tax form in the next Bateson quotation. The empty form in your desk does not trigger any action. An official noticing the non-receipt of the form at the Inland Revenue office can trigger action. So the next portion of what Bateson wrote, strictly speaking, misuses the word "trigger":
"... and the income tax form which you do not fill in can trigger the Internal Revenue boys into energetic action, because they, too, have their breakfast, lunch, tea, and dinner and can react with energy which they derive from their metabolism. The letter which never existed is no source of energy."
Bateson's main aim here is clearly not to offer a new definition of "information" but to characterise some general features of (a subset of?) control mechanisms in which both information and the expectation of information that does not arrive can have causal roles.
Notice the second sentence, below: Bateson seems to be trying to characterise differences between the causal roles of information and the causal roles of physical entities and their properties.
"The whole energy relation is different. In the world of mind, nothing--that which is not--can be a cause. In the hard sciences, we ask for causes and we expect them to exist and be 'real'. But remember that zero is different from one, and because zero is different from one, zero can be a cause in the psychological world, the world of communication. The letter which you do not write can get an angry reply; and the income tax form which you do not fill in can trigger the Internal Revenue boys into energetic action, because they, too, have their breakfast, lunch, tea, and dinner and can react with energy which they derive from their metabolism. The letter which never existed is no source of energy."
He continues...
"It follows, of course, that we must change our whole way of thinking about mental and communicational processes. The ordinary analogies of energy theory which people borrow from the hard sciences to provide a conceptual frame upon which they try to build theories about psychology and behavior--that entire Procrustean structure--is non-sense. It is in error. I suggest to you, now, that the word 'idea', in its most elementary sense, is synonymous with 'difference'."
I wonder why nobody has cited Bateson as defining "idea" as synonymous with "difference"!
I wonder how many of the people who approvingly quote Bateson as defining "information" as "a difference that makes a difference" agree with all of the above. My own paraphrase would be:
Matter and energy can both be used and can have effects. Information can also be used and can have effects, but information is not something spatio-temporally located, like portions of matter, energy and force (transfer of energy). Information is something more abstract, more concerned with the possibility of making choices between alternatives. However, information can be recorded in, or transmitted using, physical devices. When that happens the subsequent effects of storing or transmitting the physical item go beyond the purely physical effects, if the item is made available to an appropriate information user. Information can also be provided by events or states of affairs where there is no intention to communicate: a flash of lightning, a footprint left in mud, the non-consumption of food, or the arrival of an energy pulse from a remote galaxy can all be sources of information for intelligent information users with appropriate biological or artificial sensory apparatus.
In the above quotations, there is only the weakest of indications that everything Bateson says about what information is and is not, and what it can do, presupposes the possibility of a user of the information, through whom or through which the information can make a difference. I don't know whether Bateson simply assumed the existence of users. Note that some users are inanimate: a thermostatic control device or a burglar alarm may detect and use information.
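The dependence on a user can be made concrete with a toy model of an inanimate one. The following Python sketch is illustrative only: the `Thermostat` class, its `sense` method and the temperatures are assumptions invented for the example, not a description of any real device.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A toy inanimate information user with a fixed setpoint."""

    setpoint: float
    heater_on: bool = False

    def sense(self, temperature: float) -> bool:
        """Use a temperature reading; return True if the heater state changed."""
        wanted = temperature < self.setpoint
        changed = wanted != self.heater_on
        self.heater_on = wanted
        return changed


thermostat = Thermostat(setpoint=20.0)

# Each reading differs from the one before, but only one alters the heater.
for reading in (21.0, 18.0, 17.0):
    print(reading, thermostat.sense(reading))
# prints:
# 21.0 False
# 18.0 True
# 17.0 False
```

The drop from 18.0 to 17.0 is a difference in the world, but it makes no difference to this user. Whether a difference "makes a difference" is therefore a fact about the difference together with the user, not about the difference alone.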
He knew that there was an older, deeper notion of information, but unfortunately somehow (unintentionally) persuaded many thinkers to ignore it.
The older idea was used by Jane Austen in her novels, e.g. Pride and Prejudice. The difference between Austen-information and Shannon-information is discussed here:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/austen-info.html
(pdf)
Jane Austen's concept of information (Not Claude Shannon's)
Implicit definition of deep and complex concepts is the only possibility for many scientific concepts, including "matter" and "energy" -- which is why "symbol-grounding" theory (another name for "concept empiricism") is false, as explained in this presentation.
The "What's information" paper attempts to present substantial portions of such a theory, though the task is not completed. In particular section 3.2 explains how theories can implicitly define the concepts they use and relates this to defining "information".
More specifically, what it means for B to express I for U in context C cannot be given any simple definition, in part because it is a generic polymorphic concept, which can be instantiated in different ways in different contexts. However, as mentioned above, a partial theory is implicit in Jane Austen's uses of the word "information" in her novels.
Barbara Webb, Transformation, encoding and representation, in Current Biology, 16, 6, pp. R184--R185, 2006, doi:10.1088/1741-2560/3/3/R01
Information about X is normally used for quite different purposes from the purposes for which X is used. For example, the information can be used for drawing inferences, specifying something to be prevented, or constructed, and many more. Information about a possible disaster can be very useful and therefore desirable, unlike the disaster itself.
So the notion of standing for, or standing in for, is the wrong notion to use to explain information content. It is a very bad metaphor (based on some person or object taking the place of another in some process or situation), even though its use is very common.
We can make more progress by considering ways in which information can be used. If I give you the information that wet weather is approaching, you cannot use the information to wet anything. But you can use it to decide to take an umbrella when you go out, or, if you are a farmer, you may use it as a reason for accelerating harvesting. The falling rain cannot be so used: by the time the rain is available it is too late to save the crops.
The same information can be used in different ways in different contexts or at different times. The relationship between information content and information use is not a simple one.
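A small Python sketch can illustrate one item of information being put to different uses by different users. The `Forecast` record, the two user functions and the 24-hour threshold are hypothetical, chosen only to mirror the umbrella and harvesting examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Forecast:
    """A hypothetical item of information about approaching weather."""

    event: str
    hours_away: int


def pedestrian_use(info: Forecast) -> str:
    """One user: decides what to carry when going out."""
    return "take an umbrella" if info.event == "rain" else "no action"


def farmer_use(info: Forecast) -> str:
    """Another user: decides whether to bring the harvest forward."""
    soon = info.event == "rain" and info.hours_away <= 24
    return "accelerate harvesting" if soon else "carry on as planned"


forecast = Forecast(event="rain", hours_away=6)

print(pedestrian_use(forecast))  # prints take an umbrella
print(farmer_use(forecast))      # prints accelerate harvesting
```

Neither function does anything with rain itself. Both operate on information about rain, which is available before the rain arrives, and each user's context determines what that information is used for.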
A partial answer might be that we need two distinct concepts in order to avoid the circularity that would be manifest in attempting to define "difference" as "a difference that makes a difference", or "change" as "a change that causes changes". By having two words, one being defined and one used in the definition, we avoid circularity. But what has been achieved?
In the cases where changes are propagated through a connected system, something may use detected changes as bearers of information about something else, but that does not make the changes themselves the information.
And lurking in the background to all these questions is the problem that "a difference" suggests something discrete: a step-change, as does Bateson's use (echoing Shannon) of the phrases "bit of information" and "elementary unit of information", suggesting that information is built out of indivisible chunks that are combined to create larger items. This does not square well with the common sense idea that information can be about things that vary continuously, such as pressure, or distance, or speed, or direction, or closeness to danger. In these cases there is no smallest information difference between two states, such as two possible velocities or locations for the same object. (Some quantum physicists might disagree: but our ordinary concepts of information, or meaning, don't presuppose that the physical universe is discrete, even if it actually is.)
All of that might be a way of defending Bateson's talk of bits or elementary units, but I don't know whether he ever wrote something with precisely that interpretation.
(Comments from Bateson scholars welcome.)
Bateson's summary focused on the fact that in many contexts information is carried by some change, e.g. in a spatial structure or temporal pattern which is not always physical. Perhaps another way of expressing that would be to say that every use of information must involve an explicit or implicit comparison, e.g. between two or more available options, or between how things are and how they might have been or were previously, or could be in future.
I now feel that by homing in on what seemed to him to be a crucial common factor, namely some difference that has implications for some agent or decision maker, and leaving out the required context, i.e. what an information user is, he abstracted a step too far. As a result he created a powerful but seriously misleading meme, "Information is a difference that makes a difference", which has crawled around many brains without carrying the rich context of Bateson's thought.
Long before there were computing machines as we now think of them, many ingenious human designers built machines that behaved in accordance with either pre-stored information (e.g. music boxes), or constantly changing information (e.g. fan-tail windmills, the Watt governor, and many more).
Moreover, long before humans used information, or created information-using machines, biological evolution was both using information and creating ever more sophisticated varieties of information user and information. The study of those evolutionary processes is what I have been calling "The Meta-Morphogenesis project", now also referred to as "The self-informing universe project".
Sloman(M-M)
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
Although Jane Austen had nothing to say about biological evolution, as far as I know, she had some deep insights into the information-using capabilities of some of the most recent products of biological evolution. I wonder whether Shannon ever read Pride and Prejudice, and, if not, what difference it would have made if he had.
(Is that Bateson's ghost grinning at me???)
https://www.edge.org/conversation/daniel_c_dennett-a-difference-that-makes-a-difference
EDGE.ORG: "A Difference That Makes a Difference"
A Conversation With Daniel C. Dennett [11.22.17]
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/austen-info.html
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/austen-info.pdf
Jane Austen's concept of information (Not Claude Shannon's)
Online technical report, University of Birmingham, 2013--2018:
Latest version of the Meta-Morphogenesis project (still being extended).
http://www.cs.bham.ac.uk/research/projects/cogaff/11.html#1106d
Sloman, A. (2013). Virtual machinery and evolution of mind (part 3)
meta-morphogenesis: Evolution of information-processing machinery.
(The original proposal for the M-M project.)
In S. B. Cooper & J. van Leeuwen (Eds.),
Alan Turing - His Work and Impact (pp. 849-856)
Amsterdam: Elsevier.
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
(pdf version)
This is the location of the current (latest) version of the overview of the
Meta-Morphogenesis project. Its contents change from time to time.