NB: italics removed from this version for online reading.

PDF and Postscript versions available here:
http://www.cs.bham.ac.uk/research/projects/cogaff/81-95.html#43

An earlier version appeared in:
The Analysis of Meaning: Informatics 5,
Proceedings ASLIB/BCS conference, Oxford, March 1979,
Eds: M. MacCafferty and K. Gray, published by Aslib.

THE PRIMACY OF NON-COMMUNICATIVE LANGUAGE

Aaron Sloman
School of Computer Science
The University of Birmingham
Birmingham B15 2TT, England
http://www.cs.bham.ac.uk/~axs/
(Written in 1979 while at The University of Sussex)

Introduction
------------

How is it possible for symbols to be used to refer to or describe
things? I shall approach this question indirectly, by criticising a
collection of widely held views of which the central one is that
meaning is essentially concerned with communication. A consequence of
this view is that anything which could reasonably be described as a
language is essentially concerned with communication. I shall try to
show that widely known facts, for instance facts about the behaviour
of animals and facts about human language learning and use, suggest
that this belief and closely related assumptions (see A1 to A3 below)
are false.

Support for an alternative framework of assumptions is beginning to
emerge from work in Artificial Intelligence, work concerned not only
with language but also with perception, learning, problem-solving and
other mental processes. The subject has not yet matured sufficiently
for the new paradigm to be clearly articulated. The aim of this paper
is to help to formulate a new framework of assumptions, synthesising
ideas from Artificial Intelligence and the Philosophy of Science and
Mathematics.

The rival frameworks can be briefly over-stated thus:

OLD: A language is essentially a *social* phenomenon and meanings are
essentially things to be communicated, so that it is impossible for
anything to use a language solely for private purposes: *the primacy
of communication*.

NEW: The essence of language is the storage of information for use and
manipulation by an individual, and communicative potential is an
evolutionary side-effect of this function: *the primacy of
representation*.

A very clear formulation of the first thesis can be found in chapter 2
of Lyons (1977), and a not so clear but very influential discussion in
Wittgenstein (1953). John Lyons has drawn my attention to Chomsky
(1975), which criticises versions of the first thesis propounded by
Searle, Grice and Strawson. The work of formal semanticists and
mathematical linguists is usually neutral on this issue, since they
are not concerned to explain how it is possible to use language
meaningfully, but merely explore the consequences of formal
assumptions about meaning. Something like the second thesis is
implicit or explicit in a great deal of work in Artificial
Intelligence over the last twenty years or so (for surveys see Boden
1977, Winston 1977), and a related version is expounded at length in
Fodor (1976). My own (1978) takes the second thesis for granted
throughout.

As should become clear later, the second thesis does not deny that
there are languages (e.g. English, French, ...) which are used largely
for communication, nor that many of their main features derive from
this use. The claim is that use for communication with other
individuals is not a necessary pre-condition for the meaningful use of
some language.

The standard framework
----------------------

The old thesis is part of a collection of widely held assumptions
which I shall challenge.
Here is a summary of a central subset.

A1. The primary function of language is communication between
individuals. (E.g. Lyons writes: 'it is difficult to imagine any
satisfactory definition of the term "language" that did not
incorporate some reference to the notion of communication' (1977, page
32), and goes on to state that it is obvious that it is impossible to
account for meaning except in terms of communication.) So language is
essentially social.

A2. Learning a public, shared, language is a pre-condition of having
knowledge, beliefs, intentions, principles, and of thinking, deciding
or inferring.

A3. Human beings are the only animals which use language to describe
things and reason with.

These assumptions are not necessarily all held simultaneously, for
they are independent of one another (though I shall not try to prove
that now). But they are often held together. I shall argue against
them all, trying in particular to show that there are at least three
senses in which the use of a rich and powerful internal language
within an individual is prior to the use of language in overt
communication between individuals. I then sketch a theory of how it is
possible for an internal language to be used to refer to and describe
an external world.

I do not claim that there is any *one* inner language common to all
animals or even all human beings (compare Fodor 1976), for the inner
language or languages of any one individual would develop under the
influence of that individual's unique history, including possibly
events prior to birth, and certainly events after exposure to a public
language. It is a consequence of the theory sketched below that a
language may be extended by the addition of new symbols not defined in
terms of the previously known symbols. It follows that some learning
of an overt language may extend an individual's inner language, rather
than always simply relying on a fixed inner language to define new
symbols.

It could even be argued that the evolution of social systems, using
shared overt languages, would, through natural selection, influence
the genetically determined internal linguistic abilities, for instance
allowing them to be more open to external influence, and thereby
facilitating cultural evolution, which permits more rapid adaptation
to changing circumstances than Darwinian evolution. Thus some of the
innate linguistic abilities of a social animal might be geared to the
communicative function of language. But I shall try to show that this
is not a necessary condition for having linguistic abilities.

What is a language?
-------------------

Of course, one can easily define "language" in such a way as to
restrict language to overt communication between individuals, and that
would make A1, above, true, but true by definition and misleading, as
I shall show. In any case, to talk of individuals communicating by
means of symbols implies that they can understand the symbols, i.e.
interpret them as meaningful, and how this is possible cannot be
explained without reference to internal processes. Stipulative
definitions don't help with this.

There is a clear and important sense of the word "language" which does
not make A1 true by definition -- and in this interpretation A1 is
still widely assumed to be true. What is this broader sense of
"language"? I shall give only an incomplete answer.
I believe that most students of language would, at least after some
reflection, accept the following as *necessary* conditions for saying
that X uses a language L, even if they are not *sufficient*
conditions. (The first three conditions derive from the work of Frege,
but are now widely accepted.)

L1. L includes both simple and complex symbols, the latter being
composed of the former in a principled fashion. (Symbols are any kind
of entity used in constructing maps, descriptions, representations,
etc. Non-denoting symbols, like parentheses and other syntactic
devices, may be included.)

L2. There is at any one time a definite set of simple symbols of L
known to X (or usable by X), although this set may be enlarged over
time, and some of the symbols may fall out of use.

L3. There is at any one time a restricted set of modes of composition
of more complex symbols of L from less complex ones known to X, though
the set of rules of composition (grammatical rules) may change, and
need not be explicitly formulated in X or anywhere else: e.g. they may
be consequences of other features of the procedures employed by X for
using the symbols.

L4. X does not merely construct or contemplate such symbols, but uses
them at least
  (a) to express beliefs and possible beliefs, i.e. to represent what
      is or may be the case,
  (b) to formulate questions, i.e. to specify missing information,
  (c) to formulate goals or purposes or intentions, or instructions.

We need not assume that these different uses correspond to different
subsets of L. For instance, it may be that which of these uses X makes
of a particular symbol varies from context to context, or even that no
symbol is ever used with exactly one of these functions.

I shall not now attempt to define precisely what is meant by such
words as "symbol", "rule", "facts", "questions", "goals",
"instructions", etc., or to go into all the many and subtle
distinctions made by logicians and linguists and discussed at length
in Lyons 1977, Vol. 1. I assume everyone has at least a rough and
ready grasp of L1 to L4, and can make some sense of the distinctions
in L4 between using symbols to record what is the case, using them to
specify gaps in what is recorded as being the case, and using them to
generate behaviour by describing the behaviour. (How these things can
be done is another matter.) Partial analyses will be offered later. It
is not implied by L1 to L4 that X need be conscious of using L.

L1 to L4 do not say anything explicitly about communication between
individuals. So it is not obviously true by definition that something
which is a language in the sense implicitly defined by L1 to L4 is
primarily used, or even used at all, for communication between
individuals. X might use L entirely in a private diary, or in its mind
only, as far as L1 to L4 are concerned. For instance, X may formulate
a question specifying missing information in the course of
constructing a plan -- the question may be used to generate inferences
and information-gathering processes. It need not be addressed to
another individual. Similarly, instructions may be part of a stored
plan or strategy used by X.

Nevertheless, I think it is widely held, even if only implicitly, that
in some sense the main or primary use of anything which would be a
language in the sense of L1 to L4 must necessarily be overt
communication between individuals. Other, more private, uses would
have to be derivative, in some sense. The most powerful exponent of
this essentially public view of language was Wittgenstein (1953).
However, related, though less sophisticated, views are quite common.
What is wrong with this cluster of views? Once again my approach will
be indirect.

Intelligence in lesser mortals
------------------------------

Have you ever wondered how it is possible for animals to learn circus
tricks? Or how birds manage to build nests? Or how it is possible for
monkeys to leap through trees at high speed without frequently
crashing into branches or missing them completely and falling to the
ground? Or how hunting animals find their way back to their lairs? Or
how frogs, flies, and other apparently stupid animals are able to
manoeuvre themselves into the right position to mate with other
individuals? Or how a new-born deer can run after its mother? Or how
spiders manage to make their webs in a variety of geometrically
different physical situations, and to patch them when they are
damaged?

Of course, it is possible to assume that such things just happen, that
they are "natural", that no explanation is required, just as it is
possible not to be puzzled about the fact that unsupported apples move
towards not away from the earth, or the fact that frogs' eggs
eventually grow into frogs and not fish. This flabby acceptance of
facts may suffice for phenomenologically minded philosophers and the
man in the street, but if one wishes to understand the possibility of
such phenomena it is necessary to attempt to construct theories about
*underlying mechanisms*.

At present no adequate theories about underlying mechanisms are
available for the abilities listed in the previous paragraph. There
are many theory-building tools, including concepts of physics and
analogies with physical processes, concepts of neuro-physiology,
concepts and formalisms of control-theory and systems-theory, and most
recently concepts and formalisms of computing science and artificial
intelligence. The latter are concerned with mechanisms which generate
processes in which symbols are constructed and manipulated. At the
moment it looks as if only this last set of *computational*
theory-building tools has any hope of being useful for building
theories about mechanisms which could generate both the variety and
the fine-structure of the intelligent behaviour of animals. No
non-computational mechanism currently known is capable of generating a
range of qualitatively different patterns of behaviour intricately
related and adapted both to the sensed structure of the environment
and to pre-existing goals. What, then, is a computational mechanism?

The general form of Artificial Intelligence Theories
-----------------------------------------------------

Water running down a hill will, to a certain extent, avoid obstacles.
But there is no need to assume that it has the goal of getting to the
bottom of the hill, or that any intelligence is involved in generating
its behaviour. Each portion of water merely responds in accordance
with relatively simple mechanical principles to *local* conditions,
and the overall behaviour is simply the *sum* of all these local
processes. Thus something like the mathematics of differential
equations and boundary conditions, possibly enhanced by singularity
theory, suffices to represent and explain what is going on, even
though in many cases measuring the boundary conditions and solving the
equations may present very great technical difficulties.
In particular, there is no need to assume that the water makes use of
a representation of the current situation which is compared with a
representation of a goal situation, or that the results of such a
comparison lead to the selection or construction of some strategy
whose execution requires the collaboration of sub-systems which are
under the control of a central executive. Processes like these would
require a computational, i.e. symbol-manipulating, mechanism.

By contrast, theories in A.I. are concerned with mechanisms which
build, compare, manipulate, search for, interpret, analyse, or obey
symbolic structures of some kind. The existing theories have many
limitations, such as a lack of parallelism, a restriction to discrete
(digital) symbolisms, and, above all, a very small amount of
information (compared with what a human or animal brain seems able to
store). Moreover, AI programs so far have had a very simple structure:
for example, there are none which could be described as even
approximately like a complete organism with its own system of goals,
perceptual abilities, planning abilities, and learning abilities.
However, some of these restrictions are beginning to be overcome, and
others probably will be, as far as can be judged at present.

Types of symbol-manipulating mechanism
--------------------------------------

What I am claiming, then, is that the only paradigm of theory
construction which looks remotely like being able to provide theories
accounting for much animal behaviour is the computational paradigm,
which describes mechanisms using internal symbolisms. We can
distinguish different degrees and kinds of sophistication in such
mechanisms. The following is but a short list of examples:

1. A single branch-free program is used, which, once triggered, always
causes essentially the same sequence of instructions to be obeyed.

2. Whilst a program is being obeyed, sense-organs are continually
updating some symbol store, and at certain points conditional
instructions generate behaviour which depends on this incoming
information, e.g. adjusting muscular exertion to the wind resistance.

3. As in the previous case, except that incoming information affects
not just local behaviour (i.e. what is done at particular steps), but
the global flow of control, as in a program which under certain
conditions will transfer control to a quite different program: e.g. a
switch from food-gathering behaviour to escaping behaviour triggered
by the smell of a predator.

4. Alongside other behaviour there may be a process of analysis of
incoming information, producing an internal symbolic representation of
the current environment, whether or not it is relevant to current
needs and strategies: e.g. the construction of descriptions of
relatively static three-dimensional objects and relationships on the
basis of continually varying two-dimensional retinal information, or
the construction of some kind of map of the environment on the basis
of exploring a sequence of routes through it. (Must one of these
evolve before the other? Both require the ability to represent spatial
information.)

5. Instead of using a permanent set of stored programs, the organism
may modify its programs, or synthesise new ones, in the light of an
analysis of the short-comings of the old ones. This presupposes
internal descriptions of some of the programs and of both their
intended and their actual effects.
6. Instead of having a permanent set of stored procedures for making
major decisions on the basis of available information, the system may
include procedures for modifying its decision-making strategies,
including both the alteration of the relative weightings of previously
used criteria, and the synthesis of new principles and policies.

This is not meant to be anything like an exhaustive survey of types of
symbol-using mechanisms which might be offered as explanations of
increasingly sophisticated patterns of animal behaviour. The examples
do, however, illustrate a number of dimensions in which computational
systems can vary, namely:

A: the extent to which decisions (including decisions about how to
make decisions, etc.) are postponed till "run-time";

B: the extent to which information is taken in and stored in case it
may be useful, reducing the reliance on the immediate environment to
provide information for decision-making processes;

C: the extent to which the system alters, or synthesises, its own
programs;

D: the extent to which different activities can go on in parallel,
e.g. performing actions, monitoring their effects, taking in new
information, reconsidering goals and plans, etc.

Variations in these dimensions would account for variations in degrees
of flexibility, generality of learning abilities, ability to solve
problems, ability to adapt to changing circumstances, etc. The
evolution of consciousness is probably connected with the
diversification of functions alluded to in D.

Whether or not mechanisms of the general forms sketched above do
underlie the intelligent behaviour of animals, it is at least clear
from work in computing science and artificial intelligence that such
mechanisms *can exist*, and that they are capable, in principle, of
generating many kinds of behaviour (internal and external) previously
thought to be restricted to humans, since previous concepts of
"mechanism" were based on analogies with relatively simple physical
systems, like clocks, steam-engines, and telephone exchanges.

I have talked about mechanisms which use symbols, including symbols
expressing instructions which can be obeyed, descriptions of aspects
of the environment, and principles of decision-making. The most basic
and primitive type of symbol-use is the execution of instructions. We
can even treat the hill and the water flowing down it as a system in
which the shape of the terrain amounts to a sort of stored symbolic
program executed by the water under the influence of gravity and its
internal constraints. But it is a very primitive kind of program,
capable of generating a very limited class of behaviour, with little
capacity for producing qualitatively varied behaviour in the light of
information coming in from outside the system. An earthquake, or a
bomb, may change the program, but the program does not include tests
explicitly anticipating changes, with alternative strategies for
achieving goals. Nor can it cope with different goals at different
times. Further, repeated execution may lead to changes, through soil
erosion for example, but these can only be gradual and relatively
continuous, unlike the sudden qualitative changes of behaviour of
which a self-modifying computing system is capable. The system cannot
hypothetically explore alternative internal changes and then select
one which fits some requirement.
This example is intended both to illustrate how broad the spectrum of
mechanisms is which might be described as computational, and to
illustrate that the kinds and degrees of difference between different
locations on the spectrum may be so great that the metaphor of a
spectrum is an oversimplification. The example also illustrates how
the same chunk of reality may be viewed in different ways, e.g. as a
physical system or as a computer executing instructions.

The semantics of internal languages
-----------------------------------

I have suggested that the most primitive and basic kind of symbol must
be some kind of *instruction*, i.e. something which generates and
controls behaviour in an appropriate *interpreter*. There is a very
varied class of types of instruction, ranging from what might be
thought of simply as physical causes (e.g. the shape of a hillside,
which only in a very extended sense can be said to instruct the water
flowing down it), to very much more "descriptive" instructions which
include a description of an action to be performed (e.g. "turn your
head to the left"), or specify an end state to be achieved without
specifying the action to achieve it (e.g. "Be here at noon tomorrow",
or "find some food").

What we are beginning to understand, as a result of a series of
increasingly complex computational experiments in the form of
designing and implementing A.I. languages and programs, is that
provided you have the first, most primitive, type of symbol-obeying
system, in which the meaning of a symbol is little more than the
effect it has on the machine (including such effects as changing some
of the symbols), you can construct on top of it a series of layers of
increasingly sophisticated virtual machines, including ones in which
symbols are used to describe objects and their relationships, and
eventually systems in which some of the symbols which are interpreted
as instructions themselves contain *descriptive* elements: for
instance in PLANNER-like languages, where procedures are invoked not
by name but by some kind of articulated pattern, which may function as
a description of a state of affairs to be achieved (e.g. see Winograd
1972). In short, *descriptive* meaning evolves out of *procedural*
meaning, and more elaborate types of procedural meaning may evolve out
of descriptive meaning. This evolution has occurred in computing
science. Perhaps it also occurred in the development of living
organisms.

We now return to the "central problem of semantics": under what
conditions can such a mechanism use some of its inner symbols as
descriptions of what is the case, descriptions which, instead of
merely producing some effect on the mechanism, refer to or describe
things other than themselves, and do so correctly or incorrectly? Note
that the central idea is not having a meaning, but being used with a
meaning. This central idea does not normally enter into formal
theories of semantics such as the work of Tarski: hence their limited
interest for our purposes.

I don't think the answer to the question is simple or obvious, and
neither is it clear that existing computing systems have reached the
kind of sophistication required for us to say that they *understand*
symbols as descriptions, even though in many cases we clearly attach
descriptive meaning to the symbols they use, including the internal
data-structures. But we also attach significance to the contents of
filing cabinets and tape recordings!
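The idea of pattern-directed invocation mentioned above can be given a
rough illustration in a modern notation. The fragment below is only a
sketch of the general idea that a procedure may be triggered by a
description of the state of affairs it is meant to bring about; the
goal patterns, the procedure names and the crude matcher are all
invented for this example, and nothing here is offered as a
reconstruction of PLANNER itself.

    # Toy illustration of pattern-directed procedure invocation
    # (loosely PLANNER-like).  All names and patterns are invented
    # for the example.

    def match(pattern, goal):
        """Match ('ON', '?x', 'TABLE') against ('ON', 'BLOCK1', 'TABLE'),
        returning a dictionary of bindings, or None on failure."""
        if len(pattern) != len(goal):
            return None
        bindings = {}
        for p, g in zip(pattern, goal):
            if p.startswith('?'):
                bindings[p] = g          # '?'-prefixed symbols are variables
            elif p != g:
                return None
        return bindings

    # Procedures are stored indexed not by name but by the pattern of
    # the state of affairs they know how to achieve.
    procedures = []

    def to_achieve(pattern):
        def register(proc):
            procedures.append((pattern, proc))
            return proc
        return register

    @to_achieve(('ON', '?x', 'TABLE'))
    def put_on_table(bindings):
        print("picking up", bindings['?x'], "and placing it on the table")

    @to_achieve(('AT', 'ROBOT', '?place'))
    def go_to(bindings):
        print("moving robot to", bindings['?place'])

    def achieve(goal):
        """Invoke whichever stored procedure's pattern matches the goal."""
        for pattern, proc in procedures:
            bindings = match(pattern, goal)
            if bindings is not None:
                return proc(bindings)
        print("no procedure matches", goal)

    achieve(('ON', 'BLOCK1', 'TABLE'))   # invoked via a pattern, not a name
    achieve(('AT', 'ROBOT', 'DOOR'))

The point of interest is that the symbol which selects the procedure
is itself a fragmentary description of the state of affairs the
procedure is meant to bring about, so the same structure plays both a
descriptive and a procedural role.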
A full discussion would require analysis of different kinds of
referential and descriptive uses of symbols. A full analysis is not
yet available. However, we can tentatively formulate some apparently
necessary conditions for symbols to be used descriptively.

Preconditions for descriptive meaning
-------------------------------------

In order that a system S be said to use symbols from a language L to
describe certain (types of) objects and their properties and
relations, we could require the following conditions:

M1. S must be able to use sensory detectors capable of receiving
stimulation directly or indirectly (e.g. via light or sound waves)
from the objects, the actual stimulation being determined in a
principled fashion by the things and S's relationship to them (e.g.
visual stimulation depends on viewpoint).

M2. S must be able both to *build* and to *reject* or *modify*
descriptions using the language L, based on processes of analysis of
the stimulation mentioned in M1.

M3. S must be able to make *inferences* from some descriptions
formulated in L to others. That is to say, S must be able to use some
descriptive symbolic structures as a starting point for building
others related to them. (E.g. A.I. work on visual perception and the
analysis of pictures shows how the construction of a 3-D
interpretation involves an enormous amount of inference-making.)

M4. S must be capable of noting (in at least some cases) that two or
more descriptions in L of some state of affairs cannot all be
acceptable (i.e. they are inconsistent), and, in at least some cases,
capable of taking steps to find out which should be rejected.

M5. S must be capable of using the descriptions in L as a basis for
taking decisions about how to act. More precisely, S must be able to
use some symbols as representations of possible states of affairs, and
also be able to build a description of a series of possible actions
which would make such a state of affairs actual.

M6. S must be capable of discovering (whether or not it expresses the
discovery in L) that it lacks some information, and of using a complex
symbol in L to specify what is missing and guide a process of
attempting to acquire the information, either by inference from other
available descriptions or by using the sense organs. I.e. S can use
some symbols of L as questions.

These conditions will be familiar to philosophers of language. They
will be relaxed somewhat later on. They do not completely define the
semantic concepts they use, and even as incomplete definitions they
are circular. It remains to be seen whether this circularity can be
analysed as an acceptable case of mutual recursion. Even if the
circularity is acceptable, the real work remains to be done, namely to
flesh out these conditions for different kinds of symbols and
different aspects of the world, including, for example, geometrical,
physical, biological and social aspects of reality: the preconditions
for meaningfully using a word to refer to circles will be very
different from the preconditions for meaningfully using a word to
refer to cultural revolutions, for example.

Further analysis of M1 to M6 would take us into hoary debates about
concept-empiricism, the verifiability and testability criteria of
significance, distinctions between referring expressions and
predicates, the role of quantifiers, modal operators, and the
importance of implicit definitions and meaning postulates, e.g. see
Ayer (1946), Hempel (1950), Carnap (1956), Pap (1963), Popper (1959),
Quine (1953) and many more.
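Before pursuing those debates, it may help to see how modest the
conditions are when spelled out in program form. The following toy is
an invented illustration only: its one-dimensional "world", its
vocabulary of descriptions and its single inference rule are
assumptions made up for this sketch, not a model defended in the text.
Even so, fragments of M1 to M6 can be recognised in it.

    # A deliberately small sketch exhibiting fragments of M1 to M6 in
    # an invented one-dimensional "world".  Everything here is an
    # assumption made up for the illustration.

    WORLD = {'food': 3, 'water': 2}      # true positions, hidden from the agent

    class Agent:
        def __init__(self, position=0):
            self.position = position
            self.beliefs = set()         # M2: descriptions built and revisable

        def sense(self, thing):          # M1: stimulation determined by the world
            return ('AT', thing, WORLD[thing])

        def assimilate(self, description):
            # M4: notice a clash between descriptions and reject one of them
            _, thing, place = description
            for old in list(self.beliefs):
                if old[1] == thing and old[2] != place:
                    self.beliefs.discard(old)    # keep the more recent description
            self.beliefs.add(description)

        def infer_near(self):            # M3: build new descriptions from old ones
            return [('NEAR', t1, t2)
                    for (_, t1, p1) in self.beliefs
                    for (_, t2, p2) in self.beliefs
                    if t1 < t2 and abs(p1 - p2) <= 2]

        def plan_to_reach(self, thing):  # M5: use descriptions to decide how to act
            known = {t: p for (_, t, p) in self.beliefs}
            if thing not in known:
                question = ('WHERE', thing)        # M6: a symbol used as a question
                print('question raised:', question)
                self.assimilate(self.sense(thing))  # answered via the sense organs
                known = {t: p for (_, t, p) in self.beliefs}
            step = 1 if known[thing] > self.position else -1
            return [('MOVE', step)] * abs(known[thing] - self.position)

    agent = Agent()
    agent.assimilate(('AT', 'water', 5))      # an old, mistaken belief
    agent.assimilate(agent.sense('water'))    # revised after fresh stimulation
    print(agent.plan_to_reach('food'))        # provokes a question, then a plan
    print(agent.infer_near())

Everything beyond such a toy -- richer worlds, richer vocabularies,
genuinely theoretical concepts -- is where the real difficulties
begin.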
In particular, further analysis would need to explain how a system may
use symbols to describe objects, properties, and relationships in a
domain to which it has no direct access, so that it can never
completely verify or falsify statements about the domain (see my 1978,
chapter 9, and discussions by philosophers of science of the role of
unobservables in theories, e.g. Pap (1963)).

An important idea in such philosophical debates is that implicit,
partial definitions (e.g. in the form of an axiom system) enable new
concepts to get off the ground. For instance, a collection of axioms
for Euclidean geometry, in the context of a set of inference
procedures, would partially and implicitly define concepts like
"line", "point", "intersects", etc. In A.I. programs, e.g. programs
concerned with describing visual scenes, instead of axioms and logical
inference rules we often find a collection of procedures for building
data-structures and for relating them to others. The procedures
partially and implicitly define the meanings of the structures. This
is a phenomenon crying out for more formal study.

The analogy with theoretical concepts of science and mathematics
implies that not all newly-acquired concepts need be *translatable*
into one's previous symbolism. (Compare Fodor 1976.) It also implies
that the system may use predicates to describe the environment which
are not definable explicitly in terms of tests which may be applied to
sensory data. Instead, the descriptions are inferred from inconclusive
tests on the basis of theoretical assumptions. For instance, the
notion of a 'climbable object', or a 'surface moving nearer', need not
be *defined* in terms of operations on retinal input, but may be
*inferred* from descriptions of retinal input. The *meanings* of the
symbols used to describe the environment will be partially defined by
the collection of inference rules (transformation and construction
procedures) and theoretical postulates (initial data-structures) used.
The postulates and inference rules need not take the forms studied by
logicians: for instance, they may include the use of analogical
representations, in the sense defined in Sloman (1978), and many
domain-specific inference procedures. The definitions implicit in such
assumptions and procedures will be inherently incomplete, and the
concepts indefinitely extendable by adding new theoretical assumptions
about the nature of the reality referred to. These features are
evident in theoretical concepts of science. It is not so easy to
detect them in more familiar concepts like "hard", "wet", "dog",
"food", etc., since we are less conscious of the inferences we make in
ordinary life.

The essential incompleteness of semantics
-----------------------------------------

Thus we may say that intelligent systems, like scientists, necessarily
use symbols without full understanding, and without ever being able to
establish finally whether what they say is true or false. But this is
not something to lament: it is an inevitable fact about the semantics
of a language used to represent information about things outside
oneself. This fact seems to lie at the source of much philosophical
discussion about knowledge and scepticism.

So the conditions M1 to M6 above do not imply that *every* descriptive
or referential symbol S understands must be one which S can relate
*directly*, using perceptual procedures, to the reality described or
referred to. The symbol-system L may make contact with reality, e.g.
through S's sense-organs, only at relatively scattered points, and
only in indirect ways (like the connection between reality and our
concepts of 'atom', 'gene', 'the distant past', 'the remote future',
'another person's mind', 'the cause of an event', 'Julius Caesar',
'the interior of the sun', 'the battle of Hastings', and so on). The
points of contact with reality may vary considerably from individual
to individual, but this need not prevent different individuals storing
much the same information about large chunks of the world. This is
because their inference procedures permit them to extrapolate beyond
what they have already learned. For instance, because they can
communicate, people who live in different places can share knowledge
about the geography of the earth. And different animals who do not
communicate may share knowledge about a forest, gleaned in different
ways. All this has much in common with some views expounded in Quine
1953, and Strawson 1959.

For most of us, most of what we believe or think about is the result
of a process of inference, hypothetical construction, use of indirect
evidence, or acceptance of reports from intermediaries, but this
doesn't stop us having beliefs and thoughts which refer to more or
less remote portions of the world. The same could be true of other
animals, or machines, even if their sources of information about the
world are less rich, and include no other communicating
intermediaries.

This view of the semantics of inner symbolisms implies that the inner
language may be extended by the addition of new partially and
implicitly defined symbols. Contrary to Fodor's claims, a new language
may therefore be learnt without any new symbols being translatable
into old ones -- more on this below. Hence different humans may use
different "mentalese" even if they all started off the same.

This admittedly sketchy analysis applies not just to the semantics of
verbal or logical languages, but also to the use of maps and other
analogical representations.

We could argue at length over whether all of the conditions M1 to M6
are necessary for S to use L with descriptive meaning, or whether some
other necessary conditions should be added to the list, such as
consciousness of the use being made of symbols. But such debates would
be fruitless, amounting to little more than semantic squabbles over
how we should use words like "symbol", "language", or "meaning". There
are no doubt many different sorts of cases which could arise, forming
yet another "spectrum", ranging from systems which satisfy only
minimal conditions (see the end of this paper) to systems as powerful
as people. One of the goals of AI and Computing Science should be to
explore this range of possibilities, using both theoretical analysis
and computational experiments.

The primacy of inner languages
------------------------------

However, what is important in relation to assumption A1 is that the
conditions M1 to M6 are intelligible, and that it makes sense to
suppose that most or all of them might be satisfied by some
symbol-using system which is not part of any society using any kind of
overt language. Insofar as any communication is involved, it is only
communication between sub-processes of a single system. Furthermore, I
do not know how we can begin to explain the intelligence of many forms
of animals without assuming that they make use of such internal
symbol-systems.
The rich variety of behaviour, the extent to which they can match fine
details of their behaviour to the requirements of the environment, the
ability to generalise from one situation to others, the apparent
ability to acquire information and then use it on another, slightly
different occasion, e.g. to avoid danger or to find a new way home --
all these seem incapable of being explained without reference to
internal processes in which information is stored in some symbol
system.

Furthermore, facts about human infants and the work on learning in
A.I. (e.g. Winston 1975, Sussman 1975) strongly suggest that human
learning, including early language-learning and the development of
sensori-motor skills, could not occur without the prior existence in
infants of rich and complex symbol-manipulating systems capable of
forming, testing, and modifying both plans and theories. My efforts to
find alternative explanations in the writings of developmental
psychologists, such as Piaget, have unearthed only vague hand-waving,
or metaphorical re-description of observed behaviour. Work on AI
systems which process English and other natural languages suggests
that even the use of an *overt* language requires the use of internal
symbolisms for building up descriptions and interpretations of
fragments of sentences, and for making inferences from what is
actually said.

All this implies that there are at least three senses in which the use
of an internal symbolism with descriptive and procedural semantics is
prior to, or more fundamental than, the use of an overt language for
communication between individuals.

P1. The use of some kinds of inner languages must have evolved before
the evolution of what we normally call language, since intelligent
animals existed before social languages. (Note that I am assuming that
something not too different from Darwin's theory of biological
evolution is correct. Theists may reach different conclusions.)

P2. The use of an inner language is a precondition for the learning of
a human language like English or Urdu.

P3. The use of an inner language is a precondition for the continued
use of external languages, but the converse does not hold (in view of
P1 and P2).

I summarise all this in the slogan: representation (or symbolisation)
is prior to communication.

If all this is correct, then the three assumptions A1 to A3 are false.
To rescue the assumption that language essentially involves
communication by making it true by stipulation would conceal important
facts about the possibility of internal symbolisms which share several
functions with external languages.

Wittgenstein's private language argument
----------------------------------------

The theory sketched here may appear to fall foul of Wittgenstein's
(1953) arguments against the possibility of a private language. His
argument is that the notion of following a rule is inapplicable to the
use of some "logically private" symbolism, since the
correct/incorrect distinction could not be used when there is no
possible public check that the rule has or has not been followed. This
argument has a very dubious status, but, as Fodor remarks, it is
irrelevant to computational theories, since nothing said above implies
that the inner symbolism is *logically* inaccessible to outside
scrutiny. In practice the difficulty of opening up a brain, or a
computer, and working out what is going on may be insurmountable, but
that is another matter.

How should meaning be represented?
----------------------------------

It is important to distinguish two questions: (a) How should a
theorist (e.g. a linguist, psychologist, or logician) represent the
meanings of symbols of certain kinds? (b) How are the meanings of the
symbols represented by their user? The answers to these questions may
be the same for some users of some symbols, but they need not be.

When a computer "understands" some machine language, it does not have
or use any explicit representation of the meaning. Rather, the
existence of built-in machinery for interpreting (obeying)
instructions gives the symbolism its meaning. This need not prevent
computer scientists from attempting to describe the semantics of the
language explicitly. Similarly, if a relatively high-level language is
interpreted, instead of being compiled (a possibility Fodor never
discusses), then the stored symbols may be capable of generating
behaviour in a systematic fashion, but there need not be any separate
internal representation of their meaning: what meaning they have is
implicitly assigned by the procedures for interpreting them --
possibly in a context-sensitive fashion. The same applies to
descriptive (non-procedural) stored symbolism, which is implicitly
defined by the way the system is used, as outlined above. This is
discussed briefly by Fodor in connection with "meaning postulates"
(1976, pp. 149 ff.), but only in relation to the learned public
language. He never considers that the "mentalese" symbolism may, at
least in part, be assigned a meaning in just this way. The native
mentalese can then be gradually extended by the addition of new stored
representations, partially and implicitly defining some new primitive
symbols, and modifying the meanings (use) of old ones. (It might also
evolve through development of the underlying interpreter.) The upshot
would be a system in which there is no clear functional distinction
between the concepts of mentalese and those of some other language.
This is the sort of thing which justifies a claim to have learnt to
think in a new language. Fodor's analogy with compiling a high-level
programming language breaks down.

However, this is still too vague, and leaves open a wide range of
possibilities for the internal symbolism, including stored sentences
in an essentially public language (as when we memorise a poem or a set
of directions to get to the station), "analogical" representations,
e.g. 2-D arrays depicting retinal images or maps, or lists of ordered
items representing the order of events or objects in the world (e.g. a
memorised list of names of winners at Wimbledon, or a network of
routes), and no doubt many more. Reflecting consciously on the meaning
of an English sentence can tell us little about the myriad unconscious
processes involved in producing it, understanding it, inferring it, or
believing what it says.

We still need to learn a great deal about the trade-offs involved in
alternative types of symbolisation. Working in the A.I. paradigm
forces one to address issues about memory, about problem-solving,
about recognition, about learning, about ways of achieving efficiency.
For instance, work in A.I. suggests that if decisions and
interpretations are required quickly, it will often be useful to store
information in a highly redundant form.
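A trivial, invented illustration of the point (the numbers and the
style of definition are assumptions made for this example, not
anything taken from the text): computing sums from a minimal,
Peano-style definition of addition every time they are needed, versus
keeping a logically redundant store of previously computed results.

    # An invented toy contrasting a minimal, logically sufficient
    # definition of addition with a redundant store of previously
    # computed results.  The stored results add nothing that could not
    # be re-derived, yet they make repeated use far quicker.

    import timeit

    def add_from_successors(m, n):
        """Addition defined only in terms of the successor operation."""
        return m if n == 0 else add_from_successors(m, n - 1) + 1

    _stored = {}                      # logically redundant partial results

    def add_with_store(m, n):
        if (m, n) not in _stored:
            _stored[(m, n)] = add_from_successors(m, n)
        return _stored[(m, n)]

    slow = timeit.timeit(lambda: add_from_successors(250, 250), number=2000)
    fast = timeit.timeit(lambda: add_with_store(250, 250), number=2000)
    print(f"re-derived each time: {slow:.3f}s   using stored results: {fast:.3f}s")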
Peano's axioms may suffice for a logician interested in number theory,
but no computational system frequently having to solve arithmetical
problems could do so reasonably quickly without storing large numbers
of 'partial results', which are logically redundant. Otherwise
enormous searches among possible derivations from the axioms would be
required for each new arithmetical task. Similarly, the use of an
economically represented generative grammar might be much more
time-consuming than the use of a far more redundant system, including
what Becker (1975) describes as a 'phrasal lexicon'. This redundant
system would be especially useful if different rules could be
processed in parallel, and if incoming information was often
incomplete or degraded by noise, mis-pronunciation, slips of the
tongue, sloppy sentence-construction, etc. Items rejected by the
'basic' rules might be found to match relatively large stored schemas
quite well. Once redundancy enters into the system, the scope for
inconsistency increases. This could be a major factor in the
development of language. Thus the criteria favoured by mathematical
linguists, such as economy and consistency of grammars, might be the
last things we should require of either efficient working systems or
theories of how people and animals use their private and public
languages.

We need to explore these and other issues by designing and analysing
new symbol-using systems displaying various forms of intelligence. The
subject is so young that there are bound to be many surprises in store
for us.

Solipsistic Intelligence
------------------------

One of the surprises may be that we have to weaken some of the
conditions listed above for using symbols with descriptive meaning,
such as M1 and M5. There could be two machines running programs P1 and
P2, the former connected to TV cameras and mechanical arms, as well as
a teletype, and the latter only to a teletype. If P1 satisfies the
conditions given above, and P2 is a subset of P1, then under certain
circumstances we may be able to say that P2 contains all of P1 except
the links to TV cameras, and perhaps the software for some of the
lowest-level analysis of sensory input. Thus if P1 can learn about the
world either through its cameras or through the teletype, then why
should we deny that P2 can learn about the world through the teletype
alone, like a blind and paralysed person who does not lack the
computational abilities underlying sight and physical motion? P2 will
not acquire so much information so easily. It may have to spend much
effort speculating about details P1 perceives, and it might generate
and accept more false hypotheses.

Thus, using symbols to formulate beliefs and hypotheses about an
external world does not require that the world in fact be sensed and
acted on, only that the internal symbols and procedures be
sufficiently rich to have the potential to support such processes of
interaction with the world. As far as I know, A.I. work on language
understanding has not yet really begun to address the question of how
such potential can be shown to exist in a program which can
communicate only via a teletype. It may be that there is no adequate
test short of actually embedding the program in a more complete
system, though I hope it will turn out that theoretical analysis will
be possible instead.
The thought processes of such a system might not be too different from
some human thought processes disconnected from the real world, such as
religious and metaphysical thinking, and some kinds of mathematical
thinking, for instance about infinite-dimensional spaces. The final
step is to notice that not even the teletype is necessary!

Acknowledgements
----------------

Discussions with Frank O'Gorman, Phil Johnson-Laird, Steve Draper,
Bill Woods, Laurie Hollings, John Lyons and students attending
graduate seminars in the Cognitive Studies Programme at Sussex
University helped me to formulate some of the issues and revise an
early draft, as did unpublished work by Gerald Gazdar on 'Constituent
Structures'. Some of the problems arose out of work on a project on
"Computational flexibility in visual perception", funded by SRC grant
BRG/8688.7. Judith Dennison helped with production.

BIBLIOGRAPHY
============

Ayer, A.J., Language, Truth and Logic, 2nd edition, Gollancz, 1946.

Becker, J.D., 'The phrasal lexicon', in R. Schank and B. Nash-Webber
(eds), Theoretical Issues in Natural Language Processing, Association
for Computational Linguistics, 1975.

Boden, Margaret, Artificial Intelligence and Natural Man, Harvester
Press, and Basic Books, 1977.

Carnap, R., Meaning and Necessity, Phoenix Books, 1956.

Chomsky, Noam, Reflections on Language, Temple Smith, and Fontana,
1976.

Fodor, J.A., The Language of Thought, Harvester Press, 1976.

Hempel, C.G., 'The Empiricist Criterion of Meaning', in A.J. Ayer
(ed.), Logical Positivism, The Free Press, 1959. Originally in Revue
Internationale de Philosophie, Vol. 4, 1950.

Lyons, John, Semantics, Cambridge University Press, 1977.

Pap, A., An Introduction to the Philosophy of Science, Eyre and
Spottiswoode (Chapters 2-3), 1963.

Popper, K.R., The Logic of Scientific Discovery, Hutchinson, 1959.

Quine, W.V.O., 'Two Dogmas of Empiricism', in From a Logical Point of
View, 1953.

Sloman, Aaron, The Computer Revolution in Philosophy: Philosophy,
Science and Models of Mind, Harvester Press and The Humanities Press,
1978.

Strawson, P.F., Individuals: An Essay in Descriptive Metaphysics,
Methuen, 1959.

Sussman, G.J., A Computer Model of Skill Acquisition, American
Elsevier, 1975.

Winograd, T., Understanding Natural Language, Edinburgh University
Press, 1972.

Winston, P.H., The Psychology of Computer Vision, McGraw-Hill, 1975.

Winston, P.H., Artificial Intelligence, Addison-Wesley, 1977.

Wittgenstein, L., Philosophical Investigations, Blackwell, 1953.