School of Computer Science

Meta-Morphogenesis and Toddler Theorems:
Case Studies
Part of the Meta-Morphogenesis project
(DRAFT: Liable to change -- needs much reorganisation!)

Aaron Sloman
[a dot sloman at bham dot ac dot uk]
School of Computer Science, University of Birmingham.


PARTIAL HISTORY OF THIS PAGE
Installed: 7 Oct 2011
Last updated: 25 Dec 2017 (reorganised). Format changes May 2020. 17 Nov 2020 added Possibility comparisons.
19 Aug 2017 (Added new version of toddler+pencil video, with commentary).
30 Jun 2017 (Added door-closing example.)
18 May 2015 (Added pencil/hole example); 23 Jun 2015 (Intro revised); 8 Jul 2015
12 May 2015 (Considerable re-formatting)
11 May 2015
(Added linking/unlinking rings/loops example: Impossible transitions involving rings)
24 Oct 2014 (Moved drawer theorem to introduction.)
18 Jun 2014 (Revised introduction. More references.); 13 Jul 2014; 25 Sep 2014; 24 Oct 2014
22 May 2014 (More references. New introduction. Some reorganisation.); 4 Jun 2014
12 Sep 2013 (reorganised, and table of contents improved); 4 Oct 2013; 19 Mar 2014
28 Sep 2012; 10 Apr 2013 (including re-formatting); 8 May 2013; 7 Aug 2013;
9 Oct 2011; 21 Oct 2011; 29 Oct 2011; ....; 7 Jul 2012; ... 23 Aug 2012;

______________________________________________________________________________

This web page is
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.html
Or: http://goo.gl/QgZU1g
A messy automatically generated PDF version of this file (which may be out of date) is:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.pdf

This is one of a set of documents on meta-morphogenesis, listed in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html

A partial index of discussion notes is in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/AREADME.html

______________________________________________________________________________

CONTENTS



Introduction

The study of toddler theorems is the study of the variety of types of proto-mathematical learning and development in young children and other animals that in humans are the precursors of explicit mathematical competences and achievements. The transition from proto-mathematical understanding to mathematical understanding seems to require at least a meta-cognitive "layer", and later on several "stacked" meta-cognitive layers, in the developing information-processing architecture.

Such meta-cognitive layers, allowing what is and is not known to be noticed and thought about, may not be available at birth, but seem to develop later on (at various stages) in normal humans but not in other animals, though some other animals may have partial forms. But crucial proto-theorems may be discovered without such meta-meta-cognition, and used, unwittingly, and often without being noticed by doting parents and researchers.

The claim is that even pre-verbal toddlers can make discoveries about what is and is not possible in various situations, and put those discoveries to use, but without knowing they are doing that. This is a deeper and, for humans, more important ability than the ability to acquire statistics-based abilities to predict what is very likely or very unlikely. Sets of possibilities are logically, metaphysically, and cognitively prior to probabilities -- a claim that will be discussed in another document later.

A core hypothesis is that there are important forms of learning that involve being able to discover sets of possibilities (Piaget 1981) inherent in a situation and their constraints or necessary connections (Piaget 1983). This is a much deeper aspect of intelligent cognition than discovery of correlations, as in reinforcement learning, e.g. using Bayesian nets. (Here's a simple tutorial on Bayes Nets.)

Example: The drawer-shutting theorem

Several years ago, Manfred Kerber reported that one of his children, when very young, developed a liking for shutting open drawers.

He would put both hands on the rim of the open drawer and push: OUCH!

Eventually he discovered a different way that avoided the pain.

If you push a close-fitting drawer shut with your fingers curled over the top edge, your fingers will be squashed: although it is possible for the open drawer to be pushed towards the shut position, it is impossible for it to reach that position without squashing the curled fingers (if they stay curled during pushing).

[Figure: drawer]

On which hand will the fingers be squashed when the drawer is pushed shut?
(Figure added 14 Oct 2014)
(Apologies for low quality art.)

Is the discovery that using the flat of your hand to push a drawer shut avoids the pain a purely empirical discovery? Or could the consequence be something that is worked out, either before or after the action is first performed that way? Perhaps that is a toddler theorem -- for some toddlers?

What sorts of representational, architectural, and reasoning (information manipulation) capabilities could enable a child to work out

The answer seems to have two main aspects, one non-empirical, to do with consequences of surfaces moving towards each other with and without some object between them, and the other an empirical discovery about relationships between compression of, or impact on, a body part and pain or other experiences.

A sign that the child has discovered a theorem derived in a generative system may be the ability to deal with other cases that have similar mathematical structures, despite physical and perceptual differences, e.g. avoiding trying to shut a door by grasping its vertical edge, without first trying it out and discovering the painful consequence.

Perceiving the commonality between what happens to the edge of a door as it is shut (a rotation about a vertical axis) and what happens to the edge of a drawer when it is shut (a translation in a horizontal plane) seems to require the ability to use an ontology that goes beyond sensory-motor patterns in the brain, and refers to structures and processes in the environment: an exosomatic ontology.

Once learnt, the key facts can be abstracted from drawers and horizontal edges and applied to very different situations where two surfaces move together with something in between, e.g. a vertical door edge. As Immanuel Kant pointed out in 1781, the mathematical discoveries may initially be triggered by experience, but that does not make what is learnt empirical, unlike, for example, learning that pushing a light switch down turns on a light. No matter how many examples are found without exceptions this does not reveal a necessary connection between the two events. Learning about electrical circuits can transform that knowledge, however.
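The structure shared by the drawer and the door can be stated in a few lines of code. The sketch below is only an illustration of that common structure and is not offered as a model of the child's reasoning. The function names, the centimetre units, and the use of a straight-line (chord) distance for the door are all assumptions made for the example.

```python
import math


def drawer_gap(opening, travel):
    """Gap left between drawer front and cabinet after a push (a translation)."""
    return max(opening - travel, 0.0)


def door_gap(width, angle_deg):
    """Straight-line gap between the door's free edge and its closed position
    when the door is open by angle_deg (a rotation about the hinge)."""
    return 2.0 * width * math.sin(math.radians(angle_deg) / 2.0)


def is_squashed(gap, thickness):
    """Anything between the two surfaces is squashed once the gap is smaller than it."""
    return gap < thickness


FINGER = 1.5  # hypothetical finger thickness in cm

# Partly open: there is room for the fingers in both cases.
print(is_squashed(drawer_gap(30.0, 10.0), FINGER))  # False
print(is_squashed(door_gap(80.0, 90.0), FINGER))    # False

# Fully shut: the gap is zero, which is smaller than any positive thickness.
print(is_squashed(drawer_gap(30.0, 30.0), FINGER))  # True
print(is_squashed(door_gap(80.0, 0.0), FINGER))     # True
```

The two gap functions differ (one describes a translation, the other a rotation), but the conclusion depends only on what they have in common: being shut means a gap of zero, and zero is smaller than the thickness of anything left in between.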

There seem to be many different domains in which young children can acquire perceptual and cognitive abilities, later followed by development of meta-cognitive discoveries about what has previously been learnt, often resulting in something deeper, more general, and more powerful than the results of empirical learning. The best known example is the transition in young children from pattern-based language use to grammar-based use, usually followed by a later transition to accommodate exceptions to the grammar. Like Annette Karmiloff-Smith, whose ideas about 'representational re-description' are mentioned below, I think this sort of transition (not always followed by an extension to deal with counterexamples) happens in connection with many different domains in which children (and other animals) gain expertise. Moreover, as proposed in a theory developed with Jackie Chappell (2007) and illustrated below in Figure Evo-Devo, this requires powerful support from the genome, at various stages during individual development.

The mathematical and proto-mathematical learning discussed in this document cannot be explained by the statistical mechanisms for acquiring probabilistic information now widely discussed and used in AI, Robotics, psychology and neuroscience. Evolution discovered something far more powerful, which we do not yet understand. Some philosophers think all mathematical discoveries are based on use of logic, but many examples of geometrical and topological reasoning cannot be expressed in logic, and in any case were reported in Euclid's Elements over two thousand years ago, long before the powerful forms of modern logic had been invented by Frege and others in the 19th Century. I'll make some suggestions about mechanisms later. Building and testing suitable working models will require major advances in Artificial Intelligence with deep implications for neuroscience and philosophy of mathematics.

Re-formulating an empirical discovery into a discovery of an impossibility or a necessary connection is sometimes more difficult than the drawer case (e.g. you can't arrange 11 blocks into an NxM regular array of blocks, with N and M both greater than 1 -- why not?). Different mechanisms may have evolved at different stages, and perhaps in different species, for making proto-mathematical discoveries. Transformations of empirical discoveries into a kind of mathematical understanding probably happen far more often than anyone has noticed, and probably take more different forms than anyone has noticed. They seem to be special subsets of what Annette Karmiloff-Smith calls "Representational Redescription", also investigated by Jean Piaget in his last two books, on Possibility and Necessity.
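The 11-block case can be checked by brute force. The sketch below is only an illustration: it lists every N x M array, with N and M both greater than 1, that uses exactly a given number of blocks. The function name `rectangular_arrays` is hypothetical.

```python
def rectangular_arrays(blocks):
    """Return every (rows, cols) pair with rows > 1 and cols > 1
    whose product is exactly `blocks`."""
    return [
        (rows, blocks // rows)
        for rows in range(2, blocks)   # rows < blocks guarantees cols > 1
        if blocks % rows == 0          # rows must divide the total exactly
    ]


print(rectangular_arrays(11))  # []
print(rectangular_arrays(12))  # [(2, 6), (3, 4), (4, 3), (6, 2)]
```

Running the search shows that no array exists for 11, but that is still an empirical result about one search. The necessity comes from the reason the search must fail: an N x M array contains exactly N times M blocks, and 11 has no divisor strictly between 1 and 11.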

Proto-mathematical understanding may be acquired and used without the learner being aware of what's happening. Later on, various levels and types of meta-cognitive competence develop, including the ability to think about, talk about, ask questions about and in some cases also to teach others what the individual has learnt. All of this depends on forms of information processing "discovered" long ago by the mechanisms of biological evolution but not yet understood by scientists and philosophers of mathematics, even though they use the mechanisms. Arguments that languages and forms of reasoning must have evolved initially for internal, "private", use rather than for communication can be found in Talk 111.

The aim of this document is mainly to collect examples to be found during development of young children. Discussions of more complex examples, and requirements for explanatory mechanisms, can be found in other documents on this web site. This is one of many strands in the Meta-Morphogenesis project.

Added 18 Jun 2014:
One problem for this research is that it can't be done by most academic developmental psychologists because the research requires detailed, extended, observation of individuals, not in order to discover regularities in child cognition and development, but in order to discover what sorts of capabilities and changes in capabilities can occur. This is a first step to finding out what sorts of mechanisms can explain how those capabilities and changes are possible (using the methodology in chapter 2 of Sloman (1978), expanded in this document on explaining possibilities). This requires the researchers to have kinds of model-building expertise that are not usually taught in psychology degrees. (There are some exceptions, though often the modelling tools used are not up to the task, e.g. if the tools are designed for numerical modelling and the subject matter requires symbolic modelling.)

This is not regarded as scientific research by a profession many of whose members believe (mistakenly) the Popperian myth that the only reportable scientific results in psychology must be regularities observed across members of a population, and that where perfect regularities don't exist because individuals differ, changes in averages and other statistics should be reported instead.

In part that narrow, unscientific mode of thinking is based on a partial understanding of the emphasis on falsifiability in Karl Popper's philosophy of science, which has done a lot of harm in science education. What is important in Popper's work is the idea that explanatory theories should have consequences that are as precise and general as possible. But they may not be falsifiable for a long time because the theory does not entail regularities in observables, and does not make predictions about all or even some proportion of learners.

Instead it may successfully guide searches for new, previously unnoticed, types of example covered by those possibilities. A later development of the theory could provide suggestions regarding explanatory mechanisms. For such mechanisms it is more important to produce working models demonstrating the potential of the theory than to use the theory to make predictions. Such research sometimes gains more from detailed long term study of individuals, and speculative model building and testing, than from collection of shallow data from large samples.

For more on the scientific importance of theories explaining how something is possible see

http://www.cs.bham.ac.uk/research/projects/cogaff/misc/explaining-possibility.html
Construction kits as explanations of possibilities

Teaching based on a deep theory may make a huge difference to the performance of a small subset of high ability learners even if the theory does not specify how those learners can be identified in advance as a basis for making predictions. Moreover the theory may explain the possibility of a variety of developmental trajectories that can be observed by good researchers when they occur, though the theory may not (yet) give clues as to which individuals will follow which trajectories. Many biological theories have that form, e.g. explaining how some developmental abnormalities can arise without being rich enough to predict which individual cases will arise. In some cases that may be impossible in principle if the abnormalities depend on random chemical or metabolic co-occurrences during development about which little is known.

A theory explaining how sophisticated mathematical competences can develop may make no falsifiable predictions because there are no regularities -- especially with current teaching of mathematics in most schools. (Unless I've been misinformed.)
[27 Sep 2014: To be expanded, including illustrations from linguistic theory.]

One of the bad effects of these fashions is that the only kind of recommendation for educational strategies such a researcher can make to governments and teachers is a recommendation based on evidence about what works for all learners, or, failing that, what works for a substantial majority of learners.

(E.g. a recommendation to teach reading using only the phonic method -- which assumes that the main function of reading is to generate a mapping from text to sounds, building on the prior mapping from sounds to meanings. That recommendation ignores the long term importance of building up direct mappings from text to meanings operating in parallel with the mapping from sounds to meanings, and construction of architectural components not required for reading out loud but important for other activities later on, e.g. inventing stories or hypothetical explanations.)

Another bad effect of the emphasis on discovering and reporting what normally does happen rather than what can happen is to deprive psychology of explanatory theories able to deal with outliers, such as Bach, Mozart, Galileo, Shakespeare, Leibniz, Newton, Einstein, Ramanujan, and others. In contrast, a deep theory about what is possible and how it is possible can account both for what is common and what is uncommon, just as a theory about the grammatical structure of English can explain both common utterances and sentences that are uttered only once, like this one.

A tentative proposal:
The examples of toddler theorem discoveries given below are isolated reports of phenomena noticed by me and various colleagues, along with cases presented in text books, news reports or amateur videos on social media. Perhaps this web site should be augmented with a web site where anyone can post examples, and where development of individual babies, toddlers and children over minutes, hours, days, weeks, months or years can be reported. Something like Citizen Science for developmental psychology? Any offers to set that up?

More examples are presented below.


  1. THIS DOCUMENT
    This document reports cases of observed or conjectured discoveries of toddler theorems by children of various ages. Ideally such a survey should be developed in the context of a theoretical background that might include the following items:
    • A theory of the types of information processing architectures that can exist at different stages of development of intelligent individuals. This might include an abstract architecture schema covering a wide range of possible information-processing architectures and a wide range of possible requirements for developing intelligent animals or machines so that different architectures and different sets of requirements (niches) can be located in that framework. A possible framework, still requiring much work, is summarised in
      http://www.cs.bham.ac.uk/research/projects/cogaff/#overview.

      Some of the components and functions required in animal or robot information processing architectures are crudely depicted and sub-divided in the figure below, where processes and mechanisms at lower levels are generally evolutionarily much older than those at higher levels, and probably develop earlier in each individual, though new ones may be added later through training:

      Figure CogArch


      (Recently revised diagram of CogAff Schema, thanks to Dean Petters.)

      Note: the above diagram simplifies many important features of required architectures, including the "alarm" processing routes and mechanisms described in other CogAff papers (allowing asynchronous interruption or modulation of ongoing processes, e.g. to meet sudden threats, opportunities, etc.) Mechanisms related to use of language are distributed over all the functional subdivisions between columns and layers.

      The architectural ideas are discussed in relation to requirements for virtual machinery here: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/vm-functionalism.html
      Including an older version of the Human Cogaff (h-cogaff) diagram, namely

      An older version: the H-Cogaff architecture
      NOTE: added here 6 Feb 2021

      This version includes "personae": a collection of personalities that can be available to take control of the system, according to context, with various relationships between them, including competition in some pathologies. It does not, however, bring out the overlaps between perception and action indicated in the previous diagram (e.g. sensing a surface texture can involve sliding a finger along it, sensing weight can involve lifting or pushing the object sensed).

      Figure Old H-Cogaff

      The development of the proto-mathematical and mathematical competences listed below makes use of mechanisms, including changing mechanisms, in all the layers and columns of mechanisms depicted in the above diagrams. No diagram, however, can adequately represent the richness and diversity of components and the functionality they add.

      Note: 25 Dec 2017
      After collecting many examples of competences to be explained, especially the competences involved in ancient discoveries in geometry and topology, long before the development of the modern logic-based axiomatic method and use of Cartesian coordinates to represent geometry, I have begun to explore the possibility that a kind of "Super-Turing" information processing mechanism must have been produced by evolution. The ideas will be elaborated in
      http://www.cs.bham.ac.uk/research/projects/cogaff/misc/super-turing-geom.html

    • A theory of types of information processing mechanism available at various stages during the individual's development, and at various stages of evolution. Clearly the initial mechanisms (in a fertilised egg or seed) are purely chemical. In many organisms, though not the majority, nervous systems of various sorts are grown under the control of complex information processing mechanisms that are slowly being unravelled. The nervous systems provide new mechanisms that continue to develop themselves and control more and more biological functions. Although there have been tremendous advances in our knowledge I think there may still be far more to be discovered in the remainder of this century than has already been learned. In particular, as hinted by Turing in his 1950 paper, brains may make far more important use of chemical information processing than has so far been noticed.

    • A schematic theory of iteratively developed, increasingly sophisticated, types of interactions between genome and environment during individual development -- forming increasingly complex domains of competence. A first draft theory of this type is outlined in Chappell & Sloman (2007), which included an earlier version of this diagram crudely summarising interactions between genome and environment:

      FIG EVO-DEVO: The Meta-Configured Genome (MCG)
      [New version of diagram installed here: 12 May 2015]
      [Chris Miall helped with the original version of this diagram.]
      Compare Waddington's "Epigenetic Landscape". Our proposal is that for
      some altricial species developmental processes rebuild or extend the
      landscape at various stages during development, and then choose newly
      available developmental routes after rebuilding, instead of merely
      choosing a trajectory on a fixed epigenetic landscape.
      For later work on the MCG theory (including video) follow this link
      https://www.cs.bham.ac.uk/research/projects/cogaff/movies/meta-config/

      One of the features of a system like this is that if the stages are extended in time, and if the earlier stages include development of abilities to communicate with conspecifics and acquire information from them, then later developments (to the right of the diagram) can be influenced not only by the physical and biological environment, as in other altricial species, but also by a culture.

      As we see on this planet, that can have good effects, such as allowing cultures to acquire more and more knowledge and skill, and bad effects such as allowing religious ideas, cruel practices, superstition, and in some cases "mind-binding" processes that prevent the full use of human developmental potential, as discussed in:
      http://www.cs.bham.ac.uk/research/projects/cogaff/misc/teaching-intelligent-design.html#softwarebug 'Religion as a software bug'

      Note:
      I hope to show later on how the above model of interactions between genome and environment in individual members of advanced species can be modified to produce a partly analogous model of how evolution works within a portion of the physical universe. Both are examples of dynamical systems with creative powers, able to transform themselves not merely by adjusting numerical parameters but by introducing new abstract types of structure and types of causal power, which can later be instantiated in different ways in different contexts. This can be seen as partly analogous to abductive reasoning, in which evidence inspires formulation of a new explanatory hypothesis that is added to previous theories, in some cases with new undefined symbols that "grow" semantic content through deployment of the theory. See
      http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-configured-genome.html



  2. WHAT ARE TODDLER THEOREMS?
    Here are some facts whose significance does not seem to be widely appreciated:
    • Many non-human animal species have cognitive abilities (including perceptual abilities) that require the use of a rich expressive internal language with generative power and compositional semantics.
    • The same is true of pre-verbal children, though not all of the mechanism exists at birth: there is a process of growth of the information-processing architecture driven partly by the environment and partly by the genome.
    • In both cases there is a type of learning that is not included in the standard taxonomies of learning (from observation, from statistical relationships, from experiment, from imitation, from instruction by others), namely a process of learning by working things out which in adult humans most obviously characterises mathematical discoveries, including discovery or creation of
      • new powerful concepts (extending the ontology used)
      • new powerful notations (formalisms)
      • new forms of calculation or reasoning
      • new conjectures
      • new proofs
      • new implications of what was previously known
    • The biological precursors of abilities to do mathematics explicitly are mechanisms that allow animals and very young children to solve practical problems, including novel problems, without going through lengthy processes of trial and error and without having to take risky actions whose possible consequences are unknown -- a capability famously conjectured by Kenneth Craik in The Nature of Explanation (1943).
    • Although Immanuel Kant, Max Wertheimer, Jean Piaget, Konrad Lorenz, Lev Vygotsky, John Holt, and others noticed examples of this kind of ability in human learners, and sometimes in other species, the links with adult mathematical competences are unclear, and as far as I know have not been studied or modelled.
    • I suggest that without these ancient biological capabilities humans could never have made the discoveries that were later organised cooperatively in systems of knowledge, such as Euclid's Elements (whose contents and methods are unfortunately no longer a standard part of the education of bright children -- with dire consequences for many academic disciplines, including psychology and education).
    • I suspect that many of the more basic ancient discoveries, and others that have never been documented, are repeated by young children without anyone noticing. A few years ago I started using the label "toddler theorem" to express this idea, though I don't think the discoveries are restricted to the age-range normally covered by the label "toddler". However, the mechanisms required are probably not all available at birth: evolution discovered the benefits of delaying development of meta-cognition until a substantive collection of information had been acquired (explained in more detail in Chappell and Sloman (2007)).
    • In fact, the discovery processes can continue throughout life and lead to many solutions to practical problems as well as advances in engineering, science and mathematics, though individuals vary in what they can achieve, and the extent to which they use the potential they have (education can be very damaging in this respect).
    • Many of the discovery processes appear to be examples of what Annette Karmiloff-Smith has called "Representational Redescription", summarised in this (incomplete) introduction to her work: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html
    • There are deep implications for philosophy of mathematics, including the problems I addressed in my DPhil thesis (1962), which was an attempt to defend Kant's philosophy of mathematics.
    • I think some aspects of the forms of reasoning used in the discovery of toddler theorems are not yet represented in AI or robotic systems, and it may even be very difficult or impossible to implement some of them on Turing machines and digital computers because they use the interplay between continuous and discrete structures and processes. Readers who have no idea what I am talking about may find it helpful to look at some examples, e.g. a discussion of some of what can be learnt by playing with triangles.
    • The main aim of this web site is to introduce the idea of a "toddler theorem" and present examples. I suspect that with help from observant parents, grandparents, teachers, and animal cognition researchers, the list of examples of "toddler theorems" should grow to include many hundreds of types of example.
    • Since many theorems involve a domain (a class of structures or processes or relationships) I have also included below a brief discussion of the concept of domain (also used by Karmiloff-Smith and many others, though with varying terminology). The ideas are developed at greater length in this discussion document http://www.cs.bham.ac.uk/research/projects/cogaff/misc/bio-math-phil.html and in this presentation to the PT-AI 2013 conference: http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk108
      Also closely related is this presentation on the roles of richly structured internal languages, and why they must have evolved before languages for communication, and why they need to develop in advance of use of language for communication.

  3. NOTE: The word 'toddler' can be interpreted broadly for our purposes:
    3.a. The Toe-ball example
    Pre-toddler theorems? (Added 24 Sep 2014)
    [Image: the toe-ball example]

    For example, this 11-month-old child is not a toddler, as she cannot yet walk, and has only recently learnt to crawl, but she seems to have made a discovery about things that can be supported between upward pointing toes and downward facing palm.

    Whether that's a "theorem" for her depends on whether she was able (using whatever representational resources are available to a pre-verbal human) to reason about the consequences of previously acquired information about affordances so as to predict what would happen in this novel situation, or retrospectively to understand why it happens if it first happened unintentionally.

    Clearly, whatever initiated the process, she continued it intentionally and even seemed to be trying to share what she had discovered with someone not in the picture. The differences between possible cases need further investigation elsewhere. There are also many examples involving actions that produce changes of posture (e.g. from lying on back or belly to sitting upright) and various crawling actions that provide forward or backward motion or change of direction.

    As for why children do such things, I believe the normal assumption that all motivation must be reward based is false, as discussed below in the section on Architecture-Based motivation.

    Another pre-toddler-theorem in the case of this child seems to be that the transition between

    -- crawling forward with legs stretched backward (position a, below), and
    -- sitting on the ground with legs projecting forward and
       facing roughly in the original position (position d)
    can be achieved by temporarily extending legs sideways, aligned as in a hinge joint, as illustrated by positions (b) and (c) in the sequence below. She also uses the same intermediate state for the reverse transition. (The much more common strategy involves rolling over on one side before or after changing direction.)

    [Images: positions (a), (b), (c) and (d)]

    I would be grateful for information about any other infants who
    use this or a related method for doing the 90 degree rotation of
    torso and roughly 180 degree rotation of legs.

    3.b. A crawler's door-closing theorem
    (Added 1 Jul 2017) This is based on an episode recollected from over a decade ago, when a baby and his parents were visiting us. At one point he crawled from the front hall into an adjoining room, indicating that he wanted me to follow him (e.g. stopping, waiting and looking round at me if I paused while following him). After he had crawled through the door and waited for me to follow him, he wanted the door shut. (I have no idea why; perhaps he had no reason.) He managed to push it shut with his feet, after crawling to an appropriate location, rolling over onto his back, swinging his legs back round the door, then pushing it shut.

    That action can be thought of as a proof (by construction) of the theorem that it is possible to shut a door with your feet after crawling through the doorway.

    How was the intention to do all that represented in his brain (or mind) long before he could say anything in words?

    Fig: Crawler
    Crawler works out how to shut door after crawling through it
    [Image 1] Crawls through open door facing into room.
    [Image 2] Rolls over onto back to push door shut with feet.

    At the time, I did not think of asking his parents whether he had been taught to do that, or had regularly been doing it at home. In either case he seemed to understand what he was doing, and was able to manoeuvre into the right position, to get the door shut the first time he tried in our house, which had a very different layout from his home.

    What kind of representation of spatial structures, relationships, and possibilities for change could a child's brain use (a) in forming the intention to perform such an action, and (b) in actually doing it? I suspect the answer will refer to precursors to the mechanisms that enabled ancient mathematicians to make profound mathematical discoveries.
    See also: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/impossible.html


  4. BACKGROUND
    • Philosophy of Mathematics, AI, Representational Redescription and Toddler Theorems
      There are problems about human spatial reasoning abilities and other non-logical reasoning abilities that I started thinking about when working on my DPhil in Philosophy of Mathematics (Oxford, 1962):
      "Knowing and Understanding:
          Relations between
              meaning and truth,
              meaning and necessary truth,
              meaning and synthetic necessary truth"

      http://www.cs.bham.ac.uk/research/projects/cogaff/sloman-1962
      (Digitised version installed 2016).

      This argued (e.g. against Hume) that Immanuel Kant was right in claiming in 1781 that in addition to

      1. true empirical propositions that in principle could be refuted in experiments and observations with novel conditions and
      2. analytic, essentially trivial, truths that depend only on definitions and their logical consequences, and whose discovery does not extend factual knowledge, apart from knowledge of logical consequences of collections of definitions, including unobvious consequences,

      there are also truths that are neither empirical nor trivial but provide substantial knowledge, namely synthetic, necessary, truths of mathematics, whose discovery requires non-empirical reasoning capabilities.

      Some of the concepts used here are explained in this summary of parts of my DPhil thesis:
      "'NECESSARY', 'A PRIORI' AND 'ANALYTIC'" (1965)
      http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#1965-02

      Two more papers based on the thesis work were published in 1965 and 1969:
      http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#rog
           Functions and Rogators (1965)
      http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#1968-01
           Explaining Logical Necessity (1968-9)

      Around 1970 Max Clowes introduced me to Artificial Intelligence, especially AI work on Machine vision. That convinced me that a good way to make progress on my problems might be to build a baby robot that could, after some initial learning about the world and what can happen in it, notice the sorts of possibilities and necessities (constraints on possibilities) that characterise mathematical discoveries. My first ever AI conference paper distinguishing "Fregean" from "Analogical" forms of representation was a start on that project, followed up in my 1978 book, especially Chapters 7 and 8.

      From about 1973, I was increasingly involved in AI teaching and research and also had research council funding for a project on machine vision, some results of which are summarised in chapter 9 of CRP (my 1978 book). Later work (teaching and research) led me in several directions linking AI, Philosophy, language, forms of representation, architectures, relations between affect and cognition, vision, and robotics. Progress on the project of implementing a baby mathematician was very slow, mainly because the various problems (especially about forms of representation) turned out to be much harder than I had anticipated. Moreover, I did not find anyone else interested in the project.

      In 2008 Mary Leng jolted me back into thinking about mathematics by inviting me to give a talk in a series on mathematics at Liverpool University. In that talk and in a collection of subsequent papers and presentations I tried to collect examples and arguments about how various aspects of mathematical competence could be seen to arise out of requirements for interacting with a complex, structured, changeable environment. I did not find anyone else who shared this interest, perhaps because the people I met had not spent five years between the ages of five and ten playing with meccano? http://www.cs.bham.ac.uk/research/projects/cosy/photos/crane/

    • WHAT IS A DOMAIN?
      The meta-domain of meta-domains ... of domains
      Note added 4 Mar 2015
      I've recently added a discussion of "construction kits" produced by and used by evolution and development, including concrete construction kits, abstract construction kits and mixed construction kits. Some sorts of domain will be related to (or generated by) a particular sort of construction kit (which itself may be a mixture of simpler construction kits). For more on construction kits see:
      http://www.cs.bham.ac.uk/research/projects/cogaff/misc/construction-kits.html
      (Domains are sometimes called "micro-worlds")

      Added 23 Aug 2012:
      Although I started this web page in October 2011, I have been working on many of these themes for many years using different terminology. E.g. some of the ideas about numbers go back to chapter 8 of my 1978 book, but that builds on my 1962 Oxford DPhil Thesis (attempting to defend Kant's philosophy of mathematics -- before I knew anything about computers or AI).

      After discovering the deep overlap with ideas Annette Karmiloff-Smith (AK-S) had developed, especially in her 1992 book, which I have begun to discuss in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html I thought it might be helpful to use her label "domain", instead of the collection of labels I have been playing with over several decades (some of which have been widely used in AI, others in mathematics, software engineering, etc. -- the ideas are deep and pervasive).

      I can't now remember all the labels I have used, but the following can be found in some of my papers, talks, and web pages, with and without the hyphens:

          'micro-world'
          'mini-world'
          'micro-domain'
          'micro-theory'
          'theory'
          'framework'
          'framework-theory'
      

      What is a domain?
      I don't think there is any clear and simple answer to that question. But this document presents several examples that differ widely in character, making it clear that domains come in different shapes and sizes, with different levels of abstraction, different kinds of complexity, different uses -- both in controlling visible behaviour and in various internal cognitive functions --, different challenges for a learner, different ways of being combined with other domains to form new domains, and conversely, different ways of being divided into sub-domains, etc.

      We might try to compare different sub-fields of academic knowledge to come up with an analysis of the concept of domain, but there are many overlaps and many differences between such domains as philosophy, logic, mathematics, physics, chemistry, biology, biochemistry, zoology, botany, psychology, developmental psychology, gerontology, linguistics, history, social geography, political geography, geography, meteorology, astronomy, astrophysics, ....

      Moreover, within dynamic disciplines new domains or sub-domains often grow, or are discovered or created, some of them found to have pre-existed waiting to be noticed by researchers (e.g. planetary motions, Newtonian mechanics, chemistry, topology, the theory of recursive functions) while others are creations of individual thinkers or groups of thinkers, for example, art forms, professions (carpentry, weaving, knitting, dentistry, physiotherapy, psychotherapy, architecture, various kinds of business management, divorce law in a particular country, Jewish theology, and many more). However, that distinction, between pre-existing and human-created domains, is controversial and has fuzzy boundaries.

      Philosophers' concepts of "natural kinds" are attempts to make some sort of sense of this, in my view largely unsatisfactory, in part because many of the examples are products of biological evolution, and some are products of those products. I suspect the idea of "naturalness" in this context is a red herring, since the distinction between what is created and what was waiting to be discovered is unclear and there are hybrids.

      The distinction between "logical geography" (Gilbert Ryle) and "logical topography" (me) is also relevant. It is explained in http://tinyurl.com/BhamCog/misc/logical-geography.html

      A particularly rich field of human endeavour in which hierarchies of domains are important is software engineering, and the discovery of this fact has led to the creation of various kinds of programming languages for specifying either individual domains or families of domains. For example, so-called "Object Oriented Programming" introduced notions of classes, sub-classes, instances, and associated methods (class-specific algorithms) and inheritance mechanisms. More sophisticated OOP languages allowed multiple inheritance and generic functions (methods that are applicable to collections of things of different types and behave in ways that depend on what those types are).

          http://tinyurl.com/PopLog/teach/oop
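
      The OOP notions mentioned above (classes, sub-classes, inheritance, multiple inheritance and generic functions) can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not taken from the Poplog tutorial linked above. The class names (Part, Rigid, Hinged, HingedPlate) and the function can_rotate are hypothetical. Note also that Python's functools.singledispatch selects a method from the type of the first argument only, whereas generic functions in some OOP languages can depend on the types of several arguments.

```python
from functools import singledispatch


class Part:
    """Root class of a toy 'construction kit' domain (hypothetical example)."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name}: a part"


class Rigid(Part):
    """Sub-class that overrides the inherited describe method."""

    def describe(self):
        return f"{self.name}: a rigid part"


class Hinged(Part):
    def describe(self):
        return f"{self.name}: a hinged part"


class HingedPlate(Rigid, Hinged):
    """Multiple inheritance: method lookup follows Python's resolution order,
    so describe is inherited from Rigid, the first listed parent."""


@singledispatch
def can_rotate(part):
    # Default method, used when no more specific method is registered.
    return False


@can_rotate.register
def _(part: Hinged):
    # Method chosen for Hinged instances and instances of its sub-classes.
    return True
```

      With these definitions can_rotate(HingedPlate("door")) uses the method registered for Hinged, while can_rotate(Rigid("strut")) falls back to the default: the behaviour depends on the type of the thing the function is applied to.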
      
      Note added 4 Mar 2015
      Using the notion of construction kit presented in
      http://www.cs.bham.ac.uk/research/projects/cogaff/misc/construction-kits.html
      we can say that many domains are "generated" or "defined" by a particular type of construction kit (which may be composed of simpler construction kits). We need a more thorough survey and analysis of cases.

      More generally we can say that a domain involves relationships that can hold between types of thing, and instances of those types can have various properties and can be combined in various ways to produce new things whose properties, relationships, competences and behaviours, depend on what they are composed of and how they are combined, and sometimes the context. Often mathematicians specify such domain-types without knowing (or caring) whether instances of those types actually existed in advance (e.g. David Hilbert's infinite dimensional vector spaces?). Additional domains are summarised below.

      Formation of a new instance of a type in a domain can include assembling pre-existing instances to create larger items (e.g. joining words, sentences, lego bricks, meccano parts, dance steps, building materials, mathematical derivations), or can include inserting new entities within an existing structure, or changing properties, or altering relationships. E.g. loosening a screw in a meccano crane can sometimes introduce a new rotational degree of freedom for a part.

      Some domains allow continuous change, e.g. growth, linear motion, rotation, bending, twisting, moving closer, altering an angle, increasing or decreasing overlap, changing alignment, getting louder, changing timbre, changing colour, and many more (e.g. try watching clouds, fast running rivers, kittens playing, ...). Some allow only discrete changes, e.g. construction of logical or algebraic formulae, or formal derivations, operations in a digital computer, operations in most computational virtual machines (e.g. a Java or Lisp virtual machine), some social relations (e.g. being married to, being a client of), etc.

      The world of a human child presents a huge variety of very different sorts of domains to be explored, created, modified, disassembled, recombined, and used in many practical applications. This is also true of many other animals. Some species come with a fixed, genetically determined, collection of domain related competences, while others have fixed frameworks that can be instantiated differently by individuals, according to what sorts of instances are in the environment, whereas humans and others (often called "altricial" species) have mechanisms for extending their frameworks as a result of what they encounter in their individual lives -- examples being learning and inventing languages, games, art forms, branches of mathematics, types of shelter, and many more. This diversity of content, and the diversity of mixtures of interacting genetic, developmental and learning mechanisms was discussed in more detail in two papers written with Jackie Chappell, one published in 2005 and an elaborated version in 2007. There are complicated relationships with the ideas of AK-S, which still need to be sorted out.

      Tarskian model theory (http://plato.stanford.edu/entries/model-theory/) is also relevant. Several computer scientists have developed theories about theories that should be relevant to clarifying some of these issues, e.g. Goguen, Burstall and others (for example, see http://en.wikipedia.org/wiki/Institution_(computer_science) ).

      At some future time I need to investigate the relationships. However, I don't know whether they include domains that allow (continuous representations of) continuous changes, essential in Euclidean geometry, Newtonian mechanics, and some aspects of biology.

      I don't know if anyone has good theories about discovery, creation, combination, and uses of domains in more or less intelligent agents, including a distinction between having behavioural competence within a domain, having a generative grasp of the domain, and having meta-cognitive knowledge about that competence. These distinctions are important in the work of AK-S, though she doesn't always use the same terminology.

      The rest of this discussion note presents a scruffy collection of examples of domains relevant to what human toddlers (and some other animals and older humans) are capable of learning and doing in various sorts of domains whose instances they interact with, either physically or intellectually. The section on Learning about numbers (Numerosity, cardinality, order, etc.) includes examples of interconnected domains, though not all the relationships are spelled out here.

      Theorems about domains are of many kinds. Often they are about invariants of a set of possible configurations or processes within a domain (e.g. "the motion at the far end of a lever is always smaller than the motion at the near end if the pivot is nearer the far end", "moving towards an open doorway increases what is visible through the doorway, and moving away decreases what is visible"). (See the section on epistemic affordances, below.)
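
      The lever invariant quoted above can be illustrated with a small sketch, assuming an idealised rigid lever rotating about a fixed pivot, so that each end sweeps an arc whose length is its distance from the pivot multiplied by the angle turned. The function names and parameters are hypothetical. Sampling cases like this only provides empirical evidence. The theorem itself follows from the fact that the ratio of the two motions equals the ratio of the two distances from the pivot, whatever the angle, which is the kind of necessity a learner eventually has to grasp.

```python
def end_motions(length, pivot_from_near, angle_rad):
    """Arc lengths swept by the near and far ends of a rigid lever.

    Assumes an idealised rigid lever of the given length, rotating by
    angle_rad about a fixed pivot located pivot_from_near from the near end.
    """
    if not 0 < pivot_from_near < length:
        raise ValueError("pivot must lie strictly between the two ends")
    near = pivot_from_near * abs(angle_rad)
    far = (length - pivot_from_near) * abs(angle_rad)
    return near, far


def far_end_moves_less(length, pivot_from_near, angle_rad):
    """True if the far end moves a shorter distance than the near end."""
    near, far = end_motions(length, pivot_from_near, angle_rad)
    return far < near
```

      For example, end_motions(2.0, 1.5, 1.0) gives a near-end motion of 1.5 and a far-end motion of 0.5: the pivot is nearer the far end, so the far end moves less.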

      We need a more developed theory about the types of theorems available to toddlers and others to discover, when exploring various kinds of environment, and about the information-processing mechanisms that produce what AK-S calls "representational redescription" allowing the theorems to be discovered and deployed. (I think architectural changes are needed in many cases.)



  5. BASICS OF THE THEORY
    Core ideas (no claims are made here about novelty):

    • Transitions in information-processing
      There are many transitions in living systems, both continuous and discrete, on various scales: within organisms, within a species, within ecosystems, within societies, or sub-cultures, etc. The obvious transitions include physical morphology and observable behaviours.

      There are also transitions in information-processing capabilities and mechanisms that are much harder to detect, though their consequences may include observable behaviours.

      A draft (incomplete, messy and growing) list of transitions in biological information processing is here.

      The transitions producing new capabilities and mechanisms are examples of a generalised concept of morphogenesis, originally restricted to transitions producing physical structures and properties.

      Among the transitions are changes in the mechanisms for producing morphogenesis. These are examples of meta-morphogenesis (MM). The examples of information processing competence described here may occur at various stages during the lives of individuals. The mechanisms that produce new ways of acquiring or extending competences are mechanisms of meta-morphogenesis, about which little is known. Piaget identified many of the transitions in children he observed, and thought that qualitative changes in competence-producing competences were global, occurring in succession, at different ages, during the development of a child. Karmiloff-Smith, in Beyond Modularity, suggests that transitions between stages may occur within different domains of competence, and will often be more a function of the nature of the domain than the age of the child, though she allowed that there are also some age-related changes. See http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html

      I have no idea what Karmiloff-Smith would think of my proposal to extend this idea to regarding biological evolution (i.e. natural selection) as (unwittingly) making discoveries about domains of mathematical structures then transforming those discoveries in various ways, as outlined in a separate document on the nature of mathematics and the relevance of mathematical domains to evolution and in a presentation to the PT-AI 2013 conference: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/bio-math-phil.html http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk108

      Mary Leng has made claims related to my topic, but disagreeing with mine, as reported in this book review: http://www.ams.org/notices/201305/rnoti-p592.pdf

      [Image: Evol to Betty]

      Transitions occur across species, within a species, within an individual, concurrently in different species, and in some cases in eco-systems or sub-systems involving more than one species.

      [Image: blocks]

      A draft (growing) list of significant transitions in types of information-processing in organisms is here: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/evolution-info-transitions.html

    • It can be very hard to detect or characterise changes in INFORMATION PROCESSING CAPABILITIES, e.g. functions, mechanisms, forms of representation, architectures, ontologies used, ....

      People who have not designed, tested or debugged working systems may lack the concepts and theories required.

    • Turing's idea (large structures from small) -- illustrated in several areas:
      • If the right kinds of small pieces are put together in the right kinds of ways
        --- then qualitatively new structures and behaviours can emerge from their interactions.
        --- e.g. micro manipulations add up to proofs of mathematical theorems
      • The meta-morphogenesis project attempts to apply that idea to varieties of information processing.
      • Turing's most famous work focused on intrinsic information processing: E.g. operations in a Turing machine not connected to anything else
      • To study biological information processing we need to think about connections with an environment

    • Exploration-based learning
      Children and other animals do a lot of empirical exploration of their environment. The kind of exploration depends on the species, is very much influenced by what's in the environment (e.g. including clothing and toys), and also changes with age and cognitive sophistication. It may also be partly influenced by the individual's genetic endowment.

      Exploration here does not necessarily refer to geographical exploration. It can include investigating the space of possible actions on some object or type of object, e.g. things that can be done with sand, with water, with wooden blocks, with string, with paper, with diagrams, etc. [See Sauvy and Sauvy(1974).]

  6. Architecture-Based motivation
    Many researchers, including many (or most?) robotics researchers, believe that it is impossible to have a motive, to want to do or achieve, or prevent, or preserve something in the environment, or in thought, unless achieving that motive produces another effect which is providing a reward, which is usually a scalar quantity so that it can vary in one dimension, with the effect of increasing or reducing the probability that some preceding action will be repeated in similar circumstances. It is normally assumed that without some expected reward an animal or intelligent machine cannot possibly want to do something. (This is also an old debate in philosophy, e.g. see G.E.M. Anscombe, Intention, 1957.)

    I (and probably others using different terminology) have proposed that although rewards of many kinds (including non-scalar rewards) can be important, there are also non-reward-based forms of motivation, without which a great deal of the learning done by young children (and other animals) would be impossible. That's because the learner is required to select things to do without being in a position to have any knowledge about the possible outcomes. So natural selection has somehow provided motivation triggers that are directly activated by perceived states of affairs or processes, or in some cases thoughts, to create motives, which then may or may not produce behaviours, depending on which other motives are currently active, and other factors. Such a mechanism can produce forms of exploration-based learning that would otherwise not occur. I call that "architecture-based motivation" in contrast with reward-based motivation, as explained in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/architecture-based-motivation.html

    The diagram illustrates, schematically, a very simple architecture with motives triggered by what is perceived, but with no computation of, or comparison of, rewards, or expected utility.

    [Diagram: motivation]

    In particular, the individual may be unaware of what is being done or why it is being done.

    I am not saying that that's a model of human or animal motive-generation, but that something with those features could usefully be an important part of a motive-generation mechanism if the genetically determined motive-generating reflexes are selected (by evolution) for their later usefulness in ways that the individual cannot understand. This idea was independently developed and tested in a working computer model, reported by Emre Ugur (2010).
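
    A toy program can make the contrast with reward-based motivation concrete. The following is a minimal sketch only, not the model reported by Ugur and not a proposed model of human or animal motive generation. All names (MotiveGenerator, generate_motives, select_motive) and the example percept features are hypothetical, and the fixed priority ordering is just one assumed way in which "other motives currently active" might affect what is done. The point is that motives are created directly by what is perceived, and nothing in the code computes or compares rewards or expected utilities.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class MotiveGenerator:
    """A 'reflex' that creates a motive when its trigger matches a percept."""
    trigger: Callable[[Dict[str, object]], bool]
    motive: str


def generate_motives(percept, generators):
    """Return the motives triggered by the percept; no rewards are computed."""
    return [g.motive for g in generators if g.trigger(percept)]


def select_motive(active, priority):
    """Pick one motive to act on using a fixed ordering (an assumption),
    not a comparison of expected utilities."""
    for motive in priority:
        if motive in active:
            return motive
    return None


# Hypothetical triggers and motives for a crawling infant.
GENERATORS = [
    MotiveGenerator(lambda p: p.get("graspable", False), "grasp it"),
    MotiveGenerator(lambda p: p.get("novel", False), "look at it"),
    MotiveGenerator(lambda p: p.get("door_open", False), "push door"),
]
PRIORITY = ["look at it", "grasp it", "push door"]
```

    Given the percept {"graspable": True, "novel": True}, two motives are generated and "look at it" is selected. The agent has no representation of why either motive is worth having.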

  7. More on domains (introduced above) in learning
    In doing that exploration, individuals somehow divide up the world into (nested and overlapping) "domains" or "micro-domains", each containing some collection of (relatively) simple entities, properties, relationships, and more complex structures formed from such entities, and also simple processes in which objects change properties and relationships, along with more complex processes created by combining simpler processes, so that new structures are built, old structures disassembled, or multiple relationships changed in parallel. "Multi-strand processes" involve parallel changes in "multi-strand relationships".

    As an individual's competence grows the amount of stored information about each domain grows, extending the variety and complexity of situations they can cope with (e.g. predicting what will happen, deciding what to do to achieve a goal, understanding why something happens, preventing unwanted side-effects, reducing the difficulty of the task, etc.)

  8. On-line vs Off-line intelligence
    Many animals can learn to manipulate objects, using on-line intelligence. A dog can learn to catch a thrown ball, a dolphin can learn to balance a ball on its nose, and many birds seem to be able to learn to build nests (e.g. a young male bowerbird tries copying nests built by an older male). The performance of such tasks uses "on-line intelligence" controlling actions either ballistically or using visual or proprioceptive or haptic servo-control. There are now many AI/Robotics research labs in which robots learn through repeated attempts with some sort of feedback from successes and failures to shape their behaviours to fit the requirements of a behavioural task. This work usually assumes that states of the system, perceptual contents, actions, goal states, and in some cases rewards, can all be expressed as numbers or collections of numbers, as opposed, for example, to descriptions of relationships, e.g. "keeping the baby within my field of view" or "preventing the dog's lead wrapping round my legs".

    Within this framework of behaviour-centred learning much interesting research has been done, and there have been many impressive advances that generalise what can be learnt or speed up what can be learnt, or make what has been learnt more robust.

    But I want to raise the question whether this kind of research sheds much light on human intelligence or the intelligence of many other animals with which we can interact, or helps much with the long term practical goals of AI or explanatory goals of AI as the new science of mind. The main problem is that all this on-line intelligence leaves out what can be called "off-line" intelligence, which involves a host of ways of doing something about possible actions other than performing the actions, for example thinking about "what would have happened if...." or explaining why something happened, or why something was not done, or teaching someone else to perform a task, or changing the environment so as to make an action easier, or safer, or more reliable. These abilities seem to be closely related to the abilities of humans to do mathematics, including for example discovering theorems and proofs in Euclidean geometry, which our ancestors must have done originally without any teachers, and without using the translation of geometry into arithmetic that is now required for geometrical theorems to be proved by computer (in most cases).

    Members of a subset of species, including young children and apparently some corvids, seem to have the additional ability to think about and reason about actions that are possible but are not currently being performed. This can sometimes lead to the ability to reflect on what went wrong, and how faulty performance might be improved, or failure produced deliberately, and in some cases the ability to understand successes and failures of others, which can be important for teachers or trainers. For example, a mother (or 'aunt'?) elephant seeing a baby elephant struggling unsuccessfully to climb up the wall of a mud bath may realise that scraping some of the mud away in front of the baby will make an easier ramp for the baby to walk up, apparently using "counterfactual" reasoning, as required for a designer or planner. A monkey or ape may be able to work out that if a bush is between him and the alpha male when he approaches a female his action will not be detected.

    For example, a child who has learnt to catch a fairly large ball may be able to think about what will happen if she does not open out her palms or fingers before the ball makes contact with her. And she may also be able to think about what will happen if she does not bring her fingers together immediately after the ball makes contact with her two open palms.

    This uses "off-line" intelligence. More is said about this distinction in Sloman 1982, Sloman 1989, Sloman 1996, Sloman 2006, Sloman 2011.

    The differences between on-line and off-line intelligence are sometimes misconstrued, leading to poor theories of the functions of vision -- e.g. the theory that different neural streams are used for "where" vs "what" processing, and the theory of "mirror neurons", neither of which will be discussed further here. For more detail see (Sloman 1982) and the related papers below.

    On-line and off-line intelligence are sometimes combined, e.g. when possible future contingencies are being considered during the performance of an action, or a partly successful action is not interrupted, but while it is continued the agent may be reflecting on what had previously gone wrong and how to prevent it in future.

    Many complex actions, such as nest building, hunting intelligent prey, climbing a tree, eating a prickly pear while avoiding thorns (See Richard Byrne) or constructing a shelter or house require a mixture of on-line and off-line intelligence, often in parallel or alternating performances.

    See also the comments about Karen Adolph's work on learning in infants and toddlers below.

  9. Transformation from learnt reusable patterns to a "proto-deductive" system, possibly including "Toddler Theorems".
    For some domains, after the information acquired (by animal, child, or adult exploring a new domain, or possible future robot) has reached a certain kind of complexity, powerful cognitive mechanisms somehow transform that information into a more systematic form so that there is a core of knowledge from which everything else learned about the domain can be derived, along with a great deal more -- so that the learner is then able to cope with novel situations. This requires something like the replacement of a collection of exemplars or re-usable patterns with a proto-deductive system. This term is not intended to imply that logic and logical deduction are used.

    The main consequence is that the learner can now work out things that previously had to be learnt empirically, or picked up from teachers, etc. This means that the realm of competence is enormously expanded.

    This requires the use of information structures of variable complexity composed of components that can be re-used in novel structures with (context-sensitive) compositional semantics -- one reason why internal languages had to evolve before languages used for communication.

    N.B. This is totally different from building something like a Bayes Net storing learnt correlations and allowing probability inferences to be made.
    Bayesian inference produces probabilities for various already known possibilities. What I am talking about allows new possibilities and impossibilities to be derived, but often without any associated probability information: if a polygon has three sides then its angles must add up to half a rotation.

    Compare using a grammar to prove that certain sentences are possible and others impossible. That provides no probabilistic information. In fact a very high proportion of linguistic utterances had zero or close to zero probability before they were produced. But that does not prevent them being constructed if needed, or understood if constructed.
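
    The grammar comparison can be made concrete with a minimal sketch. The tiny grammar, its vocabulary and the function names (derives, is_possible) are hypothetical and chosen only for illustration. The recogniser decides which word sequences are possible and which are impossible according to the grammar, and no probabilities appear anywhere in it.

```python
from functools import lru_cache

# A toy grammar: each non-terminal maps to its alternative expansions.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V", "NP"), ("V",)],
    "Det": [("the",), ("a",)],
    "N": [("child",), ("door",), ("ball",)],
    "V": [("shuts",), ("catches",), ("sleeps",)],
}


@lru_cache(maxsize=None)
def derives(symbols, words):
    """True if the tuple of symbols can be rewritten into exactly the words."""
    if not symbols:
        return not words
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:
        # A terminal symbol must match the next word exactly.
        return bool(words) and words[0] == first and derives(rest, words[1:])
    # A non-terminal: the sequence is derivable if any expansion works.
    return any(derives(rule + rest, words) for rule in GRAMMAR[first])


def is_possible(sentence):
    """True if the grammar allows the sentence; no probability is involved."""
    return derives(("S",), tuple(sentence.split()))
```

    Here is_possible("the child shuts the door") is true and is_possible("door the shuts child") is false, regardless of how often, if ever, either sequence has been encountered.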

    The same can be said about possible physical structures and processes. Before the first bicycle was constructed by a clever designer, the probability of it being constructed was approximately zero.

  10. A conjecture about (some) toddler theorems
    (An idea still to be fleshed out.)

    In the case of logical reasoning it is possible to make discoveries about which classes of inference are valid by starting from examples, then generalising, then discovering (in ways that are not yet clear) that the generalisation cannot have counter-examples (e.g. by reasoning about "typical" instances that have all the relevant features).

    For non-logical reasoning, e.g. reasoning about transformations of a set of topological or geometric relationships, similar processes of reasoning without performing physical actions can provide new knowledge about possibilities and necessities.

    Kenneth Craik, Philip Johnson-Laird and others have suggested that internal models can be used for making predictions about possible actions (http://en.wikipedia.org/wiki/Mental_model). However, most of them fail to notice the difference between being able to work out "what will happen if X occurs" and being able to reason about what is and is not impossible, or what else will necessarily occur if X occurs.

    Examples of discovering what is impossible are discussed in
    http://www.cs.bham.ac.uk/research/projects/cogaff/misc/impossible.html

    • The learner discovers various ways of characterising structures and processes.

    • Processes that alter a structure, or which modify a process (e.g. initiating, or terminating, or speeding up or slowing down, or changing direction of, some motion or rotation) can also be represented though that may require a more sophisticated and abstract form of representation.

    • For purposes of performing similar actions in different contexts, schematic versions of the actions may be useful: e.g. if two opposed flat surfaces with an object between them move together, then their continued motion will be interrupted before they are in contact. This abstraction might be expressed in a form of representation used to control grasping in a wide variety of situations. See http://www.cs.bham.ac.uk/research/projects/cogaff/misc/grasping-grasping.html

    • Later, some learners discover that it is possible to select and evaluate plans for sequences of actions by combining such abstract representations, omitting the actual parameters required to instantiate the actions.

    • This allows reasoning about future actions to be performed in the abstract, the result being a plan that can be executed by inserting the parameters.

    • Alternatively a composite action may be performed, and because it was successful it may be recorded as a schematic composite action (a re-usable plan) with some of the details replaced by "gaps" to be filled whenever the plan is used. (This idea is an old one in the symbolic planning community -- e.g. STRIPS, ABSTRIPS, etc.)

    • Later the learner can discover that in addition to running an abstract plan by filling its gaps (instantiating its variables), the learner can run the plan schematically in different contexts and discover interactions: e.g. you can have all the conditions for grasping something yet the attempt to grasp fails because there is some additional object between the grasping surfaces that is larger than the object to be grasped. This can be discovered without actually performing the operation in a physical situation -- merely "running" a schematic simulation. It does not need to have any specific parameters for the sizes and distances. This is not to be confused with performing an inference using probabilities.

    The key idea is that under some conditions it is possible to discover that properties of a schematic structure or schematic process are invariant -- i.e. the properties do not depend on the precise instantiation of the abstraction, though sometimes it is necessary to add previously unnoticed conditions (e.g. no larger object is between the grasping surfaces) for a generalisation to be true.
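    The grasping example can be sketched as a schematic simulation that uses no numerical sizes or distances, only symbolic "wider than" relations between named objects. Everything in the sketch (the function names `wider_than` and `schematic_grasp`, the object names, and the three-valued result) is a hypothetical illustration under the simplifying assumption that two closing surfaces stop at the widest object between them.

```python
def wider_than(order, a, b):
    """True if 'a is wider than b' follows by transitivity from the known
    strict orderings. `order` is a set of (wider, narrower) pairs."""
    frontier, seen = [a], set()
    while frontier:
        current = frontier.pop()
        for big, small in order:
            if big == current and small not in seen:
                if small == b:
                    return True
                seen.add(small)
                frontier.append(small)
    return False


def schematic_grasp(target, between, order):
    """Run the grasp schematically: the surfaces close until they meet the
    widest object between them. No sizes are needed, only orderings.

    Returns "blocked" if some other object is known to be wider than the
    target, "grasped" if the target is known to be wider than every other
    object, and "undetermined" if the orderings do not settle the matter."""
    others = [obj for obj in between if obj != target]
    if any(wider_than(order, obj, target) for obj in others):
        return "blocked"
    if all(wider_than(order, target, obj) for obj in others):
        return "grasped"
    return "undetermined"


alone = schematic_grasp("cup", ["cup"], set())
blocked = schematic_grasp("cup", ["cup", "box"], {("box", "cup")})
chained = schematic_grasp(
    "cup", ["cup", "box"], {("box", "ball"), ("ball", "cup")}
)
unknown = schematic_grasp("cup", ["cup", "box"], set())
```

    The "blocked" result holds for every possible assignment of actual sizes consistent with the orderings, which is the sense in which the conclusion is invariant across instantiations. The "undetermined" case corresponds to a previously unnoticed condition that has to be added before the generalisation is true.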

    This idea will have to be fleshed out very differently for different domains of structures and processes, or for different sub-domains of rich domains -- e.g. Euclidean geometry, operations on the natural numbers. (See examples about counting below.)

    The kinds of discoveries discussed here are not empirical discoveries, but that does not mean that the reasoning processes are infallible. The history of mathematics (e.g. the work of Lakatos below) shows that even brilliant mathematicians can fail to notice special cases, or implicit assumptions. Nevertheless I think these ideas if fleshed out would support Kant's ideas about the nature of mathematical discoveries, as discoveries of synthetic necessary truths. (As far as I know, he did not notice that the discovery processes could be fallible.)

    The ideas in this section are elaborations of some of the ideas in Chappell and Sloman (2007).
___________________________________________________________________________________

  11. Alternative forms of representation
    I have argued in the past that there are alternative forms of representation that can be used for reasoning, and modelling causal interactions.

  12. The kind of proto-deductive system a human toddler can produce -- or a squirrel, or orangutan or a nest-building bird -- seems unlikely to use the kinds of deduction logicians understand well, based on propositional and predicate calculus, so a major research problem is to investigate alternative forms of representation. Jackie Chappell and I have presented some draft ideas about requirements for those alternative forms of representation, used for perception, for planning, for plan-execution, for making predictions, for enabling internal explanations (e.g. how something happens, how something works).

    This is deeply connected with a Kantian theory of causation. See our 2007 'WONAC' presentations http://www.cs.bham.ac.uk/research/projects/cogaff/talks/wonac/.

    [Added 27 Oct 2011]
    It is also connected with our discussion of "internal" precursors to the use of language for communication -- in pre-verbal humans, in pre-human ancestors and in other species. E.g. see Sloman Talk52 on Evolution of minds and languages.

  13. Not much is currently known about the mechanisms that acquire and use the information initially, or how the transformations occur, or what the new forms of representation are, nor whether changes of architecture are also required. However in the case of language learning it is known that the transformation to a proto-deductive system (using a grammar/syntax) produces errors because natural languages (unlike Euclidean geometry, Newtonian mechanics, etc.) have many exceptions. Dealing with the exceptions obviously requires a further architectural change, which is a non-trivial process.

    If we treat language learning as a special case of something more general, found also in pre-verbal children and in other species that can see, think, plan, predict, and control their actions sensibly, that may give us new clues as to the nature of language learning.
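    A familiar textbook illustration of such errors, not taken from the text above, is English past-tense formation: once a general rule replaces memorised forms, irregular verbs are over-regularised until exceptions are handled separately. The sketch below is a deliberately crude assumption-laden toy. The exception table and function names are hypothetical, and it says nothing about the architectural change actually required.

```python
# Hypothetical exception table; real English has many more irregular verbs.
IRREGULAR_PAST = {"go": "went", "eat": "ate", "see": "saw"}


def past_by_rule(verb):
    """General rule applied everywhere, so it over-generalises."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_with_exceptions(verb):
    """Exceptions are consulted first; the general rule is the fallback."""
    return IRREGULAR_PAST.get(verb, past_by_rule(verb))


overgeneralised = past_by_rule("go")         # rule-only system
corrected = past_with_exceptions("go")       # rule plus exception store
regular = past_with_exceptions("walk")       # rule still covers regular verbs
```

    The rule-only version produces forms that were never heard, which is the mark of a generative system rather than a store of learnt examples.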

  14. A more detailed analysis than I can present here would subdivide the learning and developmental processes into far more distinct categories, concerned with different domains of information, including:

    • Spatial structures and processes perceived in the environment;
    • Spatial structures and processes created, changed, or manipulated by the perceiver;
    • Different collections of properties and relationships, including metrical, semi-metrical and topological ones, which are involved in different domains;
    • Some kinds of processes involve not just physical changes, but also purposes, information, knowledge, attempts to achieve, successes and failures, and various kinds of learning. Perceiving, characterising, or thinking about such topics requires specific forms of representation and specific types of content to be represented: using meta-semantic competences, for representing and reasoning about things that themselves represent and reason.
    • An individual that applies such modes of reasoning to its own competences, and to its uses of those competences, can be said to be developing auto-meta-semantic competences.

  15. Ontologically conservative and non-conservative transitions

    It may be useful to distinguish

    • Ontologically conservative transitions
      These are reorganisations into deductive systems that do not extend the ontology previously available -- so the same forms of representation suffice, and no new types of entity are referred to, though new inferences may be possible because of the greater generality of the "axioms" (or their analogues) of the deductive system, compared with the previously acquired empirical knowledge.
      (Example to be added)

    • Ontologically non-conservative ("ampliative") transitions
      These are reorganisations that introduce new entities and new symbols (or new forms of representation) to refer to the new entities.

    • Somatic and exo-somatic ontologies/forms of representation
      In some cases, the new entities may be postulated as hidden parts of the previously known types of entity, as happens in many theoretical advances in science, e.g. adding atomic theory to early physics and chemistry, then adding new kinds of sub-atomic particles, properties, relationships.

      In other cases, the new entities postulated are not contained in the old ones, for example, when an organism that initially has sensory and motor signals and seeks regularities in recorded relationships, including co-occurrences and temporal transitions, later adds to the ontology additional objects that are not parts of the available signals but are postulated to exist in another space, which can have (possibly changing) projections into the sensory space. One extremely important example of this would be extending the ontology to include objects that exist independently of what the organism senses, and which can be sensed in different ways at different times. The former is a somatic ontology, the latter an exosomatic ontology.

      An example, going from sensory information in a 2-D discrete retina to assumed continuously moving lines sampled by the retina, or even a 3-D structure (e.g. rotating wire-frame cube) projecting onto the retina, is discussed in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/simplicity-ontology.html

      Ontologically non-conservative transitions refute the philosophical theory of concept empiricism (previously refuted by Immanuel Kant), and also demolish symbol-grounding theory, despite its popularity among researchers in AI and cognitive science.

      They also defeat forms of data-mining that look for useful new concepts (or features) that are defined in terms of the pre-existing concepts or features used in presenting the data to be learnt from. (Some work by Stephen Muggleton, using Inductive Logic Programming may be an exception to this, if some of the concepts used to express new abduced hypotheses, are neither included in nor definable in terms of some initial subset of symbols.)

    • Ontologically potentially non-conservative ("abstractive") transitions
      Sometimes the extension of an ontology involves introducing a new type or relationship or operator that is an abstraction from previously used examples. For example, a mathematician who notices properties common to addition and multiplication can introduce the notion of a group, which is a collection of entities together with a function from pairs of entities in the collection to an entity in the collection, where the function satisfies some conditions: it is associative, it has an identity element, and every entity has an inverse.

      It is easy to see that the integers (though not the positive integers alone) with addition form a group, as do the rational numbers with addition (and the non-zero rational numbers with multiplication).

    • Ontology formation by abstraction
      Abstracting from a particular domain to introduce a new concept, like group, does not imply that any other instances of the concept exist. But that does not mean that the concept "group" is defined in terms of the cases from which it was abstracted.

      That's because it is possible to discover later that some newly discovered mathematical structure is a group, e.g. a set of translations of 3-D structures, with composition as the group operator.

      Many mathematical abstractions go beyond the exemplars that led to their discovery. In fact the discovery may be triggered by relatively simple cases that are much less interesting than cases discovered later. The initial cases that inspired the abstraction may be completely forgotten and perhaps not even mentioned in future teaching of mathematics.

      This use of abstraction in mathematics is often confused with use of metaphor. Unlike use of abstraction, use of metaphor requires the original cases to be retained and constantly referred to when referring to new cases, whereas an abstraction can float free of the instances that triggered its discovery.
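      The way an abstraction "floats free" of its original instances can be illustrated with a minimal sketch: one definition of the group conditions, written without reference to any particular example, is applied unchanged to several finite structures. The function name `is_group` and the chosen examples are assumptions made for illustration. Finite sets are used only because exhaustive checking requires them, so modular arithmetic and rotations of a square stand in for the infinite cases mentioned above.

```python
from itertools import product


def is_group(elements, op):
    """Check closure, associativity, identity and inverses exhaustively
    on a finite collection of elements with a binary operation."""
    elements = list(elements)
    members = set(elements)
    # Closure: combining two members always yields a member.
    if any(op(a, b) not in members for a, b in product(elements, repeat=2)):
        return False
    # Associativity: grouping of operations does not matter.
    if any(
        op(op(a, b), c) != op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    ):
        return False
    # Identity: some element leaves every element unchanged.
    identities = [
        e for e in elements
        if all(op(e, a) == a and op(a, e) == a for a in elements)
    ]
    if not identities:
        return False
    identity = identities[0]
    # Inverses: every element can be combined with another to give the identity.
    return all(
        any(op(a, b) == identity and op(b, a) == identity for b in elements)
        for a in elements
    )


add_mod_5 = is_group(range(5), lambda a, b: (a + b) % 5)
mul_mod_5_with_zero = is_group(range(5), lambda a, b: (a * b) % 5)
mul_mod_5_nonzero = is_group(range(1, 5), lambda a, b: (a * b) % 5)
# Rotations of a square, represented by angles in degrees, under composition.
square_rotations = is_group([0, 90, 180, 270], lambda a, b: (a + b) % 360)
```

      Multiplication modulo 5 fails only when zero is included, because zero has no inverse. Nothing in `is_group` mentions numbers or rotations, so a structure discovered later can be tested against the same definition.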

    • There's much, much more to be said about all these topics. Some of these processes were modelled nearly 40 years ago by Gerry Sussman in his HACKER system, for his MIT PhD thesis, later published as a book. G.J. Sussman, A computational model of skill acquisition, American Elsevier, 1975, http://dspace.mit.edu/handle/1721.1/6894
      There's a useful summary of his work in Margaret Boden's 1978 book: Artificial Intelligence and Natural Man, Harvester Press. Second edition 1986, MIT Press.

  16. The leading researcher into these processes, among psychologists and neuroscientists, seems to be Annette Karmiloff-Smith. I have a personal (and still incomplete) summary and review of her work here: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html

  17. As far as I know, the task of replicating such processes in robots is beyond the current state of the art in AI (except perhaps in 'toy' domains). We'll need to find new forms of representation, and new mechanisms for reorganising information in ways that produce powerful new ontologies and new representations. Perhaps this can build on the theory of construction-kits sketched in another document:
    http://www.cs.bham.ac.uk/research/projects/cogaff/misc/construction-kits.html

    Some of the problems are discussed in more detail in

  18. http://jackiechappell.com/news/tecwyn-anim-cogn-2011.html Jackie Chappell, Cognitive strategies in orangutans (2011).


Added 7 Aug 2013: ROBERT LAWLER'S VIDEO ARCHIVE
Bob Lawler has generously made available a large collection of video recordings of three children over many years here: http://nlcsa.net/

I have not yet had time to explore the videos in any detail, but I expect there are many examples relevant to the processes and mechanisms involved in discovery of toddler theorems.

The first video I selected at random

    http://nlcsa.net/lc1a-nls/lc1a-video/ "Under Arrest"
illustrated many different things simultaneously, including how two part-built information processing architectures at very different stages of construction, with an adult out of sight, could interact in very rich ways with each other, some physical some social, and to a lesser extent with the adult through verbal communication. The older child clearly has both a much richer repertoire of spatial actions and a much richer understanding of the consequences of those actions. He also has some understanding of the information processing of the other child, including being able to work out where to go in order to move out of sight of the younger child. However the younger child does not forget about him when he is out of sight but is easily able (thanks to the help of a wheeled 'walker') to alter her orientation to get him back in view.

How a child moves from the earlier set of competences to the later set is a question that can only be answered when we have a good theory of what sorts of information processing architectures are possible, and how they can modify themselves by building new layers of competence, in the process of interacting with a rich environment -- partly, though not entirely, under the control of the genome, as outlined in Chappell & Sloman (2007).

The ability to model such transitions in robots is still far beyond our horizon, despite all the shallow demonstrations of 'progress' in robot training scenarios.


Kinds of dynamical system:
Moved to a separate file (10 Aug 2012)
Replaced by a more up to date version:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/multipic-challenge.pdf
A Multi-picture Challenge for Theories of Vision (including a section on types of dynamical system relevant to cognition).

Some relevant presentations and papers

Example presentations and papers on this topic written over the last 50 years,
especially since the early 1990s.

PRESENTATIONS (PDF)

OTHER REFERENCES
(To be expanded)

(There's a great deal more to be added here, by many different sorts of researchers.)
____________________________________________________________________________

Maintained by Aaron Sloman
School of Computer Science
The University of Birmingham

____________________________________________________________________________