This web page is
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.html
Or: http://goo.gl/QgZU1g
A messy automatically generated PDF version of this file (which may be out
of date) is:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.pdf
This is one of a set of documents on meta-morphogenesis, listed in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-morphogenesis.html
A partial index of discussion notes is in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/AREADME.html
______________________________________________________________________________
Such meta-cognitive layers, allowing what is and is not known to be noticed and thought about, may not be available at birth, but seem to develop later on (at various stages) in normal humans but not in other animals, though some other animals may have partial forms. But crucial proto-theorems may be discovered without such meta-meta-cognition, and used, unwittingly, and often without being noticed by doting parents and researchers.
The claim is that even pre-verbal toddlers can make discoveries about what is and is not possible in various situations, and put those discoveries to use, but without knowing they are doing that. This is a deeper and, for humans, more important ability than the ability to acquire statistics-based abilities to predict what is very likely or very unlikely. Sets of possibilities are logically, metaphysically, and cognitively prior to probabilities -- a claim that will be discussed in another document later.
A core hypothesis is that there are important forms of learning that involve being able to discover sets of possibilities (Piaget 1981) inherent in a situation and their constraints or necessary connections (Piaget 1983). This is a much deeper aspect of intelligent cognition than discovery of correlations, as in reinforcement learning, e.g. using Bayesian nets. (Here's a simple tutorial on Bayes Nets.)
Example: The drawer-shutting theorem
He would put both hands on the rim of the open drawer and push: OUCH!
Eventually he discovered a different way that avoided the pain.
If you push a close-fitting drawer shut with your fingers curled over the top edge, your fingers will be squashed because, although it is possible for the open drawer to be pushed towards the shut position, it is impossible for it to avoid squashing the curled fingers (if they stay curled during pushing).
On which hand will the fingers be squashed when the drawer is pushed shut?
(Figure added 14 Oct 2014)
(Apologies for low quality art.)
What sorts of representational, architectural, and reasoning (information manipulation) capabilities could enable a child to work out
The answer seems to have two main aspects, one non-empirical, to do with consequences of surfaces moving towards each other with and without some object between them, and the other an empirical discovery about relationships between compression of, or impact on, a body part and pain or other experiences.
A sign that the child has discovered a theorem derived in a generative system may be the ability to deal with other cases that have similar mathematical structures, despite physical and perceptual differences, e.g. avoiding trying to shut a door by grasping its vertical edge, without first trying it out and discovering the painful consequence.
Perceiving the commonality between what happens to the edge of a door as it is shut (a rotation about a vertical axis) and what happens to the edge of a drawer when it is shut (a translation in a horizontal plane) seems to require the ability to use an ontology that goes beyond sensory-motor patterns in the brain, and refers to structures and processes in the environment: an exosomatic ontology.
Once learnt, the key facts can be abstracted from drawers and horizontal edges and applied to very different situations where two surfaces move together with something in between, e.g. a vertical door edge. As Immanuel Kant pointed out in 1781, the mathematical discoveries may initially be triggered by experience, but that does not make what is learnt empirical, unlike, for example, learning that pushing a light switch down turns on a light. No matter how many examples are found without exceptions this does not reveal a necessary connection between the two events. Learning about electrical circuits can transform that knowledge, however.
There seem to be many different domains in which young children can acquire perceptual and cognitive abilities, later followed by development of meta-cognitive discoveries about what has previously been learnt, often resulting in something deeper, more general, and more powerful than the results of empirical learning. The best known example is the transition in young children from pattern-based language use to grammar-based use, usually followed by a later transition to accommodate exceptions to the grammar. Like Annette Karmiloff-Smith, whose ideas about 'representational re-description' are mentioned below, I think this sort of transition (not always followed by an extension to deal with counter-examples) happens in connection with many different domains in which children (and other animals) gain expertise. Moreover, as proposed in a theory developed with Jackie Chappell (2007) and illustrated below in Figure Evo-Devo, this requires powerful support from the genome, at various stages during individual development.
The mathematical and proto-mathematical learning discussed in this document cannot be explained by the statistical mechanisms for acquiring probabilistic information now widely discussed and used in AI, Robotics, psychology and neuroscience. Evolution discovered something far more powerful, which we do not yet understand. Some philosophers think all mathematical discoveries are based on use of logic, but many examples of geometrical and topological reasoning cannot be expressed in logic, and in any case were reported in Euclid's Elements over two thousand years ago, long before the powerful forms of modern logic had been invented by Frege and others in the 19th Century. I'll make some suggestions about mechanisms later. Building and testing suitable working models will require major advances in Artificial Intelligence with deep implications for neuroscience and philosophy of mathematics.
Re-formulating an empirical discovery into a discovery of an impossibility or a necessary connection is sometimes more difficult than the drawer case (e.g. you can't arrange 11 blocks into an NxM regular array of blocks, with N and M both greater than 1 -- why not?). Different mechanisms may have evolved at different stages, and perhaps in different species, for making proto-mathematical discoveries. Transformations of empirical discoveries into a kind of mathematical understanding probably happen far more often than anyone has noticed, and probably take more forms than anyone has noticed. They seem to be special subsets of what Annette Karmiloff-Smith calls "Representational Redescription", also investigated by Jean Piaget in his last two books, on Possibility and Necessity.
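The 11-block case can be checked mechanically, which helps to show the difference between exhaustive trial and insight into impossibility. The following is a minimal Python sketch, not part of the original discussion: the function name rectangular_arrangements is an illustrative assumption. It lists every N x M array with N and M both greater than 1 for a given number of blocks.

```python
def rectangular_arrangements(blocks):
    """Return all (rows, cols) pairs with rows > 1, cols > 1, rows <= cols
    and rows * cols == blocks."""
    pairs = []
    rows = 2
    # Only rows up to the square root need trying: any larger divisor
    # is already paired with a smaller one found earlier.
    while rows * rows <= blocks:
        if blocks % rows == 0:
            pairs.append((rows, blocks // rows))
        rows += 1
    return pairs


if __name__ == "__main__":
    for n in (11, 12):
        print(n, rectangular_arrangements(n))
```

An empty result for 11 shows only that the search failed. It does not by itself give the learner a grasp of why no arrangement could exist, which is the kind of understanding at issue here.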
Proto-mathematical understanding may be acquired and used without the learner being aware of what's happening. Later on, various levels and types of meta-cognitive competence develop, including the ability to think about, talk about, ask questions about and in some cases also to teach others what the individual has learnt. All of this depends on forms of information processing "discovered" long ago by the mechanisms of biological evolution but not yet understood by scientists and philosophers of mathematics, even though they use the mechanisms. Arguments that languages and forms of reasoning must have evolved initially for internal, "private", use rather than for communication can be found in Talk 111.
The aim of this document is mainly to collect examples to be found during development of young children. Discussions of more complex examples, and requirements for explanatory mechanisms, can be found in other documents on this web site. This is one of many strands in the Meta-Morphogenesis project.
Added 18 Jun 2014:
One problem for this research is that it can't be done by most academic
developmental psychologists because the research requires detailed, extended,
observation of individuals, not in order to discover regularities in child
cognition and development, but in order to discover what sorts of capabilities
and changes in capabilities can occur. This is a first step to finding
out what sorts of mechanisms can explain how those capabilities and
changes are possible (using the methodology in chapter 2
of Sloman (1978), expanded in this document on explaining
possibilities). This requires the researchers to
have kinds of model-building expertise that are not usually taught in psychology
degrees. (There are some exceptions, though often the modelling tools used
are not up to the task, e.g. if the tools are designed for numerical
modelling and the subject matter requires symbolic modelling.)
This is not regarded as scientific research by a profession many of whose members believe (mistakenly) the Popperian myth that the only reportable scientific results in psychology are regularities observed across members of a population, and that where perfect regularities don't exist, because individuals differ, changes in averages and other statistics should be reported instead.
In part that narrow, unscientific mode of thinking is based on a partial understanding of the emphasis on falsifiability in Karl Popper's philosophy of science, which has done a lot of harm in science education. What is important in Popper's work is the idea that explanatory theories should have consequences that are as precise and general as possible. But they may not be falsifiable for a long time because the theory does not entail regularities in observables, and does not make predictions about all or even some proportion of learners.
Instead it may successfully guide searches for new, previously unnoticed, types of example covered by those possibilities. A later development of the theory could provide suggestions regarding explanatory mechanisms. For such mechanisms it is more important to produce working models demonstrating the potential of the theory than to use the theory to make predictions. Such research sometimes gains more from detailed long term study of individuals, and speculative model building and testing, than from collection of shallow data from large samples.
For more on the scientific importance of theories explaining how something is possible see
Teaching based on a deep theory may make a huge difference to the performance of a small subset of high ability learners even if the theory does not specify how those learners can be identified in advance as a basis for making predictions. Moreover the theory may explain the possibility of a variety of developmental trajectories that can be observed by good researchers when they occur, though the theory may not (yet) give clues as to which individuals will follow which trajectories. Many biological theories have that form, e.g. explaining how some developmental abnormalities can arise without being rich enough to predict which individual cases will arise. In some cases that may be impossible in principle if the abnormalities depend on random chemical or metabolic co-occurrences during development about which little is known.
A theory explaining how sophisticated mathematical competences can
develop may make no falsifiable predictions because there are no regularities --
especially with current teaching of mathematics in most schools. (Unless I've
been misinformed.)
[27 Sep 2014: To be expanded, including illustrations from linguistic theory.]
One of the bad effects of these fashions is that the only kind of recommendation for educational strategies such a researcher can make to governments and teachers is a recommendation based on evidence about what works for all learners, or, failing that, what works for a substantial majority of learners.
(E.g. a recommendation to teach reading using only the phonic method -- which assumes that the main function of reading is to generate a mapping from text to sounds, building on the prior mapping from sounds to meanings. That recommendation ignores the long term importance of building up direct mappings from text to meanings operating in parallel with the mapping from sounds to meanings, and construction of architectural components not required for reading out loud but important for other activities later on, e.g. inventing stories or hypothetical explanations.)
Another bad effect of the emphasis on discovering and reporting what normally does happen rather than what can happen is to deprive psychology of explanatory theories able to deal with outliers, such as Bach, Mozart, Galileo, Shakespeare, Leibniz, Newton, Einstein, Ramanujan, and others. In contrast, a deep theory about what is possible and how it is possible can account both for what is common and what is uncommon, just as a theory about the grammatical structure of English can explain both common utterances and sentences that are uttered only once, like this one.
A tentative proposal:
The examples of toddler theorem discoveries given below are isolated reports of
phenomena noticed by me and various colleagues, along with cases presented in
text books, news reports or amateur videos on social media. Perhaps this web
site should be augmented with a web site where anyone can post examples, and
where development of individual babies, toddlers and children over minutes,
hours, days, weeks, months or years can be reported. Something like Citizen Science for
developmental psychology? Any offers to set that up?
More examples are presented below.
Some of the components and functions required in animal or robot information processing architectures are crudely depicted and sub-divided in the figure below, where processes and mechanisms at lower levels are generally evolutionarily much older than those at higher levels, and probably develop earlier in each individual, though new ones may be added later through training:
(Recently revised diagram of CogAff Schema, thanks to Dean Petters.)
The architectural ideas are discussed in relation to requirements for virtual
machinery here:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/vm-functionalism.html
Including an older version of the Human Cogaff (h-cogaff) diagram, namely
The development of proto-mathematical and mathematical competences listed below makes use of mechanisms, including changing mechanisms, in all the layers and columns of mechanisms depicted in the above diagrams. No diagram, however, can adequately represent the richness and diversity of components and the functionality they add.
Note: 25 Dec 2017
After collecting many examples of competences to be explained, especially the
competences involved in ancient discoveries in geometry and topology, long
before the development of the modern logic-based axiomatic method and use of
Cartesian coordinates to represent geometry, I have begun to explore the
possibility that a kind of "Super-Turing" information processing mechanism must
have been produced by evolution. The ideas will be elaborated in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/super-turing-geom.html
One of the features of a system like this is that if the stages are extended in time, and if the earlier stages include development of abilities to communicate with conspecifics and acquire information from them, then later developments (to the right of the diagram) can be influenced not only by the physical and biological environment, as in other altricial species, but also by a culture.
As we see on this planet, that can have good effects, such as allowing cultures
to acquire more and more knowledge and skill, and bad effects such as allowing
religious ideas, cruel practices, superstition, and in some cases "mind-binding"
processes that prevent the full use of human developmental potential, as
discussed in:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/teaching-intelligent-design.html#softwarebug
'Religion as a software bug'
Note:
I hope to show later on how the above model of interactions between genome and
environment in individual members of advanced species can be modified to produce
a partly analogous model of how evolution works within a portion of the physical
universe. Both are examples of dynamical systems with creative powers, able to
transform themselves not merely by adjusting numerical parameters but by
introducing new abstract types of structure and types of causal power, which can
later be instantiated in different ways in different contexts.
This can be seen as partly analogous to abductive reasoning, in which evidence
inspires formulation of a new explanatory hypothesis that is added to previous
theories, in some cases with new undefined symbols that "grow" semantic content
through deployment of the theory.
See
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/meta-configured-genome.html
For example, this 11-month old child is not a toddler, as she cannot yet walk, and has recently learnt to crawl, but she seems to have made a discovery about things that can be supported between upward pointing toes and downward facing palm.
Whether that's a "theorem" for her depends on whether she was able (using whatever representational resources are available to a pre-verbal human) to reason about the consequences of previously acquired information about affordances so as to predict what would happen in this novel situation, or retrospectively to understand why it happens if it first happened unintentionally.
Clearly, whatever initiated the process, she continued it intentionally and even seemed to be trying to share what she had discovered with someone not in the picture. The differences between possible cases need further investigation elsewhere. There are also many examples involving actions that produce changes of posture (e.g. from lying on back or belly to sitting upright) and various crawling actions that provide forward or backward motion or change of direction.
As for why children do such things, I believe the normal assumption that all motivation must be reward based is false, as discussed below in the section on Architecture-Based motivation.
Another pre-toddler-theorem in the case of this child seems to be that the transition between
I would be grateful for information about any other infants who
use this or a related method for doing the 90 degree rotation of
torso and roughly 180 degree rotation of legs.
3.b. A crawler's door-closing theorem
(Added 1 Jul 2017) This is based on a recollected episode over a decade ago,
when a baby and his parents were visiting us. At one point he crawled from the
front hall into an adjoining room, indicating that he wanted me to follow him
(e.g. stopping, waiting and looking round at me if I paused while following
him). After he had crawled through the door and waited for me to follow him, he
wanted the door shut. (I have no idea why, perhaps he had no reason.) He managed
to push it shut with his feet, after crawling to an appropriate location,
rolling over onto his back, swinging his legs back round the door, then pushing
shut.
That action can be thought of as a proof (by construction) of the theorem that it is possible to shut a door with your feet after crawling through the doorway.
How was the intention to do all that represented in his brain (or mind) long before he could say anything in words?
Rolls over onto back to push door shut with feet.
At the time, I did not think of asking his parents whether he had been taught to do that, or had regularly been doing it at home. In either case he seemed to understand what he was doing, and was able to manoeuvre into the right position, to get the door shut the first time he tried in our house, which had a very different layout from his home.
What kind of representation of spatial structures, relationships, and
possibilities for change could a child's brain use (a) in forming the intention
to perform such an action, and (b) in actually doing it? I suspect the answer
will refer to precursors to the mechanisms that enabled ancient mathematicians
to make profound mathematical discoveries.
See also:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/impossible.html
"Knowing and Understanding: Relations between meaning and truth, meaning and necessary truth, meaning and synthetic necessary truth"
http://www.cs.bham.ac.uk/research/projects/cogaff/sloman-1962
there are also truths that are neither empirical nor trivial but provide substantial knowledge, namely synthetic, necessary, truths of mathematics, whose discovery requires non-empirical reasoning capabilities.
Some of the concepts used here are explained in this summary of parts of my
DPhil thesis:
"'NECESSARY', 'A PRIORI' AND 'ANALYTIC'" (1965)
http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#1965-02
http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#rog
Functions and Rogators (1965)
http://www.cs.bham.ac.uk/research/projects/cogaff/62-80.html#1968-01
Explaining Logical Necessity (1968-9)
Around 1970 Max Clowes introduced me to Artificial Intelligence, especially AI work on Machine vision. That convinced me that a good way to make progress on my problems might be to build a baby robot that could, after some initial learning about the world and what can happen in it, notice the sorts of possibilities and necessities (constraints on possibilities) that characterise mathematical discoveries. My first ever AI conference paper distinguishing "Fregean" from "Analogical" forms of representation was a start on that project, followed up in my 1978 book, especially Chapters 7 and 8.
- Interactions between philosophy and AI: The role of intuition and non-logical reasoning in intelligence,
Proc 2nd IJCAI, 1971, London, pp. 209--226, http://www.cs.bham.ac.uk/research/cogaff/04.html#200407
- Aaron Sloman, CRP: The Computer Revolution in Philosophy: Philosophy, Science and Models of Mind, Harvester Press (and Humanities Press), 1978, http://www.cs.bham.ac.uk/research/cogaff/62-80.html#crp
From about 1973, I was increasingly involved in AI teaching and research and also had research council funding for a project on machine vision, some results of which are summarised in chapter 9 of CRP. Later work (teaching and research) led me in several directions linking AI, Philosophy, language, forms of representation, architectures, relations between affect and cognition, vision, and robotics. Progress on the project of implementing a baby mathematician was very slow, mainly because the various problems (especially about forms of representation) turned out to be much harder than I had anticipated. Moreover, I did not find anyone else interested in the project.
In 2008 Mary Leng jolted me back into thinking about mathematics by inviting me to give a talk in a series on mathematics at Liverpool University. In that talk and in a collection of subsequent papers and presentations I tried to collect examples and arguments about how various aspects of mathematical competence could be seen to arise out of requirements for interacting with a complex, structured, changeable environment. I did not find anyone else who shared this interest, perhaps because the people I met had not spent five years between the ages of five and ten playing with meccano? http://www.cs.bham.ac.uk/research/projects/cosy/photos/crane/
Added 23 Aug 2012:
Although I started this web page in October 2011, I have been working on many of these
themes for many years using different terminology. E.g. some of the ideas about numbers go
back to chapter 8 of my 1978 book, but that builds on my 1962 Oxford DPhil Thesis
(attempting to defend Kant's philosophy of mathematics -- before I knew anything about
computers or AI).
After discovering the deep overlap with ideas Annette Karmiloff-Smith (AK-S) had developed, especially in her 1992 book, which I have begun to discuss in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html I thought it might be helpful to use her label "domain", instead of the collection of labels I have been playing with over several decades (some of which have been widely used in AI, others in mathematics, software engineering, etc. -- the ideas are deep and pervasive).
I can't now remember all the labels I have used, but the following can be found in some of my papers, talks, and web pages, with and without the hyphens:
'micro-world' 'mini-world' 'micro-domain' 'micro-theory' 'theory' 'framework' 'framework-theory'
What is a domain?
I don't think there is any clear and
simple answer to that question. But this document
presents several examples that differ widely in character, making it clear that domains
come in different shapes and sizes, with different levels of abstraction, different kinds
of complexity, different uses -- both in controlling visible behaviour and in various
internal cognitive functions --, different challenges for a learner, different ways of
being combined with other domains to form new domains, and conversely, different ways
of being divided into sub-domains, etc.
We might try to compare different sub-fields of academic knowledge to come up with an analysis of the concept of domain, but there are many overlaps and many differences between such domains as philosophy, logic, mathematics, physics, chemistry, biology, biochemistry, zoology, botany, psychology, developmental psychology, gerontology, linguistics, history, social geography, political geography, geography, meteorology, astronomy, astrophysics, ....
Moreover within dynamic disciplines new domains or sub-domains often grow, or are discovered or created, some of them found to have pre-existed waiting to be noticed by researchers (e.g. planetary motions, Newtonian mechanics, chemistry, topology, the theory of recursive functions) while others are creations of individual thinkers or groups of thinkers, for example, art forms, professions (carpentry, weaving, knitting, dentistry, physiotherapy, psychotherapy, architecture, various kinds of business management, divorce law in a particular country, Jewish theology, and many more). However, that distinction, between pre-existing and human-created domains, is controversial, with fuzzy boundaries.
Philosophers' concepts of "natural kinds" are attempts to make some sort of sense of this, in my view largely unsatisfactory, in part because many of the examples are products of biological evolution, and some are products of those products. I suspect the idea of "naturalness" in this context is a red herring, since the distinction between what is created and what was waiting to be discovered is unclear and there are hybrids. The distinction between "logical geography" (Gilbert Ryle) and "logical topography" (me) is also relevant, explained in http://tinyurl.com/BhamCog/misc/logical-geography.html
A particularly rich field of human endeavour in which hierarchies of domains are important
is software engineering, and the discovery of this fact has led to the creation of various
kinds of programming languages for specifying either individual domains or families of
domains. For example, so-called "Object Oriented Programming" introduced notions of
classes, sub-classes, instances, and associated methods (class-specific algorithms) and
inheritance mechanisms. More sophisticated OOP languages allowed multiple inheritance and
generic functions (methods that are applicable to collections of things of different types
and behave in ways that depend on what those types are).
http://tinyurl.com/PopLog/teach/oop
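The notions just listed (classes, sub-classes, instances, methods, multiple inheritance and generic functions) can be illustrated with a small sketch in Python. Everything in it is a hypothetical illustration rather than the syntax of any particular OOP system mentioned above: the toy domain (Toy, Stackable, Rollable, Block, Cylinder) and the helpers register and place_on are invented for this example, and the dispatch rule is deliberately simplified.

```python
class Toy:
    def describe(self):
        return "a toy"


class Stackable:
    """Mixin: things with a flat face that other things can rest on."""


class Rollable:
    """Mixin: things with a curved surface."""


class Block(Toy, Stackable):            # a sub-class of Toy
    def describe(self):                 # class-specific method
        return "a block"


class Cylinder(Toy, Rollable, Stackable):   # multiple inheritance
    def describe(self):
        return "a cylinder"


# A very small generic-function mechanism: methods are chosen by the
# types of BOTH arguments, not just the first one.
_methods = {}


def register(upper_type, lower_type):
    def decorator(func):
        _methods[(upper_type, lower_type)] = func
        return func
    return decorator


def place_on(upper, lower):
    """Generic function: use the first registered method found by walking
    the inheritance order of each argument's class (a simplified rule)."""
    for upper_cls in type(upper).__mro__:
        for lower_cls in type(lower).__mro__:
            method = _methods.get((upper_cls, lower_cls))
            if method is not None:
                return method(upper, lower)
    raise TypeError("no applicable method")


@register(Stackable, Stackable)
def _rests(upper, lower):
    return f"{upper.describe()} rests on {lower.describe()}"


@register(Rollable, Stackable)
def _rolls(upper, lower):
    return f"{upper.describe()} may roll off {lower.describe()}"


@register(Stackable, Rollable)
def _wobbles(upper, lower):
    return f"{upper.describe()} wobbles on {lower.describe()}"


if __name__ == "__main__":
    print(place_on(Block(), Block()))
    print(place_on(Cylinder(), Block()))
    print(place_on(Block(), Cylinder()))
```

The point of the sketch is only that a small set of types and combination rules defines a domain whose instances behave differently depending on what they are and how they are combined.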
Note added 4 Mar 2015
Using the notion of construction kit presented in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/construction-kits.html
we can say that many domains are "generated" or "defined" by a particular type
of construction kit (which may be composed of simpler construction kits). We
need a more thorough survey and analysis of cases.
More generally we can say that a domain involves relationships that can hold between types of thing, and instances of those types can have various properties and can be combined in various ways to produce new things whose properties, relationships, competences and behaviours depend on what they are composed of and how they are combined, and sometimes the context. Often mathematicians specify such domain-types without knowing (or caring) whether instances of those types actually existed in advance (e.g. David Hilbert's infinite dimensional vector spaces?). Additional domains are summarised below.
Formation of a new instance of a type in a domain can include assembling pre-existing instances to create larger items (e.g. joining words, sentences, lego bricks, meccano parts, dance steps, building materials, mathematical derivations), or can include inserting new entities within an existing structure, or changing properties, or altering relationships. E.g. loosening a screw in a meccano crane can sometimes introduce a new rotational degree of freedom for a part.
Some domains allow continuous change, e.g. growth, linear motion, rotation, bending, twisting, moving closer, altering an angle, increasing or decreasing overlap, changing alignment, getting louder, changing timbre, changing colour, and many more (e.g. try watching clouds, fast running rivers, kittens playing, ...). Some allow only discrete changes, e.g. construction of logical or algebraic formulae, or formal derivations, operations in a digital computer, operations in most computational virtual machines (e.g. a Java or Lisp virtual machine), some social relations (e.g. being married to, being a client of), etc.
The world of a human child presents a huge variety of very different sorts of domains to be explored, created, modified, disassembled, recombined, and used in many practical applications. This is also true of many other animals. Some species come with a fixed, genetically determined, collection of domain related competences, while others have fixed frameworks that can be instantiated differently by individuals, according to what sorts of instances are in the environment, whereas humans and others (often called "altricial" species) have mechanisms for extending their frameworks as a result of what they encounter in their individual lives -- examples being learning and inventing languages, games, art forms, branches of mathematics, types of shelter, and many more. This diversity of content, and the diversity of mixtures of interacting genetic, developmental and learning mechanisms was discussed in more detail in two papers written with Jackie Chappell, one published in 2005 and an elaborated version in 2007. There are complicated relationships with the ideas of AK-S, which still need to be sorted out.
Tarskian model theory http://plato.stanford.edu/entries/model-theory/ is also relevant. Several computer scientists have developed theories about theories that should be relevant to clarifying some of these issues, e.g. Goguen, Burstall and others (for example, see http://en.wikipedia.org/wiki/Institution_(computer_science) ).
At some future time I need to investigate the relationships. However, I don't know whether they include domains that allow (continuous representations of) continuous changes, essential in Euclidean geometry, Newtonian mechanics, and some aspects of biology.
I don't know if anyone has good theories about discovery, creation, combination, and uses of domains in more or less intelligent agents, including a distinction between having behavioural competence within a domain, having a generative grasp of the domain, and having meta-cognitive knowledge about that competence. These distinctions are important in the work of AK-S, though she doesn't always use the same terminology.
The rest of this discussion note presents a scruffy collection of examples of domains relevant to what human toddlers (and some other animals and older humans) are capable of learning and doing in various sorts of domains whose instances they interact with, either physically or intellectually. The section on Learning about numbers (Numerosity, cardinality, order, etc.) includes examples of interconnected domains, though not all the relationships are spelled out here.
Theorems about domains are of many kinds. Often they are about invariants of a set of possible configurations or processes within a domain (e.g. "the motion at the far end of a lever is always smaller than the motion at the near end if the pivot is nearer the far end", "moving towards an open doorway increases what is visible through the doorway, and moving away decreases what is visible"). (See the section on epistemic affordances, below.)
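The lever invariant quoted above can be illustrated with a minimal sketch, assuming an idealised rigid lever whose two ends sweep arcs about the pivot. The function name and the arm lengths are hypothetical, chosen only for illustration. The spot checks are not what establishes the invariant: it follows from the fact that both displacements share the same rotation factor, so their ratio depends only on the arm lengths.

```python
def end_displacements(near_arm: float, far_arm: float, angle_rad: float) -> tuple:
    """Arc lengths swept by the near and far ends of a rigid lever
    rotated by angle_rad (radians) about its pivot."""
    return near_arm * abs(angle_rad), far_arm * abs(angle_rad)


# Hypothetical lever with the pivot nearer the far end (far_arm < near_arm).
NEAR_ARM, FAR_ARM = 3.0, 1.0

for angle in (0.01, 0.3, 1.0):
    near, far = end_displacements(NEAR_ARM, FAR_ARM, angle)
    # Both displacements share the factor |angle|, so far / near equals
    # FAR_ARM / NEAR_ARM whatever the rotation: an invariant of the domain.
    assert far < near
```

Measuring many levers would give only empirical support; noticing the shared factor is what shows that no counter-example is possible in this idealised domain.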
We need a more developed theory about the types of theorems available to toddlers and others to discover, when exploring various kinds of environment, and about the information-processing mechanisms that produce what AK-S calls "representational redescription" allowing the theorems to be discovered and deployed. (I think architectural changes are needed in many cases.)
There are also transitions in information-processing capabilities and mechanisms that are much harder to detect, though their consequences may include observable behaviours.
A draft (incomplete, messy and growing) list of transitions in biological information processing is here.
The transitions producing new capabilities and mechanisms are examples of a generalised concept of morphogenesis, originally restricted to transitions producing physical structures and properties.
Among the transitions are changes in the mechanisms for producing morphogenesis. These are examples of meta-morphogenesis (MM). The examples of information processing competence described here may occur at various stages during the lives of individuals. The mechanisms that produce new ways of acquiring or extending competences are mechanisms of meta-morphogenesis, about which little is known. Piaget identified many of the transitions in children he observed, and thought that qualitative changes in competence-producing competences were global, occurring in succession, at different ages, during the development of a child. Karmiloff-Smith, in Beyond Modularity, suggests that transitions between stages may occur within different domains of competence, and will often be more a function of the nature of the domain than the age of the child, though she allowed that there are also some age-related changes. See http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity.html
I have no idea what Karmiloff-Smith would think of my proposal to extend this idea to regarding biological evolution (i.e. natural selection) as (unwittingly) making discoveries about domains of mathematical structures then transforming those discoveries in various ways, as outlined in a separate document on the nature of mathematics and the relevance of mathematical domains to evolution and in a presentation to the PT-AI 2013 conference: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/bio-math-phil.html http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#talk108
Mary Leng has made claims related to my topic, though disagreeing with mine, as reported in this book review: http://www.ams.org/notices/201305/rnoti-p592.pdf
Transitions occur across species, within a species, within an individual, concurrently in different species, and in some cases in eco-systems or sub-systems involving more than one species.
A draft (growing) list of significant transitions in types of information-processing in organisms is here: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/evolution-info-transitions.html
People who have not designed, tested or debugged working systems may lack the concepts and theories required.
Exploration here does not necessarily refer to geographical exploration. It can include investigating the space of possible actions on some object or type of object, e.g. things that can be done with sand, with water, with wooden blocks, with string, with paper, with diagrams, etc. [See Sauvy and Sauvy (1974).]
I (and probably others using different terminology) have proposed that although rewards of many kinds (including non-scalar rewards) can be important, there are also non-reward-based forms of motivation, without which a great deal of the learning done by young children (and other animals) would be impossible. That's because the learner is required to select things to do without being in a position to have any knowledge about the possible outcomes. So natural selection has somehow provided motivation triggers that are directly activated by perceived states of affairs or processes, or in some cases thoughts, to create motives, which then may or may not produce behaviours, depending on which other motives are currently active, and other factors. Such a mechanism can produce forms of exploration-based learning that would otherwise not occur. I call that "architecture-based motivation" in contrast with reward-based motivation, as explained in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/architecture-based-motivation.html
The diagram illustrates, schematically, a very simple architecture with motives triggered by what is perceived, but with no computation of, or comparison of, rewards, or expected utility.
In particular, the individual may be unaware of what is being done or why it is being done.
I am not saying that that's a model of human or animal motive-generation, but that something with those features could usefully be an important part of a motive-generation mechanism if the genetically determined motive generating reflexes are selected (by evolution) for their later usefulness in ways that the individual cannot understand. This idea was independently developed and tested in a working computer model, reported by Emre Ugur (2010).
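The kind of architecture described above can be sketched as follows. This is a minimal illustration, not a model of any real organism and not the system reported by Ugur: the class name, the percept keys, the motive names and the suppression table are all hypothetical. The point is only that motives are created directly by trigger conditions on what is perceived, and that nothing in the code computes or compares rewards or expected utilities.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Percept = Dict[str, bool]


@dataclass
class MotiveTrigger:
    """A reflex-like rule: if the condition holds for the percept, create the motive."""
    motive: str
    condition: Callable[[Percept], bool]


def generate_motives(percept: Percept, triggers: List[MotiveTrigger]) -> List[str]:
    """Motives activated directly by the percept; no reward or utility is computed."""
    return [t.motive for t in triggers if t.condition(percept)]


def select_motives(active: List[str], suppressed_by: Dict[str, str]) -> List[str]:
    """Drop any motive whose suppressor is also currently active."""
    return [m for m in active if suppressed_by.get(m) not in active]


# Hypothetical triggers and interactions, for illustration only.
TRIGGERS = [
    MotiveTrigger("grasp_object", lambda p: p.get("small_object_in_reach", False)),
    MotiveTrigger("approach_carer", lambda p: p.get("carer_calling", False)),
]
SUPPRESSED_BY = {"grasp_object": "approach_carer"}

percept = {"small_object_in_reach": True, "carer_calling": False}
print(select_motives(generate_motives(percept, TRIGGERS), SUPPRESSED_BY))
# prints ['grasp_object']
```

Whether a triggered motive produces behaviour depends here only on which other motives are active, which is one crude way of reading "depending on which other motives are currently active, and other factors".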
As an individual's competence grows the amount of stored information about each domain grows, extending the variety and complexity of situations they can cope with (e.g. predicting what will happen, deciding what to do to achieve a goal, understanding why something happens, preventing unwanted side-effects, reducing the difficulty of the task, etc.)
Within this framework of behaviour-centred learning much interesting research has been done, and there have been many impressive advances that generalise what can be learnt or speed up what can be learnt, or make what has been learnt more robust.
But I want to raise the question whether this kind of research sheds much light on human intelligence or the intelligence of many other animals with which we can interact, or helps much with the long term practical goals of AI or explanatory goals of AI as the new science of mind. The main problem is that all this on-line intelligence leaves out what can be called "off-line" intelligence, which involves a host of ways of doing something about possible actions other than performing the actions, for example thinking about "what would have happened if...." or explaining why something happened, or why something was not done, or teaching someone else to perform a task, or changing the environment so as to make an action easier, or safer, or more reliable. These abilities seem to be closely related to the abilities of humans to do mathematics, including for example discovering theorems and proofs in Euclidean geometry, which our ancestors must have done originally without any teachers, and without using the translation of geometry into arithmetic that is now required for geometrical theorems to be proved by computer (in most cases).
A subset of species, including young children and apparently some corvids, seem to have the additional ability to think about and reason about actions that are possible but are not currently being performed. This can sometimes lead to the ability to reflect on what went wrong, and how faulty performance might be improved, or failure produced deliberately, and in some cases the ability to understand successes and failures of others, which can be important for teachers or trainers. For example, a mother (or 'aunt'?) elephant seeing a baby elephant struggling unsuccessfully to climb up the wall of a mud bath may realise that scraping some of the mud away in front of the baby will make an easier ramp for the baby to walk up, apparently using "counterfactual" reasoning, as required for a designer or planner. A monkey or ape may be able to work out that if a bush is between him and the alpha male when he approaches a female his action will not be detected.
For example, a child who has learnt to catch a fairly large ball may be able to think about what will happen if she does not open out her palms or fingers before the ball makes contact with her. And she may also be able to think about what will happen if she does not bring her fingers together immediately after the ball makes contact with her two open palms.
This uses "off-line" intelligence. More is said about this distinction in Sloman 1982, Sloman 1989, Sloman 1996, Sloman 2006, Sloman 2011.
The differences between on-line and off-line intelligence are sometimes misconstrued, leading to poor theories of the functions of vision -- e.g. the theory that different neural streams are used for "where" vs "what" processing, and the theory of "mirror neurons", neither of which will be discussed further here. For more detail see (Sloman 1982) and the related papers below.
On-line and off-line intelligence are sometimes combined, e.g. when possible future contingencies are being considered during the performance of an action, or a partly successful action is not interrupted, but while it is continued the agent may be reflecting on what had previously gone wrong and how to prevent it in future.
Many complex actions, such as nest building, hunting intelligent prey, climbing a tree, eating a prickly pear while avoiding thorns (See Richard Byrne) or constructing a shelter or house require a mixture of on-line and off-line intelligence, often in parallel or alternating performances.
See also the comments about Karen Adolph's work on learning in infants and toddlers below.
The main consequence is that the learner can now work out things that previously had to be learnt empirically, or picked up from teachers, etc. This means that the realm of competence is enormously expanded.
This requires the use of information structures of variable complexity composed of components that can be re-used in novel structures with (context-sensitive) compositional semantics -- one reason why internal languages had to evolve before languages used for communication.
N.B. This is totally different from building something like a Bayes Net
storing learnt correlations and allowing probability inferences to be made.
The same can be said about possible physical structures and processes. Before the first
bicycle was constructed by a clever designer, the probability of it being constructed was
approximately zero.
Bayesian inference produces probabilities for various already known possibilities. What I
am talking about allows new possibilities and impossibilities to be derived, but often
without any associated probability information: if a polygon has three sides then its
angles must add up to half a rotation.
Compare using a grammar to prove that certain sentences are possible and others
impossible. That provides no probabilistic information. In fact a very high proportion of
linguistic utterances had zero or close to zero probability before they were produced. But
that does not prevent them being constructed if needed, or understood if constructed.
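The grammar comparison can be made concrete with a minimal sketch, assuming a tiny hypothetical context-free grammar (the rules and vocabulary below are invented for illustration). The recogniser answers only whether a word sequence is derivable or not; it stores no frequencies and yields no probabilities.

```python
from typing import Dict, List, Tuple

Grammar = Dict[str, List[Tuple[str, ...]]]

# Hypothetical toy grammar: keys are non-terminals, everything else is a word.
GRAMMAR: Grammar = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V", "NP"), ("V",)],
    "Det": [("the",), ("a",)],
    "N": [("toddler",), ("drawer",), ("pencil",)],
    "V": [("shuts",), ("drops",)],
}


def derives(symbols: Tuple[str, ...], words: Tuple[str, ...], grammar: Grammar) -> bool:
    """True if the symbol sequence can be rewritten into exactly the word sequence."""
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    if head not in grammar:  # terminal: it must match the next word
        return bool(words) and words[0] == head and derives(rest, words[1:], grammar)
    # Non-terminal: try each expansion (the toy grammar has no left recursion).
    return any(derives(expansion + rest, words, grammar) for expansion in grammar[head])


def is_possible(sentence: str, grammar: Grammar = GRAMMAR) -> bool:
    """Is the sentence among the possibilities generated by the grammar?"""
    return derives(("S",), tuple(sentence.split()), grammar)


print(is_possible("the toddler shuts the drawer"))  # True
print(is_possible("toddler the shuts"))             # False
```

A sentence never previously produced, such as "a pencil drops the toddler", is still classified as possible, which is the sense in which possibility is independent of prior probability.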
For non-logical reasoning, e.g. reasoning about transformations of a set of topological or geometric relationships, similar processes of reasoning without performing physical actions can provide new knowledge about possibilities and necessities.
Kenneth Craik, Philip Johnson-Laird and others have suggested that internal models can be used for making predictions about possible actions (http://en.wikipedia.org/wiki/Mental_model). However most of them fail to notice the differences between being able to work out "what will happen if X occurs" and being able to reason about what is and is not impossible, or what else will necessarily occur if X occurs. Examples of discovering what is impossible are discussed in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/impossible.html
The key idea is that under some conditions it is possible to discover that properties of a schematic structure or schematic process are invariant -- i.e. the properties do not depend on the precise instantiation of the abstraction, though sometimes it is necessary to add previously unnoticed conditions (e.g. no larger object is between the grasping surfaces) for a generalisation to be true.
This idea will have to be fleshed out very differently for different domains of structures and processes, or for different sub-domains of rich domains -- e.g. Euclidean geometry, operations on the natural numbers. (See examples about counting below.)
The kinds of discoveries discussed here are not empirical discoveries, but that does not mean that the reasoning processes are infallible. The history of mathematics (e.g. the work of Lakatos below) shows that even brilliant mathematicians can fail to notice special cases, or implicit assumptions. Nevertheless I think these ideas if fleshed out would support Kant's ideas about the nature of mathematical discoveries, as discoveries of synthetic necessary truths. (As far as I know, he did not notice that the discovery processes could be fallible.)
The ideas in this section are elaborations of some of the ideas in Chappell and Sloman (2007).
___________________________________________________________________________________
This is deeply connected with a Kantian theory of causation. See our 2007 'WONAC' presentations http://www.cs.bham.ac.uk/research/projects/cogaff/talks/wonac/.
[Added 27 Oct 2011]
It is also connected with our discussion of "internal" precursors to the use of language for
communication -- in pre-verbal humans, in pre-human ancestors and in other species.
E.g. see Sloman Talk52 on Evolution of minds and languages.
If we treat language learning as a special case of something more general, found also in pre-verbal children and in other species that can see, think, plan, predict, and control their actions sensibly, that may give us new clues as to the nature of language learning.
It may be useful to distinguish
In other cases, the new entities postulated are not contained in the old ones, for example, when an organism that initially has sensory and motor signals and seeks regularities in recorded relationships, including co-occurrences and temporal transitions, later adds to the ontology additional objects that are not parts of the available signals but are postulated to exist in another space, which can have (possibly changing) projections into the sensory space. One extremely important example of this would be extending the ontology to include objects that exist independently of what the organism senses, and which can be sensed in different ways at different times. The former is a somatic ontology, the latter an exosomatic ontology.
An example, going from sensory information in a 2-D discrete retina to assumed continuously moving lines sampled by the retina, or even a 3-D structure (e.g. rotating wire-frame cube) projecting onto the retina, is discussed in http://www.cs.bham.ac.uk/research/projects/cogaff/misc/simplicity-ontology.html
Ontologically non-conservative transitions refute the philosophical theory of concept empiricism (previously refuted by Immanuel Kant), and also demolish symbol-grounding theory, despite its popularity among researchers in AI and cognitive science.
They also defeat forms of data-mining that look for useful new concepts (or features) that are defined in terms of the pre-existing concepts or features used in presenting the data to be learnt from. (Some work by Stephen Muggleton, using Inductive Logic Programming may be an exception to this, if some of the concepts used to express new abduced hypotheses, are neither included in nor definable in terms of some initial subset of symbols.)
It is easy to see that the integers (though not the positive integers alone) with addition, and also the rational numbers, both form groups.
That's because it is possible to discover later that some newly discovered mathematical structure is a group, e.g. a set of translations of 3-D structures, with composition as the group operator.
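What it means to recognise a structure as a group can be sketched with a small checker. This is a minimal sketch under stated assumptions: the function name and the samples are hypothetical, translations are represented as integer 3-tuples composed by component-wise addition, and the check runs only over a finite sample, so it can refute the claim that something is a group but cannot prove it.

```python
from itertools import product


def satisfies_group_axioms_on_sample(sample, op, identity, inverse, contains) -> bool:
    """Spot-check identity, inverses, closure and associativity on a finite sample.
    Passing is only evidence; a proof has to reason about all elements."""
    items = list(sample)
    if not contains(identity):
        return False
    for a in items:
        if op(a, identity) != a or op(identity, a) != a:
            return False
        if not contains(inverse(a)) or op(a, inverse(a)) != identity:
            return False
    if not all(contains(op(a, b)) for a, b in product(items, repeat=2)):
        return False
    return all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(items, repeat=3))


def add(a, b):
    return a + b


def compose(s, t):
    """Composition of two translations, each given as an integer 3-tuple."""
    return tuple(x + y for x, y in zip(s, t))


integers_ok = satisfies_group_axioms_on_sample(
    range(-3, 4), add, 0, lambda a: -a, lambda a: isinstance(a, int))
positives_ok = satisfies_group_axioms_on_sample(
    range(1, 5), add, 0, lambda a: -a, lambda a: isinstance(a, int) and a > 0)
translations_ok = satisfies_group_axioms_on_sample(
    [(1, 0, 0), (0, 2, -1), (3, 3, 3)], compose, (0, 0, 0),
    lambda t: tuple(-x for x in t), lambda t: len(t) == 3)

print(integers_ok, positives_ok, translations_ok)  # True False True
```

The same abstract test applies unchanged to numbers and to translations, which is the sense in which the abstraction floats free of whichever exemplars first suggested it.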
Many mathematical abstractions go beyond the exemplars that led to their discovery. In fact the discovery may be triggered by relatively simple cases that are much less interesting than cases discovered later. The initial cases that inspired the abstraction may be completely forgotten and perhaps not even mentioned in future teaching of mathematics.
This use of abstraction in mathematics is often confused with use of metaphor. Unlike use of abstraction, use of metaphor requires the original cases to be retained and constantly referred to when referring to new cases, whereas an abstraction can float free of the instances that triggered its discovery.
Some of the problems are discussed in more detail in
And related documents referenced in those.
That is a result of very bad philosophy of science. I'll outline some alternatives.
Much research on children (and other animals) is restricted to looking at patterns of responses to some experimenter-devised situation. This is like trying to do zoology or botany only by looking in your own garden, or doing chemistry only by looking in your own kitchen. It is based on a failure to appreciate that many of the most important advances in science come from discovering what is possible, i.e. what can occur, as opposed to discovering laws and correlations. This is explained in more detail in Chapter 2 of The Computer Revolution in Philosophy (1978) http://www.cs.bham.ac.uk/research/projects/cogaff/crp/chap2.html
How to discover relevant possibilities: First try to find situations where you can watch infants, toddlers, or older children play, interact with toys, machines, furniture, clothing, doors, door-handles, tools, eating utensils, sand, water, mud, plasticine or anything else.
Similar observations of other animals can be useful, though for non-domesticated animals it can be very difficult to find examples of varied and natural forms of behaviour. TV documentaries available on Cable Television and the like are a rich source, but it is not always possible to tell when scenarios are faked.
Some videos that I use to present examples are here: http://www.cs.bham.ac.uk/research/projects/cogaff/movies/vid More examples are presented or referenced below. Some are still in need of development: more empirical detail and more theoretical analysis of possible mechanisms.
This discussion of explanations of possibilities is also relevant:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/explaining-possibility.html
[To be continued.]
Sometimes that requires thinking like a mathematician, as illustrated below in several examples -- a designer needs to be able to reason about the consequences of various design options, in a way that covers non-trivial classes of cases (as opposed to having to consider every instance separately).
That often involves discovering, and reasoning about, invariants of a class of cases. For example, an invariant can be a feature of a diagram that supports reasoning about all possible circles or all possible triangles, in Euclidean geometry. Usually that does not require the diagram to be accurate.
When children are taught to measure angles of a collection of triangles to check the sums of the angles, they are NOT being taught to think like a mathematician.
Sometimes people who are not able to think like a designer or a mathematician resort to doing
experiments (often on very small and unrepresentative groups of subjects). I have compared
that with doing Alchemy, here:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/alchemy/
(Is education research a form of alchemy?)
Unfortunately, the educational experience of many researchers includes neither learning to think like a mathematician nor learning to think like a designer.
E.g. many people who can state Pythagoras' theorem, or the triangle sum theorem have no idea how to prove either, and in some cases don't even know that proofs exist, as opposed to empirical evidence obtained by measuring angles, areas, etc.
[Note]
A sustained onslaught against bad science and bad philosophy can be found in
Chapter 12 of David Deutsch's
The Beginning of Infinity: Explanations that Transform the World. I
think his criticisms apply to much psychological and neural research on
mathematical competences in humans and other species -- done by scientists who
would not know how to give a robot such competences.
[To be continued.]
I suspect that most animals never achieve use of a global Euclidean ontology with global metrics, but that does not stop them seeing things and using vision to select goals and plans and to control their movements, and predict movements of others.
I also suspect that in humans the uses of global metrics and coordinate frames result from long periods of using something more primitive, and that it requires a special education that was not available to our ancestors thousands of years ago to be able to think of all lengths (angles, areas, volumes, speeds, etc.) as comparable using a common metric for each quantity. But even without that education toddlers are very effective in coping with most of their normal environments. How is that possible?
Instead of global coordinate systems, perhaps they use less precise and general, but somewhat more complex ontologies based on use of networks of partial orderings augmented with semi-metrical extensions, which use the fact that even without global metrics, it is possible for differences (e.g. in length, angle or area) to be compared, even when absolute values are not available. E.g. The pine tree is taller than the lamp-post by an amount that is greater than the height of the lamp post, but less than the height of the tree between them.
I suspect that is a deep error, and that for many biological organisms instead of probabilities the ontology includes
I.e. if P1 is the possibility of an agent A moving with heading H colliding with the door frame, and P2 is the possibility of A passing through the doorway without collision, A may know that H makes P2 more likely than P1, which is why the heading H was selected. (For more on this sort of reasoning see "Predicting affordance changes".)
However if P3 is the possibility of disliking the food available to A at the next feeding opportunity, there may be no basis for deciding whether P3 is more or less likely than either of P1 or P2. The heading H, which affects the relative likelihood of P1 and P2 will normally be considered irrelevant to P3, even though there may be a theoretical connection, e.g. if A gets seriously injured colliding with the door frame, there may be medical restrictions on food offered during a recovery process. This information may not be available to A, and even if it is available it need not be sufficient to derive a likelihood ordering. Even the most knowledgeable scientist may be incapable of doing that, mainly because the question which is more likely has indeterminate semantic content, since so many different possible but unspecified contexts can affect the comparison.
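The contrast between a partial ordering of likelihoods and a scale of probabilities can be sketched as follows. This is a minimal illustration, assuming the agent's knowledge is just a set of "more likely than" facts; the names KNOWN and compare are hypothetical. No numbers are attached to P1, P2 or P3, and a comparison for which there is no basis returns None instead of a forced answer.

```python
from typing import Optional, Set, Tuple

Pair = Tuple[str, str]  # (x, y) means "x is more likely than y"


def transitive_closure(pairs: Set[Pair]) -> Set[Pair]:
    """All orderings derivable by chaining the known 'more likely than' facts."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def compare(x: str, y: str, known: Set[Pair]) -> Optional[str]:
    """Return '>' if x is more likely than y, '<' if less, None if there is no basis."""
    closure = transitive_closure(known)
    if (x, y) in closure:
        return ">"
    if (y, x) in closure:
        return "<"
    return None


# With heading H, passing through the doorway (P2) is more likely than collision (P1).
KNOWN: Set[Pair] = {("P2", "P1")}

print(compare("P2", "P1", KNOWN))  # >
print(compare("P3", "P1", KNOWN))  # None
```

Adding information about a context (for example the medical consequences of a collision) would correspond to adding pairs to the known set, after which some previously incomparable possibilities may become comparable.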
One aspect of intelligence is the ability to think of contexts that affect the relative likelihoods of possibilities under consideration: that is also a key component of mathematical and engineering design competence.
For discussion of non-metrical aspects of perception of affordances
(possibilities and impossibilities instead of probabilities, and use of partial
orderings instead of scalar measurements) see
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/changing-affordances.html
Predicting Affordance Changes
(Steps towards knowledge-based visual servoing)
I suspect that what actually goes on in learners, which is misinterpreted by the Bayesian theorists, is much more subtle and much closer to discoveries of useful equivalence classes, e.g. classes about which a form of mathematical reasoning can be used. When we find out how to give machines ways of constructing those equivalence classes and ways of reasoning about them, our robots will be far more intelligent and human-like -- or animal-like -- than they are now. NB: I am not an expert on Bayesian mechanisms and may have misunderstandings and gaps in my knowledge.
A simple example: a child who can count, and can turn a coin over while counting, may discover
a relationship between the starting position (heads or tails up), the number of turns and
the final position. Initially that may be an empirical discovery, and may even be
expressed probabilistically if the child makes counting errors. But later on the child
will be able to work out what the result must be on the basis of whether the number
of turns is odd or even. There are more complex examples below.
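The two stages in the coin example can be contrasted in a minimal sketch (the function names are hypothetical). The first function finds the result by performing every turn, as in the empirical stage; the second works out what the result must be from the parity of the number of turns, without performing any of them.

```python
def simulate_turns(start: str, turns: int) -> str:
    """Find the final face by actually turning the coin over, one turn at a time."""
    face = start
    for _ in range(turns):
        face = "tails" if face == "heads" else "heads"
    return face


def final_face_by_parity(start: str, turns: int) -> str:
    """Work out the final face from whether the number of turns is odd or even."""
    if start not in ("heads", "tails"):
        raise ValueError("start must be 'heads' or 'tails'")
    if turns < 0:
        raise ValueError("turns must be non-negative")
    other = "tails" if start == "heads" else "heads"
    return start if turns % 2 == 0 else other


print(final_face_by_parity("heads", 7))  # tails
```

Agreement between the two functions on any number of trials is still only empirical evidence; the second function is justified by the observation that two successive turns always restore the starting face.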
(That's too vague: this is work in progress.)
Some of the examples illustrate portions of the process of information re-organisation (perhaps instances of what Karmiloff-Smith means by "Representational Redescription"?).
The list of examples in this document is a tiny sample. I shall go on extending it. (Contributions welcome.)
NOTE:
The order of the examples presented here is provisional.
Later I'll try to extend the list and impose a more helpful structure.
At first very young children playing with 'lift out' toys like these find it difficult to insert a cut-out picture into its recess, even if they remember which recess it came from.
E.g. They put the picture down in approximately the right place and if it doesn't go in they may press hard, but not attempt any motion parallel to the picture surface.
After a while they seem to learn that both the recesses and movable objects have boundaries, and that when flat objects are brought together the boundaries may or may not be merged.
At this more advanced stage, a child may place the picture object in roughly the right place and then try sliding and rotating until it falls into the recess.
Still later, the child realises that boundaries can be divided into segments and that a segment of the recess boundary may match a segment of the object boundary, and may then try to insert the object by first ensuring that matching segments are adjacent and then slightly varying the location and orientation of the piece until it falls into the recess.
Long before they can do this, I suspect they can insert a circular disc into a recess, since there is no problem of alignment. If there are different discs and recesses of different sizes the insertion requires size and location to be perceived and used in controlling the insertion process. When the items are not symmetrical, inserting requires
There are similar problems stacking cups, except that in addition to the shape of boundary, the size can be very important, and children may have to learn to order the sizes in order to ensure that all the cups can be stacked. There are probably many intermediate discoveries that can be made and used, some of them red-herrings because they only work by accident in certain conditions, or because they allow a cup to be stacked but prevent ALL cups being stacked, e.g. placing the smallest cup on or in the largest cup.
See the short, tentative, discussion in this PDF presentation:
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/math-order-stacking-sloman.pdf
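The role of size ordering in cup stacking can be sketched under a deliberately simple assumption: cups are represented only by a size number, a cup can be added only if it is smaller than the innermost cup already in the nest, and no cup is ever taken out again. The function name and the sizes are hypothetical.

```python
from typing import List, Tuple


def nest_in_order(order: List[int]) -> Tuple[List[int], List[int]]:
    """Try to place cups into the nest in the given order.
    Returns (nest from outermost to innermost, cups that could not be placed)."""
    nest: List[int] = []
    left_over: List[int] = []
    for cup in order:
        if not nest or cup < nest[-1]:
            nest.append(cup)       # fits inside the current innermost cup
        else:
            left_over.append(cup)  # too big for the innermost cup
    return nest, left_over


cups = [3, 1, 7, 5, 2, 6, 4]

# Ordering by decreasing size lets every cup be stacked.
print(nest_in_order(sorted(cups, reverse=True)))  # ([7, 6, 5, 4, 3, 2, 1], [])

# Putting the smallest cup straight into the largest blocks all the others.
print(nest_in_order([7, 1, 2, 3, 4, 5, 6]))       # ([7, 1], [2, 3, 4, 5, 6])
```

The second call shows a "red-herring" strategy of the kind mentioned above: each individual placement succeeds when made, yet it prevents all the cups being stacked.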
Fiona McNeill provided this example of a domain still being explored and only partially understood by the child, in March 2009:
"She has stacking cups that go inside one another that she loves to play with. Until recently, getting them to go in the right order was more or less a case of trial and error, but she has just made a big step forward.
She is now very good at noticing 'holes' - so if she has, say, cups 2,3,5,6 all stacked, and 1,4,7 loose, she will immediately remove cups 2 and 3, recognising with no apparent effort that something needs to go between them and the bigger cups 5 and 6.
However, she seems to have no concept of relative size and will, seemingly, pick up either 1, 4 or 7 with equal probability to put them in this hole, not perceiving that 7 is clearly too big to go into 5 or that 1 is clearly too small to fill the hole.
I would have thought that judging relative size, when there is a fairly large difference in the sizes, would be far more instinctive than noticing that cup 3 is a little loose in cup 5, which is not immediately obvious to the eye. Apparently not!
She also does not have the concept of 'largest object'. If she starts off by picking up the biggest cup (cup 10), she will try to fit it into all the others, and when it will not, instead of trying to fit something into it, she tries again and again to fit it into another one, getting increasingly frustrated. I usually put it down for her and put another in it, and then she is happy to go on putting cups into it, but she has not got this for herself yet."
This toddler (age about 17.5 months) seems to be exploring topology. She spontaneously crawled towards the sheet of card while holding a pencil, picked up the card, pushed the pencil through the hole, pulled the pencil out, moved the pencil up and over the edge of the card while rotating the card toward the pencil then pushed the pencil through the hole from the opposite side, then removed the pencil, reverted to the original side and finally pushed the pencil in then pulled it out again.
Note: this 'gif video' may not work for you in this context. It can also be
viewed in this video, which includes a commentary and some slow motion:
small-pencil-vid.webm
(Old video replaced 19 Aug 2017)
This appears to be a case of "architecture-based" motivation discussed above. There is no need for such behaviour to be generated by anticipation of any kind of reward, although in special cases it could be. But this child seemed to be merely reacting to opportunities in her environment. There were adults and an older child in the room but the toddler seemed not to be paying attention to any of them, and certainly did not appear to be seeking signs of approval during or after her performance.
NOTE: Manipulating the pencil and card, and getting the pencil into the right position and orientation to push it through the card from each side would be a significant challenge for a robot. There is no evidence that she had previously been practising this action with a pencil and a hole in a card, though of course she had pushed other objects through holes, in very different physical circumstances. Note that when moving the pencil over towards the second side of the card she does not even look at the pencil, as she is peering over the card at the 'other' side of the hole. Yet she not only moves the pencil toward the new side of the card, while doing so she also automatically rotates it into the required new orientation. This seems to suggest a good grasp of the 3-D structure of the space she is in and how to move things around in space to achieve some of her goals. Her grasp of space is not perfect as she sometimes has trouble rotating 3D objects into the right orientation to fit through a hole, e.g. rotating a triangular prism to fit through a triangular hole.
This child's ability to talk was still very limited: she could produce some very short sentences in understandable English, and could understand more. However it seems clear that she had complex intentions that her actions were designed to achieve that were beyond her spoken linguistic capabilities, e.g. getting the point of the pencil to the hole, on three different occasions, rotating the card until she can see the hole from the other side, getting the point of the pencil through the hole from that side, removing the pencil, etc. It is very unlikely that those goals could have been expressed in terms of the required sensory and motor signals -- that level of detail would be far too specific: she must have had some more abstract internal language for specifying a state of affairs, which she could use both in order to bring about that state of affairs (by deriving control processes from the specification of the goal) and to check whether it had been achieved, so that a new task could be adopted. There is no reason to suspect that the intended actions were planned in full metrical detail in advance -- an alternative form of representation using partial orderings is sketched in
Of course, similar comments can be made about many other intelligent animals that do not show any sign of using human languages, including nest-building birds, squirrels defeating squirrel-proof bird feeders, parrots able to rotate a nut to a desired orientation by alternately holding it in beak and foot, hunting mammals bringing down prey and then extracting food from the interior, and many more. For a discussion of issues related to evolution of vision and language and conjectures about precursors of human language see this presentation: Talk111.
Walking
(To be added)
Falling backwards
(Reported by Michael Zillich, April 2009. Name of toddler changed.)
"LLLL last week suddenly learned to walk. It seems she figured that handling her little suitcase while crawling was too cumbersome and so just stood up and walked, carrying the suitcase around for hours :) Now she also walks on quite uneven ground outside.
One really nice detail: She is quite good at maintaining balance (briefly stopping to regain it when necessary) and at using her hands (and bottom) to cushion falls, in case balance is truly lost.
But when she is in our bed, with soft cushions and blankets, she loves to stand up straight and simply let herself fall backwards, with a relaxed sigh. She knows she can only do this in bed. We did not teach or show to her (I am too tall to do that) so she had to figure that out herself. And she seems to enjoy the "thrill" of losing control."
It's very hard to find out what such a child does and does not understand by asking questions, though Piaget tried hard (see his last two books, on Possibility and Necessity, translated 1987).
One could (though probably should not) invite the child to try the same action on different surfaces, e.g. a lawn, a hard floor with a thick or thin carpet, a bed on which he is standing, a sandpit, etc.
If the child is old enough to discuss such possibilities, probing questions may or may not reveal the stage of theory development (as opposed to skill development). If it's too early for verbal interrogation there may be no substitute for long-term observations of spontaneous actions in a playground, perhaps a special purpose playground with different surfaces (and close supervision).
Such research should not be corrupted by spurious requirements to collect statistics about what happens when. It is what can happen that is important for deep science, and how those possibilities emerge, and how they are constrained. (Piaget understood this, but many of his critics did not.)
Three children on a trampoline
I watched three children on a trampoline. The youngest seemed to be pre-verbal
though he could walk and climb. The oldest was a boy who might have been four or
five years old. In between, was a girl who seemed to be at an intermediate age
(and size).
At one stage the girl started going head over heels on the trampoline: jumping in such a way that her hands and head hit the trampoline with her trunk going over. The other two were intrigued.
The little one seemed to want to do something inspired by her tumbling, but did not seem to know what to do. He jumped around a bit stepping with alternate feet on the trampoline then seemed to give up.
The older boy seemed to know that he had to do something about getting his head down, but at first merely made clumsy and ineffectual movements. (I wish I had had a video recorder.) After a few attempts he seemed to realise what was necessary, and managed to go head over heels several times, rather clumsily at first and then apparently with greater understanding of the combination of movements needed to initiate the tumble, after which momentum and gravity could complete the process.
I don't think any of them could express in a human communicative language what they had learnt, but clearly the information structures they created internally included something that could function as a goal specification, as a control strategy for actions to achieve the goal, as a critical evaluation of early attempts, and as a debugging process to modify the details of the action so as to complete and "clean up" the final desired action.
Modelling this on a robot (possibly simulated -- to reduce the risk of damaging expensive equipment!) would not be trivial. The process involves a mixture of fine control with ballistic action and requires sufficient understanding to manage the initial controlled movements in such a way as to launch the right kind of ballistic action.
It does not seem to me that these children are making use of something like the standard statistical AI approach to learning which requires a space of motor (or sensory-motor) signals to be sampled using statistics (and perhaps hill-climbing) to direct the search, possibly using a numerical evaluation/reward function. I suspect they are using richer and more varied information structures in a complex self-improving control architecture.
Karen Adolph's work is also relevant: Adolph (2005)
It is important to distinguish the acquisition of
Riding the back of a sofa
(Added 12 Dec 2013)
Bob Durrant provided this example.
"She can straddle the back of the sofa without toppling it over.
Facing right w.r.t the front of the sofa, if she wants to get down from the sofa to the rear of it from a straddling position, then (using her right hand to support herself) she rotates clockwise on her bottom about 90 degrees to bring her left leg over and slides off, using her bottom as a brake to control rate of descent, to land standing up.
If she wants to get off the other way then she either does as above (with her right leg and left arm) to end up standing on the cushions or, because it is more fun, she instead lifts over her left leg and she tumbles backwards on to the cushions.
She has never, as far as I know, tried to tumble the other way (i.e. over the back of the sofa, with a fall of about twice as far on to the carpeted floor).
Prior to this she did similar from the arms of the sofa and armchair, again without ever (intentionally) tumbling the wrong way as far as I am aware."
Playing on a slide
-- trying to throw a teddy-bear to a child at the top of the slide.
-- walking up a slope while holding onto a rope attached near the top.
[To be continued]
____________________________________________________________________________
This is an example of matter manipulation, a type of competence that subsumes tool-use and many other things that have been studied in children and other animals.
A broom can be thought of as a "tool for shifting dirt on a floor", but in the video it is not being used in that way. Rather the child appears to be moving the broom around for its own sake, rather than for the sake of some other effect.
Such matter-manipulation sometimes has utilitarian functions (e.g. obtaining food, putting on clothes, getting hold of some object that is out of reach) but need not have. With or without serving an explicit goal of the manipulator, the processes seem to be a pervasive type of activity in very young children and also some other animals.
Presumably this is because playful, exploratory manipulation can provide much information about, for example:
Suppose it is formed from a stretched rubber band held in place by pins.
There are many ways the shape, size, orientation and location of the triangle could be transformed, by moving the pins.
Think of some possible changes do-able by moving one, or two or all three pins, and for each change try to work out its consequences.
That is an easy task for a mathematician since much of mathematics is a result of the human (animal?) ability to look at something and think about how it could be changed, and what the consequences would be.
Most humans do it often in everyday life, e.g. when considering rearrangements of furniture.
The ability to do this develops slowly and erratically in children -- and in cultures! See also (Piaget & others, 1981, 1983)
Among the many possible ways you could alter the triangle (e.g. moving or rotating the whole thing), there is one that involves moving only one pin parallel to the opposite side, in either direction, e.g. moving the top pin here, parallel to the opposite side (the "base").
Another possibility involves moving the top pin up or down, perpendicular to the opposite side.
Can you see any interesting difference between those two sets of possible changes to the configuration?
One set of changes will increase or decrease the total area of the interior of the triangle.
The other set of changes will leave the area of the triangle unchanged.
Can you see why that must be so? Here's the explanation:
If you don't recognize what's going on, try reading this introduction to thinking about triangles and their areas: http://www.cs.bham.ac.uk/research/projects/cogaff/misc/triangle-theorem.html
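The contrast between the two sets of changes can also be checked numerically, though a numerical check is not the kind of insight into necessity that this discussion is about. The following is only a minimal sketch in Python. It assumes the base pins lie on the x-axis and uses the standard shoelace formula; the function name `triangle_area` and the coordinates are illustrative choices, not taken from the original diagram.

```python
def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c (shoelace formula)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0

# Two pins fix the base on the x-axis; the third pin is the apex (hypothetical values).
base_left, base_right = (0.0, 0.0), (4.0, 0.0)
apex = (1.0, 3.0)

# Move the apex parallel to the base: the height is unchanged, so the area is too.
parallel_areas = [
    triangle_area(base_left, base_right, (apex[0] + dx, apex[1]))
    for dx in (-5.0, 0.0, 2.5, 10.0)
]

# Move the apex perpendicular to the base: the height changes, so the area changes.
perpendicular_areas = [
    triangle_area(base_left, base_right, (apex[0], apex[1] + dy))
    for dy in (-1.0, 0.0, 1.0, 2.0)
]

print(parallel_areas)       # [6.0, 6.0, 6.0, 6.0]
print(perpendicular_areas)  # [4.0, 6.0, 8.0, 10.0]
```

The check covers only the sampled positions. The geometrical argument covers every position of the apex on the line parallel to the base.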
The crucial point about such a diagram is that (like all diagrams used in proofs in Euclidean geometry) the relationships perceived in the diagram do not depend on the specific size, shape, colour, location, orientation, etc. of the figure drawn.
They don't even depend on the diagram being drawn accurately (with perfectly thin, perfectly straight lines). That's because once the proof is understood correctly its scope covers a very large class of abstraction. It's not clear that people not trained in mathematics can easily think that way.
There's an interesting 'bug' in the proof-sketch as shown in the diagram which is related to the need to do proper case analysis. It's a simple example of the sort of phenomenon discussed by Imre Lakatos in Proofs and Refutations, mentioned below. The bug in the 'chocolate' theorem, discussed below, is another example. Identifying the bug is, for now, left as an exercise for the reader, though mathematicians will find it obvious. Max Wertheimer discussed an analogous bug in a proof given by a school teacher regarding the area of a parallelogram, described in his book Productive Thinking. More examples of buggy, but fixable, proofs are given below.
[The relationship between this sort of bug and the problems a child has in handling exceptions to grammatical rules in language may be illuminating, as regards information processing architectures and mechanisms required.]
This human ability to reason about necessary consequences of alterations to configurations in the environment may be closely related to Kenneth Craik's hypothesis that some animals can use internal models of the environment to work out consequences of possible actions. (Craik, 1943)
Compare also (Karmiloff-Smith, 1992), and Piaget's work on possibility and necessity, and also Kant's philosophy of mathematics (Kant 1781).
Work that remains to be done includes finding out how a child, or non-human animal, or future robot, could notice that some collection of structures and processes forms a domain that has interesting properties, including invariants that are discoverable by reasoning about the structures and relationships, how the relationships can be discovered and supported by a non-empirical argument, how different domains can be combined to form new domains of expertise, and how all of this can lead to the phenomena of Representational Redescription discussed by Karmiloff-Smith.
We also still need to understand how to get robots and other learning machines to go through similar procedures. See also:
http://www.cs.bham.ac.uk/research/cogaff/96-99.html#15 A. Sloman, Actual Possibilities, in Principles of Knowledge Representation and Reasoning: Proc. 5th Int. Conf. (KR `96), Eds. L.C. Aiello and S.C. Shapiro, Morgan Kaufmann, Boston, MA, 1996, pp. 627--638.
Added 11 Sep 2013
____________________________________________________________________________
Based partly on ideas by Mary Pardoe developed while she was teaching children mathematics. Here's an extract from that discussion:
ADDED 10 Sep 2012, Updated 9 Apr 2013:
A more detailed analysis of requirements for discovering theorems in geometry
is: "Hidden Depths of Triangle Qualia"
http://tinyurl.com/BhamCog/misc/triangle-theorem.html
____________________________________________________________________________
If you have a rubber band (elastic band), some pins, and a board into which the pins can be stuck, you can make figures by using the pins to hold the band stretched into a shape bounded by straight lines (if the band is stretched between the pins).
The following are sample questions about what is possible, what is impossible, and how many pins or rubber bands are needed to make something possible.
For example, can you make a triangle, a square, or an outline capital "T" with one
rubber band and a set of pins?
Is it possible to make an outline capital "A" ?
Is it possible to make a circle?
Is it possible to make a star-shaped figure, with alternating convex and concave corners?
What's the minimum number of pins required for that?
How can you be sure?
For more examples see
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/rubber-bands.html
http://www.cs.bham.ac.uk/research/projects/cogaff//talks/#toddler
Here's a video of a child feeding yogurt to his belly, his thigh and a carpet, and doing several kinds of experiment with yogurt and spoon, presumably feeding his mind, though he probably does not know that: http://www.cs.bham.ac.uk/research/projects/cosy/conferences/mofm-paris-07/sloman/vid/yogurt-experiments-10mths.mpg
There are more videos with very short comments that need to be expanded, here: http://www.cs.bham.ac.uk/research/projects/cosy/conferences/mofm-paris-07/sloman/vid/
For a PDF presentation on learning about different kinds of 'stuff' see
http://www.cs.bham.ac.uk/research/projects/cogaff/talks/#brown
From "baby stuff" to the world of adult science: Developmental AI from a Kantian viewpoint.
(Talk at Brown University 2009)
"Today might be much more hotter than it usually bees"
More generally, the phenomena of "U-shaped" language learning provide many clues as to what goes on when information fragments acquired empirically are transformed into a "deductive" system, when the system needs to be capable of handling exceptions -- unlike the systems of topology, geometry, and other kinds of proto-mathematical knowledge.
Consider a slow moving van and a fast moving racing car. They start moving towards each other at the same time.
The racing car on the left moves much faster than the van on the right: Whereabouts will they meet -- more to the left or to the right, or in the middle?
One five year old answered by pointing to a location on the left, somewhere near "b" or "c".
Me: Why?
Child: It's going faster so it will get there sooner.
What produces this answer? Could it be:
Here are some fragments that may have been learnt, but perhaps without all their conditions for applicability fully articulated.
The first premiss is a buggy generalisation: it does not allow for different kinds of "race".
The others have conditions of applicability that need to be checked.
Perhaps the child had not taken in the fact that the problem required the racing car and the van to be travelling for the same length of time, or had not remembered to make use of that information.
Perhaps the child had the information (as could be tested by probing), but
lacked the information-processing architecture required to make full and
consistent use of it, and to control the derivation of consequences properly?
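For comparison with the child's answer, the calculation that uses the "same length of time" condition can be sketched as follows. This is only an illustration under idealised assumptions: both vehicles start at the same moment from opposite ends of a track and move at constant speeds. The function name `meeting_point` and the numbers are hypothetical, not taken from the original problem.

```python
def meeting_point(distance, speed_left, speed_right):
    """Distance from the left end at which the two vehicles meet.

    Assumes both start at the same moment from opposite ends and travel
    towards each other at constant speeds (an idealisation).
    """
    # Both travel for the same time, so the gap closes at the sum of the speeds.
    time_to_meet = distance / (speed_left + speed_right)
    return speed_left * time_to_meet

# Hypothetical numbers: 100 m track, racing car (left) at 30 m/s, van (right) at 10 m/s.
x = meeting_point(100.0, 30.0, 10.0)
print(x)  # 75.0
```

With these numbers the meeting point is three quarters of the way along, i.e. nearer the van's starting point on the right. This is the opposite of the child's answer, and it depends entirely on the condition that both vehicles travel for the same time.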
____________________________________________________________________________
Is Vygotsky's work relevant?
Some parts of Piaget's theory of "formal operations"?
Compare Karmiloff-Smith on "Representational Redescription",
discussed in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/beyond-modularity
Could the child's reasoning be evidence for a process of representational redescription that is still incomplete? That is, generally useful items of information that can be recombined in different contexts have been extracted from the collection of empirically learnt associations, but the conditions for recombination, and the constraints on applicability of inferences, have not yet been discovered. In principle, this looks like a type of learning that could be modelled in terms of construction of a rule-set capable of supporting deductive inference.
(I think Richard Young's PhD thesis around 1972 was concerned with a process
something like this, but involving ordering of objects by height.)
____________________________________________________________________________
Today, our daughter Ada (named for Lovelace), who turned 2 earlier this month, said "Kitties have tails. I do not have a tail. I'm not a Kitty."
Is it possible that a two year old has grasped the general principle that from premises
    Xs have Ys
    A doesn't have Y
it follows that
    A is not an X?
Some initial thoughts about this:
It is possible that children start noticing patterns of related truths and only later, as a result of some form of "representational redescription" (see this discussion of Annette Karmiloff-Smith's work), grasp the general principles.
During the transition various partial competences may be displayed. This is consistent with the theory of "meta-configured" competences in Chappell & Sloman 2007.
Ada may be a highly precocious and unusual logician. Another option is that she was not making an inference, merely noticing that there's an important structural relationship between the three assertions. On Karmiloff-Smith's theory, learners can develop a high level of empirical competence before they do the structural reorganisation that allows old generalisations to become 'theorems' (not her word) along with many that become derivable only after the old information structures are replaced with new "generative" forms.
Investigating a young mind is a very difficult thing to do. Non-performance in tests generally proves nothing at all, and even successful performance can be hard to interpret.
We could try delicately, and tactfully, probing, by finding a way to introduce structurally similar new examples to see if she draws similar conclusions. E.g. Wombles can talk. Kitty can't talk. Is Kitty a womble?
We can also try delicately to set up situations in which other logical patterns arise and find out when she does and when she doesn't draw new conclusions.
(Presumably "I'm not a Kitty" wasn't a new discovery at that moment. So she may merely have added it as an interesting observation, not an inference. I think that's part of Vygotsky's theory of development.)
Compare the fallacious reasoning about the racing car and van reported above. The child's 'representational redescription' to support mathematical reasoning about motion and relative speeds was not yet complete.
There may be no normal patterns of development: only individual trajectories through complex terrain, some of them possibly shared with other species that can never tell us (or each other) what they have learnt. So perhaps Ada had reached an unusual 2-year old grasp of at least a subset of logic.
    P or Q
    not-P
    Therefore Q
He was assembling a jigsaw puzzle with help from an adult. Together they had reached the stage where there were two pieces left and two gaps in the puzzle. He picked up one piece and tried fitting it into one of the holes in various orientations, and failed. He then tried the other hole and succeeded. After that, in an exaggerated ceremonious mode he picked up the last piece and as he moved it towards the remaining hole announced "So this piece must go .... here".
Why did he not say "So this piece goes here"? Perhaps there was some sort of understanding that the previous success had made something impossible, leaving only one option when there had previously been two. Alternatively, he may simply have noticed the difference between previous situations where each piece could potentially fit into several holes, requiring tests to be done to select the right one (which in some cases can be done perceptually, when a shape is very unusual) and the new situation where there is only one option.
This discussion is merely intended to indicate that we may not have good theories about possible transitions in a child's mind, and therefore are not in a position to use evidence to support one theory.
A note on logic and rules
Logical correctness is often mistakenly regarded (e.g. by philosophers) as conformity
with some set of rules. But that cannot be right.
Making logical inferences of the sort we are considering always involves noticing that something is impossible. What makes it impossible is not conformity to some rules but structural relationships within the example.
Logicians (starting with Aristotle, or some of his predecessors) notice some kinds of impossibility that other people detect and use unthinkingly (e.g. the impossibility of P or Q true, P false, Q false). So a rule gets formulated: if P or Q is true and P is false, then Q *must* be true. Similar things happen in the discovery of geometrical theorems.
But the rules do not explain the necessity. They merely express discovered generalisations. There are many different philosophical theories about what to say next, including the theory that we create the mathematical truths by adopting the rules. Any working mathematician knows that's false, as did Kant.
(There's much more to be said about this.)
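The impossibility mentioned above (P or Q true, P false, Q false) can be confirmed mechanically by enumerating all truth-value combinations. The sketch below is in Python. It shows only that such a check is possible for the propositional case; it is not offered as a model of how a child or a logician notices the impossibility. The helper name `is_valid` is an illustrative choice.

```python
from itertools import product

def is_valid(premises, conclusion, n_vars=2):
    """True if no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # counterexample found
    return True

# Disjunctive syllogism: P or Q, not-P, therefore Q.
disjunctive_syllogism = is_valid(
    premises=[lambda p, q: p or q, lambda p, q: not p],
    conclusion=lambda p, q: q,
)

# A tempting but invalid pattern: P or Q, P, therefore not-Q.
invalid_pattern = is_valid(
    premises=[lambda p, q: p or q, lambda p, q: p],
    conclusion=lambda p, q: not q,
)

print(disjunctive_syllogism, invalid_pattern)  # True False
```

The enumeration detects the impossibility but, like the rule, does not explain it. The explanation lies in the structure of the four cases being enumerated.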
Does starting from a different configuration change what is possible? Can you get from configuration (a) below to configuration (b), using only diagonal moves?
The next one is harder:
How people work on such problems differs according to prior knowledge and experience.
Sometimes proving that something is impossible can be done by exhaustive search (though understanding the need to ensure that the search is exhaustive is an achievement, as is organising the search so as to ensure exhaustiveness).
A different kind of competence can lead to a much more economical explanation of why the task is impossible.
The core characteristic of mathematical thinking, which frequently motivates new developments in mathematics, is productive laziness, which I suspect begins to develop between ages 1 and 3 years.
This is a case where the advance of knowledge involves noticing that a particular problem is a special case of a general type of problem.
(If a problem is too hard to solve, trying a harder one sometimes gives new insights.)
If you have not noticed the easy way to solve the above problems consider what difference it would make if the squares were black and white, as on a chess board. Mathematicians can use the notion of "parity" here. E.g. giving squares coordinates, they can be divided into two classes: those whose coordinates sum to an even number and those whose coordinates sum to an odd number. The squares in a horizontal or vertical line will have alternating parity.
Squares in a diagonal line will have the same parity. This makes it very easy to check whether a start configuration can be transformed to a target configuration.
Normally such discoveries are made only by adult or bright mathematical learners. My point is that a young child could learn some of the generative facts about the diagonal moving coin domain by playing. Using a two-colour grid will make some things easier to learn. (Why?)
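The parity argument can be sketched in a few lines of Python. The configurations below are hypothetical and do not reproduce the figures referred to above. The check is a necessary condition only: it ignores board edges and coins blocking one another, which a full analysis would have to consider.

```python
def parity(square):
    """0 if the coordinates sum to an even number, 1 if they sum to an odd number."""
    row, col = square
    return (row + col) % 2

def diagonal_neighbours(square):
    """The four squares reachable from `square` by one diagonal move."""
    row, col = square
    return [(row + dr, col + dc) for dr in (-1, 1) for dc in (-1, 1)]

def may_be_reachable(start, target):
    """Necessary condition: both configurations have the same number of coins
    on each parity class (each 'colour' of a chess board)."""
    return sorted(map(parity, start)) == sorted(map(parity, target))

# A diagonal move changes row and column by one each, so parity never changes.
print(all(parity(n) == parity((3, 4)) for n in diagonal_neighbours((3, 4))))  # True

# Hypothetical configurations of two coins.
start = [(0, 0), (0, 2)]            # both coins on even squares
print(may_be_reachable(start, [(1, 1), (2, 2)]))  # True: both targets even
print(may_be_reachable(start, [(1, 1), (0, 1)]))  # False: one target is odd
```

On a two-colour grid the same check needs no arithmetic at all: a coin that starts on a black square can only ever be on a black square.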
See
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/orthogonal-competences.html#blanket
____________________________________________________________________________
If they cannot find such an object, but they understand what it is about the
screwdriver or spoon that makes it a suitable tool, some of them will notice the
possibility of using the lid of another tin instead of a screwdriver, to lift
the stuck lid.
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/orthogonal-competences.html#lids
____________________________________________________________________________
What sequence of movements could get the shirt onto the child if the shirt is made of material that is flexible but does not stretch much? Why would it be a mistake to start by pulling the cuff over the hand, or pushing the head through the neck-hole? What difference would it make if the material could be stretched arbitrarily without being permanently changed?
Search for: Mr Bean, Rowan Atkinson, trousers, beach, or watch
this video:
http://www.youtube.com/watch?v=ZWCSQm86UB4
The figure comes from this paper on 'Diagrams in the mind':
http://www.cs.bham.ac.uk/research/projects/cogaff/00-02.html#58
I am not aware of any AI system that can be mystified in this way, let alone one
that can enjoy being mystified, as children can be. For examples see:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/rings.html
There are many more puzzles shown at the "MrPuzzle" web site http://www.mrpuzzle.com.au/, for example: http://www.mrpuzzle.com.au/images/ropes.jpg
Dealing with such puzzles requires the ability to think about transformations of physical objects that preserve topology, involving flexible inelastic strings, beads, discs, and various rigid objects with holes and slots through which string and other things can pass.
In many cases it is also important to make use of non-topological relationships such as relative size (e.g. a bead is too large to pass through a hole, and a string loop is too short to pass over the far edge of an object).
In such cases, an important kind of discovery is how an alteration that does not transform the topology can transform a metrical relationship. E.g. pulling part of a string from one portion of the puzzle to another portion can increase the size of a loop until some object can pass through it that previously could not.
For each class of puzzle there can be a wide range of possible actions to consider. In particular the learner may need to discover:
There seem to be many different domains/microdomains a learner can explore: including the possible processes associated with a particular puzzle, the possible processes associated with a class of puzzles, and the possibilities created by combining features of different puzzles.
For more on such puzzles and formal reasoning about them see
Pedro Cabalar and Paulo E. Santos, Formalising the Fisherman's Folly puzzle, AIJ, 175,1,pp 346--377, 2011 http://www.sciencedirect.com/science/article/pii/S0004370210000408
NB Looking at the sophisticated logical formalism developed in that paper to enable a computer to reason about such puzzles it seems clear that what their AI system does is very different from what a logically and mathematically naive human might do when looking at the same puzzles and thinking about actions that would change relationships, e.g.
"If I push that disk through the slot, I shall then be able to slide the ring up over the top of the post, but..."
Such thoughts seem to make intrinsic use of the structure of the perceived scene in something like the way described in Sloman 1971.
(The 1971 paper made a distinction between "Fregean" representations, where all syntactic complexity represents application of functions to arguments, and "Analogical" representations in which parts of the representation represent parts of what is represented, and properties and relations within the representation represent properties and relations within the thing represented. It is often assumed that analogical representations must make use of isomorphisms, but the paper showed that that is not true. In particular, a given syntactic property or relation (in the representation) can have different semantic functions in different contexts, representing different properties and relations in the scene depicted. That's trivially obvious for 2-D representations of 3-D scenes, since isomorphism is impossible in that case.)
These questions are all related to the question: what sort of understanding of the puzzle (and what form of representation of that understanding) allowed the authors to discover the axioms that characterise it well enough to be used by an AI system? This is also related to the problem of how our ancestors perceived, thought and reasoned about spatial structures and relationships before Euclidean geometry had been codified, and even longer before Cartesian coordinates were used to represent geometry arithmetically and algebraically.
It seems very likely that those pre-Euclidean and pre-Logical forms of representation and reasoning are still used, unwittingly, by young children and by other animals with spatial intelligence, e.g. nest-building birds and hunting animals.
Numerical competences are widely misunderstood, in part because of a failure to distinguish what could be called "Numerosity", which can be detected as a perceptual feature (related to area, or volume), from cardinal number or cardinality, which is inherently concerned with one-to-one mappings (bijections). To a first approximation, this is a difference between recognising a pattern based on two measures (density plus area or volume) and applying a sequential procedure (algorithm) that produces a "result" -- e.g. the result of counting elements of a set. It is also possible in some cases to parallelise (parallelize) that algorithm, e.g. by getting a collection of people all to sit on chairs and then seeing whether any chairs or any people are left over; or checking that two collections of dots are linked by lines where every line joins a dot in one set with a dot in the other, and no dot has more than one line ending on it.
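The contrast can be made concrete with a small Python sketch. Both functions are illustrative assumptions rather than models from the literature. `estimate_numerosity` combines two measures into an estimate. `compare_by_pairing` establishes (or refutes) a one-to-one correspondence between people and chairs without using any number concepts at all.

```python
def estimate_numerosity(density, extent):
    """Numerosity as a pattern based on two measures: items per unit extent, times extent."""
    return density * extent

def compare_by_pairing(people, chairs):
    """Pair items off one at a time and report which collection, if either, has leftovers.

    No counting is involved: the procedure only removes one item from each
    collection until at least one collection is empty.
    """
    people, chairs = list(people), list(chairs)
    while people and chairs:
        people.pop()
        chairs.pop()
    if people:
        return "people left over"
    if chairs:
        return "chairs left over"
    return "one-to-one"

print(compare_by_pairing(["Ann", "Bob", "Cy"], ["c1", "c2", "c3"]))  # one-to-one
print(compare_by_pairing(["Ann", "Bob", "Cy"], ["c1", "c2"]))        # people left over
print(estimate_numerosity(density=2.5, extent=4.0))                  # 10.0
```

The pairing procedure gives an exact answer about equality of cardinality. The density-based estimate gives only an approximate magnitude and says nothing about one-to-one correspondence.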
To complicate things, the difference between numerosity and cardinality is much less sharp when the numbers involved are small. (But it is not unusual for different mathematical sequences to have a common "limit".) Another complication is that both numerosity and cardinality are different from, but closely related to various notions of measurement along a scale, used in science and engineering such as the "ordinal", "interval" and "ratio" scales distinguished by S.S. Stevens in 1946 as explained in http://en.wikipedia.org/wiki/Level_of_measurement Unfortunately, he also applied a notion of 'scale', which he called a "nominal scale", to an inherently un-ordered collection of labels.
Another distinction that can be made among scales is between orderings (using relations "more", "less" and "same") that are discrete, as in sizes of families, and those that allow continuous variation, as in length, area, volume, mass, etc. The orderings need not be total since some cases may be incomparable, in which case a "partial" ordering exists. It is possible to define these concepts with great precision, but for people who are unfamiliar with the required formal concepts it is easy to confuse the different sets of relationships, or worse, to assume that there is one concept of number which an individual does or does not have.
One of the features of toddlerhood is that the early stages of all of these important and importantly different systems of concepts develop without the learners or their parents or teachers having any idea that such a complex set of structures is being constructed.
And that's before there's any learning about negative numbers, fractions or a mathematically precise notion of the real continuum.
At present I don't think we have an adequate collection of information-processing models to represent the different processes of construction in different domains (e.g. tactile, auditory, visual, and motor control domains) and the powerful mechanisms of abstraction that unify them into different families, so that, for example, 'more' and 'less' can be applied to height, width, angle, area, spatial volume, rotation, linear or rotational speed, weight, force, acoustic properties (e.g. pitch and volume), motor properties (pushing, pulling, twisting, or bending more or less hard, etc.), and grasping the differences between processes where becoming more or less are continuous processes and those where they must be discrete, both of which allow reduction to a "zero" case (an empty set, an infinitely small length or angle or speed) or the opposite extreme (getting "more and more" X indefinitely).
Most empirical or modelling research latches onto some small subset of relationships in this rich and tangled (but ordered!) network without the researchers understanding what they are not attending to.
For now I'll end with a few comments on two sets of concepts that are particularly often confused, or if the difference is noticed it is not described accurately.
When the set of items exists in the environment that estimate can be right or wrong: there will be a definite number of them.
But when the items are experiences, e.g. experienced sounds, or texture elements, the sophistication of the perceptual processing mechanisms in producing these experiences may not allow there to be a definite number of elements. For example, even if there is a definite number of stars and planets visible in the sky from a particular location, it does not follow that a human or other animal looking at that sky has a definite number of starry experiences. (This is one of several reasons why an information-processing account of "qualia" requires a kind of detail that's missing from every theory of mind I have ever encountered.)
One problem with a concept of numerosity based on combining (a) an ability to detect and estimate density and (b) an ability to detect and estimate some sort of spatial or temporal extent (of a linear interval, an area, a volume, a temporal interval, etc.) is that when the density varies across the items, then an average density has to be computed to get a measure for the whole set. Since density is already an average, that requires averaging a spatially varying average -- a non-trivial computation. Another problem is detecting whether two densities or two areas are the same. The larger the areas the harder it may be to compare densities accurately. In particular, the harder it is to tell if A's numerosity is greater than B's. So more dots may need to be added to a large collection to make the size difference noticeable. This means that the graph of perceived numerosity against actual cardinality flattens out as cardinality increases. This may take a logarithmic form.
(I have no idea whether anyone has actually investigated which of these computations brains are capable of, for which modes of sensory input.)
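The flattening mentioned above can be made concrete under one extra assumption that is not stated in the text: that the smallest noticeable increase in a collection is a fixed fraction k of its current cardinality N (a Weber-law form). Writing P for perceived numerosity, and counting each just-noticeable step as one unit of P, a rough derivation is:

```latex
\Delta N_{\text{noticeable}} \approx kN
\quad\Longrightarrow\quad
\frac{dP}{dN} \approx \frac{1}{kN}
\quad\Longrightarrow\quad
P(N) \approx \frac{1}{k}\,\ln\frac{N}{N_0}
```

Here N_0 is a hypothetical reference cardinality at which P is taken to be zero. The symbols k, P and N_0 are illustrative only; whether perceived numerosity actually follows this form is an empirical question.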
If a child (or animal) with an ability to estimate numerosity as described above perceives two groups G1 and G2 which differ in both size and density, comparing numerosity is much harder than where G1 and G2 have the same density, or the same extent. If the density is roughly uniform within each group, and if the perceiver can compute numerical values for both density and area or volume, then the two numbers can be multiplied to provide an estimate of numerosity. The ability to multiply seems to require a prior grasp of numbers, but that can be avoided if the multiplication is done by dedicated, domain specific machinery. In that case, there can be no comparison of numerosity of a sequence of heard sounds and numerosity of dots scattered around an area.
However when both numbers are small they can be compared directly by some form of counting, or setting up a one to one correspondence between the sounds and the dots. That will show if there are more of one than the other. So in that case the ability to estimate cardinality directly removes the need to compute numerosity by performing a multiplication of density and extent.
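A minimal sketch of comparison by one to one correspondence, in Python. The function name and the example data are hypothetical. The point is that neither collection is counted and no density or extent is multiplied: items are simply paired off until one collection runs out.

```python
def compare_by_pairing(sounds, dots):
    """Compare two collections by pairing items off, without counting either."""
    remaining_sounds = iter(sounds)
    remaining_dots = iter(dots)
    unmatched = object()  # sentinel meaning "nothing left to pair"
    while True:
        s = next(remaining_sounds, unmatched)
        d = next(remaining_dots, unmatched)
        if s is unmatched and d is unmatched:
            return "same"          # both ran out together
        if s is unmatched:
            return "more dots"     # a dot was left without a partner
        if d is unmatched:
            return "more sounds"   # a sound was left without a partner


print(compare_by_pairing(["beep", "beep", "beep"], ["dot"] * 4))  # more dots
```

Because the pairing works on any two sequences, it applies across modalities (heard sounds against seen dots), which the dedicated multiplication machinery described above would not allow.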
It seems that humans can compute and compare numerosities from quite an early age (e.g. before being able to count), but they get better as they grow older (and presumably have more experiences of numerosity judgements), and also gradually get a better meta-cognitive understanding of what they are doing. Before that, as Piaget showed, they can display extraordinary confusions because they don't yet have a concept of cardinality as something that is conserved as objects are packed closer together or spread out more.
If the distribution of items in the space is highly irregular the task of comparing numerosities can become very difficult, and in some cases deceptive. There's a lot more to be said about numerosity, but for now the main point is that it is a totally different concept from cardinality, which is fundamentally connected with the notion of a one to one mapping, and researchers who don't make this distinction often write as if there were just one concept of number.
The following seems to be a fairly standard (but mostly unnoticed by researchers??) way of acquiring cardinality competences, though these components are not learnt in sequence, but interleaved:
A child with those competences organised into a deductive system has the basis for making an infinite collection of new discoveries.
E.g. If counting Xs produces the number 5, what will happen if they are counted in the opposite direction? At a certain stage the child will not know, without trying. The answer is discovered empirically.
At a later stage the child will think that the question is stupid. What exactly is that transition? Does anyone have any idea what changes in brains are required to produce that insight?
I conjecture that much of what happened in our ancestors to enable them to make
these discoveries is still going on unnoticed in young children (and some other
animals) as they play with, and gain various kinds of mastery over, their
environments. In that case, by the time we start teaching mathematics to
children in school we are using sophisticated apparatus about which teachers
know nothing, or very little. So they have no idea why or how their teaching
works. Neither do developmental psychologists.
_______________________________________________________________________________
If all the strings connecting objects on the left
have their ends swapped, the same objects will
still be connected by the same strings.
If the connections on the left ends of strings are
preserved, but the right ends are detached and
rearranged, how many different ways are there of
connecting the ends on the right to objects on the
right?
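One way to explore the question about rearranging the right-hand ends is to enumerate the possibilities for a small case. The sketch below assumes (the text does not say so) that each object on the right receives exactly one string, so that every reconnection is a one to one mapping. The function name and the three-object example are hypothetical.

```python
from itertools import permutations
from math import factorial


def right_end_connections(right_objects):
    """List every way to attach the strings' right ends, one string per object."""
    # Position i in each tuple is the right-hand object given to string i.
    return list(permutations(right_objects))


ways = right_end_connections(["a", "b", "c"])
for w in ways:
    print(w)
print(len(ways))  # 6
```

Under that assumption the number of ways for n strings is n factorial, which grows very quickly, so exhaustive listing is only practical for small n.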
_______________________________________________________________________________
This section has been moved into a separate file, where it will be (gradually) extended (but not indefinitely!).
A child given a set of wooden cube-shaped blocks can do all sorts of experiments -- exploring the space of processes involving the blocks.
Then the child may notice that attempts to rearrange some configurations, e.g. a configuration of 11 blocks, into a rectangular arrangement always fail: What kind of experimentation can that provoke, and what sorts of discoveries can be made?
How could one be sure that there is NO way of arranging the last collection into a rectangular array, apart from the straight line shown?
Could such a child discover the concept of a prime number?
Could the child rearranging blocks discover and articulate the fundamental theorem of arithmetic? (The unique factorization theorem.)
Are some forms of mathematical discovery impossible without a social environment?
Don't assume a teacher with prior knowledge of the theorems has to be involved: someone must have made some of these discoveries without being told them by a teacher.
NOTE 1 One of the fundamental requirements for mathematical thinking is being able to organise collections of possibilities and making sure that you have checked them all.
If you can't do that you don't have a mathematical result, only a guess.
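For the block-rearranging example above, the collection of possibilities can be organised so that none is missed. The sketch below is illustrative only (the function name is hypothetical): it tries every number of rows from 2 upward and stops once rows would exceed columns, since any rectangle beyond that point is a rotated copy of one already tried.

```python
def rectangular_arrangements(n_blocks):
    """Return all (rows, cols) with 2 <= rows <= cols that use every block."""
    found = []
    rows = 2
    while rows * rows <= n_blocks:       # beyond this, rows would exceed cols
        if n_blocks % rows == 0:         # blocks divide evenly into full rows
            found.append((rows, n_blocks // rows))
        rows += 1
    return found


for n in (11, 12):
    print(n, rectangular_arrangements(n))
# 11 []
# 12 [(2, 6), (3, 4)]
```

An empty result for 11 blocks means that the single straight line is the only rectangular arrangement, which is what makes 11 prime. A child cannot run this program, of course; the open question in the text is what mechanisms let a child reach the same certainty.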
NOTE 2 (9 Aug 2012) I have just discovered that this kind of discovery of primeness by a computer program was discussed in
It is not always noticed that without the sophisticated apparatus of modern mathematics many measures form only partial orderings.
E.g. at a certain stage areas or volumes may be comparable only if one shape can fit entirely inside another. So a long thin rectangle and a circle whose radius is less than the length and greater than the breadth of the rectangle are not comparable in area, at that stage. (As far as I know this was ignored by Piaget and all the researchers inspired by his work.)
For example, several different competences are required in order to rank the areas A, B, C and D in the following figure.
Someone who can accurately visualise the effect of moving one bounded area while another remains fixed, or who can cut out the area and move it onto another, may discover that area A can fit entirely inside B. So the area of A is less than the area of B.
However, the shape A cannot be contained in C, and C cannot be contained in A. Moreover, C cannot be contained in B, and B cannot be contained in C. This means it is impossible to rank shapes A, B and C in area on that criterion. They form only a partial ordering relative to the containment criterion.
Someone who has solved the non-trivial problem of assigning measures of area to rectangular shapes, and then discovered that that can be extended to a way of assigning measures to triangles:
    area = half(base x height)
might then discover (how?) that any area bounded by straight edges (i.e. any polygon) can be systematically divided into triangles, so that the area can be computed by triangulation, followed by summing all the areas of the triangles. That will enable each of the three shapes A, B, and C to be given a numerical measure of area, instead of just a partial ordering of spatial extent defined in terms of containment.
But a polygon can be divided into triangles in different ways, so the argument assumes that different triangulations of the same total area will produce triangles whose sums are all the same. Is that obviously true? (It may seem to be obvious if you start from the assumption that the measure of area of an arbitrary shape is uniquely defined. But that assumption requires justification. In fact there is a lot of non-trivial mathematics concerned with investigation of things that seem obvious to non-mathematicians.)
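The question about different triangulations can at least be tested on an example. The sketch below uses a hypothetical convex pentagon and the "fan" triangulation (all triangles share one anchor vertex), repeated with each vertex as anchor, so that five different triangulations are summed. The coordinates and function names are illustrative assumptions.

```python
def triangle_area(p, q, r):
    """Area of a triangle from its vertex coordinates (half the cross product)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2


def fan_area(polygon, start):
    """Sum the triangle areas of the fan triangulation anchored at vertex `start`."""
    n = len(polygon)
    ordered = polygon[start:] + polygon[:start]  # rotate so the anchor comes first
    return sum(triangle_area(ordered[0], ordered[i], ordered[i + 1])
               for i in range(1, n - 1))


# A hypothetical convex pentagon, vertices listed anticlockwise.
pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 2)]
areas = [fan_area(pentagon, k) for k in range(len(pentagon))]
print(areas)  # [20.0, 20.0, 20.0, 20.0, 20.0]
```

Agreement on one pentagon is an empirical check, not a proof: it does not show that every triangulation of every polygon must give the same total, which is exactly the kind of justification the paragraph above asks for.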
If we attempt to generalise the notion of area to a region not bounded by straight lines, like figure D, then there is no way to convert that region into a set of triangles. Our simple partially ordered notion of relative area defined by containment can still be used. For example, figure A can be re-located to fit entirely inside D, though that may not be obvious to everyone.
If, however, we wish to extend the notion of a measure of area, so as to provide a total ordering of areas that includes shapes with curved boundaries, like D, then a different approach is required. In fact it requires the use of integral calculus and concepts of limits of infinite series, which were invented by geniuses like Newton and Leibniz and not fully clarified until the mathematics of the 19th Century. (Some might say: not even then!).
There are also problems about the justification for talking about cardinality of large collections of objects (like the visible stars on a clear night, or the leaves on a big tree) where we do not have any chance of counting them, e.g. because they exist for a very short time, or because they are in constant motion, or for some other reason.
All this means that when researchers ask whether children or animals have concepts of size or number they often have no idea of the variety of interpretations that their question can have, with different answers being appropriate to the different interpretations. It is probably fair to say that most members of the adult population of any country on this planet lack well-defined concepts of area and volume. (It may be assumed that area and volume can always be defined in terms of the results of weighing, but that typically assumes the notion of uniform density, which in turn assumes notions of weight and volume.)
It is not clear which of these competences (relating to cardinality, mappings and measures) a child can acquire without help. The ontologies required, the invariants, and the applications, all must have been discovered originally piecemeal, perhaps in inconsistent fragments, without help, and then organised into a shared system through some collaborative process, probably over many generations, long before Euclid's time. I don't know if we'll ever find definitive evidence for those aspects of our pre-history. But perhaps we can replicate some of them in future intelligent robots. And if we look carefully, asking the right questions, we may be able to see some of the fragments in child development, though not all fragments will necessarily appear in all children: there are many routes through this maze of ontologies.
There is further discussion on related topics in this 2010 workshop paper: http://www.cs.bham.ac.uk/research/projects/cogaff/10.html#1001
Alan Bundy has reminded me that some children learn from clock faces and other structures that it is possible to do a kind of counting that goes up to a certain number and then re-starts from 1, for instance reciting the numbers on an old-fashioned clock face.
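For mathematicians, this is a special case of 'modulo' arithmetic, namely arithmetic in which there is only a finite set of numbers and counting beyond the largest number always starts again from the beginning (from 1 on a clock face, or from 0 in the usual mathematical convention, which is used in the examples that follow).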
For example, 3+4 modulo 5 is 2, 3+4 modulo 6 is 1, 3+9 modulo 6 is 0.
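The two conventions (restarting at 0, as in the arithmetic examples, and restarting at 1, as on a clock face) can be sketched in a few lines of Python. The function names are hypothetical.

```python
def add_mod(a, b, modulus):
    """Addition modulo `modulus`, with results in 0 .. modulus-1."""
    return (a + b) % modulus


def clock_count(n, faces=12):
    """Number recited at the n-th step when counting round a clock face (1 .. faces)."""
    return (n - 1) % faces + 1


print(add_mod(3, 4, 5), add_mod(3, 4, 6), add_mod(3, 9, 6))  # 2 1 0
print([clock_count(n) for n in range(10, 15)])               # [10, 11, 12, 1, 2]
```

The clock version shifts by one before and after taking the remainder, so that 12 is followed by 1 rather than 0.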
If we assign numerical coordinates to rows and columns of a chess board, then associate each square on the board with the sum of its coordinates, then the bottom left 3x3 corner would have these numbers:
    4 5 6
    3 4 5
    2 3 4
If instead each square is associated with the sum of its coordinates modulo 2, then the bottom left corner would have a different collection of associated numbers with new symmetries:
    0 1 0
    1 0 1
    0 1 0
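Both grids can be regenerated with a short sketch, assuming rows and columns are numbered from 1 with row 1 at the bottom (an assumption, though it is the numbering that reproduces the figures given). The function name is hypothetical.

```python
def corner_sums(size=3, modulus=None):
    """Rows of coordinate sums for the bottom-left corner, top row first."""
    rows = []
    for row in range(size, 0, -1):           # highest row first, as on a board
        values = [row + col for col in range(1, size + 1)]
        if modulus is not None:
            values = [v % modulus for v in values]
        rows.append(values)
    return rows


for r in corner_sums():
    print(*r)
print()
for r in corner_sums(modulus=2):
    print(*r)
```

With `modulus=2` the result is the familiar alternation of the two colours of a chess board; other moduli (e.g. 3) give diagonal stripes with different symmetries.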
This is an old and well known puzzle to which new "wrinkles" are added below.
You have a slab of chocolate in the form of a 7 by 7 square of pieces divided by grooves, and you want to give 49 friends, each one piece. You have a knife that can cut along a groove. What is the minimum number of groove cuts that will divide the bar into 49 pieces?
RULES FOR CHOCOLATE CHOPPING: Stacking or overlaying two or more pieces, or abutting two pieces, to divide them both with one cut is not allowed: each cut is applied to exactly one of the pieces of chocolate.
The puzzle draws attention to a domain of processes of subdivision of a rectangular array into its component elements by a succession of linear slices.
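The domain of subdivision processes can be explored by simulation. The sketch below assumes the rules as stated (each cut runs along one groove right across exactly one rectangular piece, and the slab has no holes) and chooses the cuts at random. The function name and the use of a seeded random generator are illustrative choices.

```python
import random


def cuts_needed(rows, cols, rng):
    """Cut a rows x cols slab into unit pieces in a random order; return the cut count."""
    pieces = [(rows, cols)]   # each piece is a rectangle (height, width)
    cuts = 0
    while pieces:
        h, w = pieces.pop()
        if h == 1 and w == 1:
            continue          # a single square needs no further cutting
        # Every groove crossing this piece is a candidate for the next cut.
        options = [("h", k) for k in range(1, h)] + [("v", k) for k in range(1, w)]
        direction, k = rng.choice(options)
        if direction == "h":
            pieces += [(k, w), (h - k, w)]
        else:
            pieces += [(h, k), (h, w - k)]
        cuts += 1
    return cuts


rng = random.Random(0)
counts = {cuts_needed(7, 7, rng) for _ in range(1000)}
print(counts)  # {48}
```

Every random order gives 48. The simulation only reports that regularity; the reason is that under these rules each cut turns exactly one piece into two, so going from 1 piece to 49 takes 48 cuts whatever the order. Seeing that reason, rather than observing the repeated result, is the difference between a mathematical and an empirical discovery.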
So they can make empirical discoveries but cannot make mathematical discoveries.
It's an exception because the original argument assumed that every cut divides one piece into two pieces.
With holes, is it a slab or isn't it?
Often a proof in mathematics that seemed valid works for a range of cases, but has counter-examples not thought of when the proof was constructed, or when it was checked.
Many such examples connected with the history of Euler's theorem about plane polyhedra were discussed in this famous book.
Imre Lakatos: Proofs and Refutations: The Logic of Mathematical Discovery, Cambridge University Press, 1976
One of the consequences of our ability to perceive, imagine, or create instances of novel possible configurations is that we can sometimes create new configurations that refute our mathematical conjectures, generalisations or even proofs.
This is different from the empirical refutation of "All swans are white", which turned up in Australia.
In defining the problem, I had not noticed the need to specify that every cut must go from one boundary point to another: i.e. no cuts may begin or end at a point that is completely surrounded by chocolate.
This example illustrates the relationship between (a) simple everyday activities, and variations that are clearly intelligible to ordinary people with no knowledge of abstruse mathematics, and (b) deep concepts from topology.
Alison Sloman later pointed out that the counter-example might have been ruled out in advance by requiring portions of the slab to be broken rather than cut.
It is important not to inflate Lakatos' argument in Proofs and Refutations as demonstrating that there is never any real progress in mathematics, or that mathematics is empirical.
On the contrary, every mistake that leads to a revision of a definition, or a statement of a theorem, or a proof adds to our mathematical knowledge: mathematicians can make non-empirical discoveries without being infallible.
The most important philosophical point arising out of his survey of the history of Euler's theorem about polyhedra is, arguably, that just because mathematical knowledge is about necessary truths, not contingent truths, is not empirical, and is also not trivial (analytically provable on the basis of definitions plus logic and nothing else), it does not follow that mathematical discovery processes are infallible. On the contrary, mathematicians can make mistakes, and can often discover that they have made mistakes, and patch them.
The same is true of toddlers who (unwittingly) discover and use theorems.
The other video was of a toddler standing near the left edge of a closed door holding a credit card so that it was in the vertical slot between the door and its frame. He smoothly moved the card up and down in the slot. Then, apparently unprompted, he noticed the slot on the opposite edge of the door, inserted the card there, and moved it up and down smoothly. The first configuration required his arm to move up and down roughly in front of him. Because he did not move across to the opposite edge, the second action involved his right arm being extended away to the right, producing a very different geometric configuration and pattern of changes of joint angles and forces required to move the card vertically.
He did not seem to need to learn how to produce the new motion. My guess is that he was not controlling the card by aiming to modify joint angles or aiming to produce specific sensory motor signal patterns. Rather in each case he knew in which direction (in 3-D space) the card had to move, and because it was constrained by the slot it was in, all he had to do was apply a force roughly upward or downward using a compliant grip that allowed the sides of the slot to provide the required precision (a toddler theorem). Applying a vertical force requires different motor signals in different arm positions but visual and haptic/proprioceptive feedback would suffice to control the motion.
I asked Alex what would happen to the robot if it were moved some way to one side, so that turning the crank required a new collection of angles, forces, etc. He said it would fail and would have to be retrained.
I presume that's because the robot had not worked out the toddler theorem that to move a crank handle you need to keep adjusting the force so that it is in the plane of rotation but perpendicular to the line from the axle to the handle. Instead, all it had learnt was statistical correlations in its sensory-motor signals. It was stuck with a somatic ontology, whereas it needed an exosomatic ontology, in order to exercise off-line intelligence, as discussed above.
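The crank "theorem" can be stated in exosomatic terms with a few lines of geometry. The sketch below works in 2-D coordinates within the plane of rotation, and takes only the axle and handle positions in the world as input; nothing about the robot's own position or joint angles appears. The function name and the example positions are hypothetical.

```python
import math


def tangential_force_direction(axle, handle):
    """Unit vector in the plane of rotation, perpendicular to the axle-to-handle line."""
    rx, ry = handle[0] - axle[0], handle[1] - axle[1]
    length = math.hypot(rx, ry)
    # Rotating the radius by 90 degrees anticlockwise gives the tangent direction.
    return (-ry / length, rx / length)


axle = (0.0, 0.0)
for degrees in (0, 90, 225):
    angle = math.radians(degrees)
    handle = (math.cos(angle), math.sin(angle))
    fx, fy = tangential_force_direction(axle, handle)
    radial_part = fx * handle[0] + fy * handle[1]   # zero up to rounding error
    print(degrees, (fx, fy), radial_part)
```

Because the required direction depends only on where the handle is relative to the axle, moving the robot to one side changes how the force must be produced but not which force is needed.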
The little boy almost certainly used an exosomatic ontology both in formulating his goals and in controlling his actions. Why did he want to perform those actions? I expect that was an example of architecture-based, not reward-based, motivation, described in Sloman (2009).
Note added 21 Apr 2015
Alex and colleagues have a paper on a robot learning to slide a credit card in a
vertical gap:
http://home.engineering.iastate.edu/~alexs/papers/ICRA_2012/ICRA_2012.pdf
Learning to Slide a Magnetic Card Through a Card Reader
Vladimir Sukhoy, Veselin Georgiev, Todd Wegter, Ramy Sweidan, and Alexander Stoytchev
Presented at ICRA 2012, with an associated video:
http://home.engineering.iastate.edu/~alexs/papers/ICRA_2012/ICRA_2012_video.mp4
Here is a more recent video reporting on work in his lab.
http://www.sciencechannel.com/tv-shows/brink/videos/brink-robots-become-human/
_____________________________________________________________________
Things you probably know, but did not always know:
The idea of an "aspect graph" can be viewed as a special case of a domain of actions related to changing epistemic affordances (as defined above).
That's not normally how aspect graphs are presented. Normally the aspect graph of an object is thought of as a graph of topologically distinct views of the object linked by minimal transitions. For example as you move round a cube some changes in appearance will merely be continuous changes in apparent angles and apparent lengths of edges, but there will be discontinuities when one or more edges, vertices or faces go in or out of view. In the aspect graph all the topologically equivalent views are treated as one node, linked to neighbouring nodes according to which movements produce new views, e.g. move up, move down, move left, move diagonally up to the right, etc. For a non-convex object, e.g. an L-shaped polyhedron, the aspect graph will be much more complex than for a cube, as some parts may be visible from some viewpoints that are not connected by visible portions.
Here's a useful introduction by Barb Cutler:
http://people.csail.mit.edu/bmcutler/6.838/project/aspect_graph.html
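For the cube, the standard aspect graph described above is small enough to construct directly. The sketch below makes several simplifying assumptions that are not in the text: the cube is axis-aligned, the viewer is outside it, and viewing is by perspective, so that a face is visible exactly when the viewpoint lies beyond that face's plane. Each node is then a region of viewpoint space, labelled by which faces are visible from it.

```python
from itertools import product


def cube_aspect_graph():
    """Aspect graph of an axis-aligned cube seen from outside, under perspective."""
    # A viewpoint region is labelled per axis: +1 (beyond the + face, so that face
    # is visible), -1 (beyond the - face), or 0 (between the two face planes).
    nodes = [v for v in product((-1, 0, 1), repeat=3) if v != (0, 0, 0)]
    edges = set()
    for a in nodes:
        for b in nodes:
            diffs = [i for i in range(3) if a[i] != b[i]]
            # Minimal transition: crossing one face plane changes one label
            # between 0 and +/-1, so exactly one face appears or disappears.
            if len(diffs) == 1 and 0 in (a[diffs[0]], b[diffs[0]]):
                edges.add(frozenset((a, b)))
    return nodes, edges


nodes, edges = cube_aspect_graph()
by_faces = {k: sum(1 for v in nodes if sum(map(abs, v)) == k) for k in (1, 2, 3)}
print(len(nodes), by_faces, len(edges))  # 26 {1: 6, 2: 12, 3: 8} 48
```

Under these assumptions there are 26 nodes (6 one-face views, 12 two-face views, 8 three-face views) and 48 minimal transitions. Note that the graph here is derived from a description of the cube's shape rather than learnt from examples of views.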
Some vision researchers have considered using aspect graphs for recognition purposes: a suitably trained robot could see how views of an object change as it moves, and in some cases use that to identify the relevant aspect graph, and the object. (Related ideas, without using the label "aspect graph" were used by Roberts, Guzman and Grape for perception of polyhedral scenes in the 1960s and early 1970s, though the scenes perceived were static.)
However, for complex objects aspect graphs can explode, and in any case, we are not concerned with vision but with understanding perceived structures. A perceiver with the right kind of understanding should be able to derive the aspect graph of an object, or fragments of it, from knowledge of the object's shape, and use that to decide which way to move to get information about occluded surfaces.
In 1973 Minsky introduced a similar idea for which he used the label "Frame system". http://web.media.mit.edu/~minsky/papers/Frames/frames.html
A few years ago, in discussion of plans for the EU CoSy project http://www.cognitivesystems.org/, Jeremy Wyatt suggested an important generalisation. Instead of considering only the effects of movements of the viewer on changing views of an object we could enhance our knowledge of particular shapes with information about how things would change if other actions were performed, e.g. if an object resting on a horizontal surface is touched in a particular place and a particular force applied, then the object may rotate or slide or both, or if there is a vertical surface resisting movement it may do neither.
This suggested a way of representing knowledge about the structure of an object and its relationships to other surfaces in its immediate environment, in terms of how the appearance of the object would change if various forces were applied in various directions at various points on the surface, including rotational forces.
This large set of possibilities for perceived change, grouped according to how the change was produced, we labelled a "Generalised Aspect graph". This would be even more explosive than the aspect graph as more complex objects are investigated. For various reasons, we were not able to pursue that idea in the CoSy project (though a subset of it re-emerged in connection with learning about the motion of a simple polyflap in work done by Marek Kopicki).
In currently favoured AI approaches to perception and action the standard approach to use of generalised aspect graphs would require a robot to be taught about them in some very laborious training process.
In the context of an investigation of "toddler theorems" the problem is altered: how can we give a robot the ability to understand spatial structures and the effects of forces on them so that instead of having to learn aspect graphs, or generalised aspect graphs, it can derive them, or fragments of them, on demand, as part of its understanding of affordances?
That, after all, is what a designer of novel objects to serve some purpose needs to be able to do.
However, in order to reduce the combinatorics of such a derivation process I suggest that the representation of objects used to work out how they would move should not be in terms of sensory-motor patterns (not even multi-modal sensory-motor patterns including haptic feedback and vision), but in terms of exosomatic concepts referring to 3-D structures in the environment and their surfaces and relationships, independently of how they are perceived.
Prediction of how a perceived scene would change if an action were applied would take two major steps: first of all deriving the change in the environment, and secondly deriving the effect of that change on the visual and tactile experiences of the perceiver. Among other things that would allow reasoning to be done about objects that are moved using other held objects, e.g. rakes, hammers, and also reasoning to be done about what other perceivers might experience: a necessary condition for empathy.
This is a complex and difficult topic requiring more discussion, but I think the implications for much current AI are deep, and highly critical, since so much work on perceiving and producing behaviour in the environment does not yield the kind of understanding provided by toddler theorems, an understanding that, later on, can grow into mathematical competence, when generalised and articulated.
Having discovered those possibilities an animal, or robot, can play with them, e.g. by trying various combinations of possibilities to find out what happens.
We can play in the environment, and we can play in our minds.
_______________________________________________________________________________
Both kinds of experimentation can increase know-how, and support faster
problem-solving, using patterns that have been learnt and stored. But we
need to account for the differences between learning that is empirical and
learning that is more like deductive reasoning, or theorem-proving. (As in
"toddler theorems" about opening and shutting drawers and doors, or pulling
a piece of string attached to something at the other end.)
____________________________________________________________________________
I noticed a very young child (age unknown, though he could stand, walk, and manipulate a hoop, but looked too young to be talking) playing with a hoop on a trampoline.
He seemed to have learned a number of things about hoops, including
Why is it easier to carry a tray full of cups and saucers using a hand at each side than using both hands on the same side?
Why is it easier with two grasp points than with only one?
___________________________________________________________________________________
For the purposes of research in intelligent robots, we have created an
artificial domain in which humans may have as much to learn as the robots,
and which can start simple, then get increasingly complex: the domain of
polyflaps. See
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/polyflaps
____________________________________________________________________________
It is very unlikely that Sofya has had to learn every possible combination of sensory inputs and motor outputs required to ascend the door-frame. Rather she has (almost certainly) grasped a number of general principles common to classes of states that can arise, using an exo-somatic ontology (i.e. referring not to what's going on inside her skin, but which surfaces are in contact with which and how the contact varies).
http://www.youtube.com/watch?v=cij-cT5ZkHo Early, partially successful attempts.
http://www.youtube.com/watch?v=FmH8jFLrwDU Fairly expert performance.
She never tries moving both feet up at the same time -- instead always ensuring that two hands and one foot are applying enough pressure to hold her up while she moves the other foot to a higher location. Has she discovered a toddler-theorem about how stable (or nearly stable) configurations differ from unstable ones?
There are also subtle ways in which she adjusts the pressures in order to start sliding down, as opposed to falling down.
It seems that this performance makes use of some learnt generalisations
about how things should feel and some more abstract inferences about how
things should be configured.
____________________________________________________________________________
What about doors?
Added 7 Aug 2013: ROBERT LAWLER'S VIDEO ARCHIVE
Bob Lawler has generously made available a large collection of video recordings of
three children over many years here: http://nlcsa.net/
I have not yet had time to explore the videos in any detail, but I expect there are many examples relevant to the processes and mechanisms involved in discovery of toddler theorems.
The first video I selected at random
http://nlcsa.net/lc1a-nls/lc1a-video/ "Under Arrest"
illustrated many different things simultaneously, including how two part-built information processing architectures at very different stages of construction, with an adult out of sight, could interact in very rich ways with each other, some physical some social, and to a lesser extent with the adult through verbal communication. The older child clearly has both a much richer repertoire of spatial actions and a much richer understanding of the consequences of those actions. He also has some understanding of the information processing of the other child, including being able to work out where to go in order to move out of sight of the younger child. However the younger child does not forget about him when he is out of sight but is easily able (thanks to the help of a wheeled 'walker') to alter her orientation to get him back in view.
How a child moves from the earlier set of competences to the later set is a question that can only be answered when we have a good theory of what sorts of information processing architectures are possible, and how they can modify themselves by building new layers of competence, in the process of interacting with a rich environment -- partly, though not entirely, under the control of the genome, as outlined in Chappell & Sloman (2007).
The ability to model such transitions in robots is still far beyond our horizon, despite all the shallow demonstrations of 'progress' in robot training scenarios.
PRESENTATIONS (PDF)
Maintained by
Aaron Sloman
School of Computer Science
The University of Birmingham
____________________________________________________________________________