Some of this text is derived from pages 10-13 of Sloman(2016), which is partly based on Chapter 8 of Sloman (1978).
A partial index of discussion notes on this web site is in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/AREADME.html
The ideas are not new: every point made here has (I suspect) been noticed already by some thinkers, though I am not aware of any attempt to bring them together.
The list of section headings indicates the scope of this document.
If a row of boxes contains N boxes and there are N+1 balls, each of which is in one of the boxes, how can you know that at least one box must contain more than one ball?
Different kinds of answer will be accepted as adequate or good, by people with different backgrounds and expertise, e.g. a primary school mathematics teacher, various sorts of mathematical researchers, computer scientists (especially those working on automatic reasoners/theorem provers), logicians, philosophers of mathematics of different kinds, and various sorts of researchers on mathematical cognition in psychology or neuroscience. I am not sure whether educational theorists will fit into one of these categories or offer different answers from any of the others.
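One kind of answer, of the sort a computer scientist might offer, is brute-force enumeration for small N. The following is only a sketch (the function name is invented for illustration): it tries every way of placing N+1 balls in N boxes and confirms that each placement leaves some box holding more than one ball. It checks finitely many cases and so does not explain why the result must hold for every N, which is the question at issue here.

```python
from collections import Counter
from itertools import product

def every_placement_has_crowded_box(n_boxes: int) -> bool:
    """Check all placements of n_boxes + 1 balls into n_boxes boxes."""
    n_balls = n_boxes + 1
    # Each placement assigns a box index to every ball.
    for placement in product(range(n_boxes), repeat=n_balls):
        occupancy = Counter(placement)
        if max(occupancy.values()) < 2:
            return False  # counterexample: no box holds two balls
    return True

for n in range(1, 6):
    print(n, every_placement_has_crowded_box(n))
```

The number of placements grows as N to the power N+1, so this approach is hopeless for large N, unlike the insight available to a human who understands the problem.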
I don't think any form of reasoning about numbers based on purely formal axioms and rules corresponds to the "direct" understanding of most 9 year olds -- though there may be exceptions (e.g. Escardo offspring?). That form of understanding wasn't available to Euclid and his predecessors either. The formal, rigorous, axiom-based interpretation is at most a couple of hundred years old, though anticipated earlier by Leibniz (1685): "...this language will be the greatest instrument of reason, ... when there are disputes among persons, we can simply say: Let us calculate, without further ado, and see who is right". Note the future tense "will be": he did not claim to have found such a language. Boole's and Peano's work came nearly two centuries later. Peano's axioms were published in 1889:
https://en.wikipedia.org/wiki/Giuseppe_Peano
Earlier, Kant (around 1781) pointed out that knowledge is not merely used in linguistic expressions and thoughts. There are also contents of perception, for example. Although linguistic and perceptual competences and mechanisms develop in parallel in humans, the more basic perceptual competences are evident in pre-verbal humans and many other intelligent animals that don't use communicative languages, though it's arguable that they use forms of representation suited to perception and spatial reasoning, as shown in intelligent choice of actions -- e.g. in squirrels, crows and pre-linguistic humans. In humans, those ancient forms of representation continue to be used in parallel with the growing use of (linear, discrete) linguistic forms of representation. E.g. humans produce gestures, maps and blueprints as well as sentences and other discrete, linear, forms of representation. A claim that only the latter are suited to valid reasoning is just a recent dogma (accepted by some philosophers and mathematicians, but not all -- it was rejected by Kant, but he noted that he could not adequately explain what he was talking about, fearing that it will "forever lie concealed in the depths of the human soul", though perhaps he might have welcomed AI as a source of new explanations).
Meanwhile, mathematicians have had a deep enough understanding of numbers to make important discoveries about them (e.g. Euclid's proof that there are infinitely many primes, and the discovery, via geometrical reasoning, that rational numbers are insufficient for measures of linear distances) long before arithmetic was axiomatised in what would now be regarded by some mathematicians as a rigorous way. I seem to recall that the Archimedean property of numbers was originally discovered in relation to spatial lengths, i.e. given any two items, there is a finite multiple of the smaller one that exceeds the larger one. (However, the original concept of multiplication of lengths is different from the concept of multiplication of numbers.)
Interestingly, Peano's formalisation of a subset of arithmetic, using undefined symbols in an axiomatic system, was conceptually totally different from the attempts by Frege and Russell to provide a rigorous foundation for number theory in which concepts of number and numerical operations are all *explicitly* defined in terms of purely logical (non-numerical) concepts applicable in percepts and thoughts about the world, instead of being undefined symbols in a set of axioms. (Although Frege claimed to have shown that arithmetic is a subset of logic, he did not think that logic could suffice to define the concepts of geometry, and did not accept that Hilbert's axiomatisation of Euclid could be described as a foundation for geometry.) Likewise, I don't think Frege (and Russell, etc.) provided accurate analyses of the pre-existing concepts of number, going back to Euclid and earlier.
Human uses of number concepts almost certainly developed much earlier from deep discoveries relating to the roles of one-to-one correspondences in solving practical problems -- including bartering, obtaining enough items of food for the family, comparing lengths in different locations in terms of multiples of a small standard length, planning times and provisions for journeys, providing distinct locations for objects or people, or activities, and perhaps using numerically labelled lengths when making clothing or coverings to fit objects, etc. In that case, Euclid's understanding of numbers was based on several such conceptual towers developed over centuries before him. My comments on the Pigeon Hole problem are also based on a rejection of any purely formal analysis of our number concepts. (Such a formalisation produces a new branch of mathematics with interesting structural relationships to previous understanding of numbers and their uses in counting, measuring, etc.) I had not previously encountered Young diagrams before Steve mentioned them, and I can see vaguely how they might provide yet another mathematical structure with strong relationships to the natural numbers, but it's not remotely plausible that they underlie ancient numerical competences (or the number competences of typical 9-year olds).
On the other hand, if we compare the tasks current formal reasoning systems are good at with the achievements of ancient human mathematicians it is clear that there is still a huge gap. The ancient mathematicians did not know about, and were not explicitly using, the formalisms, axioms, and inference rules now used by automated reasoning systems, based on developments in the nineteenth and twentieth centuries.
I know of no evidence to suggest that any biological brains were unwittingly using those structures and mechanisms thousands of years ago when ancient mathematicians made profound discoveries in geometry, topology and arithmetic.
I don't think anyone could seriously claim that human brains do use such logical mechanisms unwittingly -- just as they have always used molecular and neural mechanisms unwittingly, some of which were discovered by human scientists only recently. I might be proved wrong by evidence showing how animal brain mechanisms were able to implement logical deductive systems before they were discovered and communicated explicitly.
I know of no physiological evidence suggesting that brains contain and use large stores of logical formulae and logic-based mechanisms for deriving consequences from them as AI theorem provers do, although there is evidence that some professional mathematicians and logicians have learnt to implement such theorem provers in their brains, usually with the aid of external memory stores for intermediate results of calculations or derivations, e.g. on sheets of paper with pencils or pens, blackboards+chalk, whiteboards+coloured pens, or most recently in computer packages accessed via electronic interfaces such as screens, keyboards, computer mice, light-pens, and touch-pads.
There does not seem to be quite so much consensus regarding whether the concepts of geometry (point, line, circle, length, area, angle, curvature, etc.) are fully defined (implicitly) by axioms for geometry, as proposed by Hilbert (1899) (a proposal strongly contested by Frege -- see Blanchette(2014)), or whether our understanding of geometry and its topological core, builds on our non-logical, pre-mathematical experience of and activities using objects and processes situated in space and time.
A further complication is that (in my experience) most discussions of foundations of mathematics ignore geometry and topology, as does most research on mathematical cognition and its development or neural underpinning.
However in the case of both arithmetic and geometry, long before the development in the last few hundred years of precise formal specifications, there were previously discovered and partially explored mathematical domains, understood by ancient mathematicians who made deep discoveries in both geometry and number theory without any knowledge of modern logic and set theory.
So what they were investigating could not depend on or be defined in terms of intentional use of modern axioms and inference rules.
(The use of human languages, which include rich, multi-layered mathematical structures that are not yet fully understood, is another example of ancient mathematical cognition that is not normally classified as mathematical.)
In principle, it is conceivable that an ancient number theorist or geometer might have had a brain that, unknown to the individual, used the kinds of logic developed in the 19th and 20th century, including boolean connectives, quantifiers, modal operators, etc., but, as far as I know, there is not a shred of evidence that animal brains did that before human logicians and mathematicians discovered and deployed modern formal logic a short time ago.
That raises the question: if formal logic, and formally specified abstract structures, were not used in past millennia, what alternative forms of representation and reasoning produced by biological evolution, and possibly supplemented by cultural evolution, could have made ancient mathematical discoveries possible?
At present I don't think anyone knows which biological brain mechanisms make all that possible although there are many implementations of logical mechanisms using digital computers that are nothing like animal brains in their physical construction or their normal functioning -- at least before the 19th century!
There may be logically specifiable domains that are isomorphic with what the ancient mathematicians were
studying, but they are not the same things. Ancient mathematicians clearly had
ways of identifying what they were thinking, talking, and writing about,
independently of modern logical techniques, even if there are many unanswered
questions regarding the mechanisms they used and the criteria they accepted --
e.g. when deciding whether geometrical reasoning should include the
neusis construction, known to Archimedes and others, which
makes it easy to trisect arbitrary angles, as explained in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/trisect.html
(or pdf).
Modelling is one thing: replicating another. Modern "foundational" attempts to characterise or model ancient abilities to think about the natural numbers (or more generally arithmetic) and the related, though less well developed, attempts to characterise and model Euclidean (and non-Euclidean) geometrical reasoning (spearheaded by David Hilbert (1899)), both fail to identify the subject matter of the ancient mathematical discoveries, and instead introduce new, partly related, but different, mathematical domains, based on logic (and/or set theory), arithmetic and algebra, unlike the original mathematical domains discovered using forms of cognition that I do not believe anyone understands yet.
Moreover, I'll try to show how it can be argued that the arithmetical discoveries made by Euclid and others long before the discovery of modern logic were more like discoveries in geometry and topology than like proofs in an axiomatic system using only logical inferences. However, arithmetical knowledge of the kind developed and used by Euclid and other ancient mathematicians is not concerned only with spatial structures and processes. Ancient arithmetic was concerned with general features of groups or sets of entities, and operations on them, whether they are spatial, temporal or abstract. For example, the number of even numbers between any two odd numbers could be counted, and was obviously, necessarily, non-zero (why?).
In particular, acquiring the ancient concept of the number of items in a collection requires having the ability to relate different collections of objects using one-to-one correspondences (bijections) between members of those collections. So the basic idea of arithmetic is that two collections of entities may or may not have a 1-1 relationship. If they do we could call them "equinumeric".
Understanding that, and making use of various examples of equinumerosity to solve practical problems, perhaps as diverse as getting enough food items to feed one's family, matching heights of pillars made from hewn blocks, and performing trading agreements, could have, and probably did, precede any grasp of a systematic way of naming different numerosities. (For more on ancient mathematics see Hogben (1968).)
The following three groups of items are equinumerous in that sense (treating different occurrences of the same character as different items).
   group1: [ U     V     W     X     Y     Z     ]
             |     |     |     |     |     |
   group2: [ P     P     P     P     P     P     ]
             |     |     |     |     |     |
   group3: [ Omega Psi   Psi   Gamma Theta Pi    ]
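The pairing idea can be sketched in code without using any number names or counting. This is a minimal illustration, not a model of how humans do it; the function name `equinumerous` is invented here. Items are paired off one at a time, and the collections match if and only if they run out together.

```python
def equinumerous(a, b) -> bool:
    """Pair items off one at a time; no counting or number names are used."""
    it_a, it_b = iter(a), iter(b)
    sentinel = object()  # marks "nothing left" in a collection
    while True:
        x = next(it_a, sentinel)
        y = next(it_b, sentinel)
        if x is sentinel and y is sentinel:
            return True   # both collections ran out together
        if x is sentinel or y is sentinel:
            return False  # one collection has unmatched items left over

# Different occurrences of the same character are treated as different items.
group1 = ["U", "V", "W", "X", "Y", "Z"]
group2 = ["P"] * 6
group3 = ["Omega", "Psi", "Psi", "Gamma", "Theta", "Pi"]

print(equinumerous(group1, group2), equinumerous(group2, group3))
print(equinumerous(group1, group3[:-1]))
```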
Note on ordinal numbers
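There is a rarely acknowledged complication regarding instances reappearing that
is crucial to the difference between cardinals and ordinals. Suppose the people who
visit your office do so in the following order:
   Andrew Basil Carol Andrew Daphne Edmund
Then Andrew is both the first and the fourth visitor: the sequence has six
positions, but only five distinct people occupy them.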
Another complication allowed in ordinary parlance would be for two people to come at the same time, in which case they might both be third. Such complications could be crucial to unravelling a murder mystery, for example, but will not be pursued here.
A slightly different complication is allowed in many sporting events, which do not allow the same individual to occur in two positions in the results, but do allow two individuals to occupy the same position, with the proviso that the next position is empty. E.g. if Basil and Carol tied for second place, the results could be as follows, with nobody in third place:
   1st: Andrew
   2nd: {Basil, Carol}
   3rd:
   4th: Daphne
   5th: Edmund
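The sporting convention just described can be sketched as a small procedure. The function name and the representation of results (a list of sets in finishing order, where a set with several members is a tie) are assumptions made for this illustration only.

```python
def positions_with_ties(finish_groups):
    """Map each position (starting at 1) to the set of competitors holding it.

    A tie of k competitors leaves the next k - 1 positions empty, so a later
    competitor's position still reflects everyone who finished ahead.
    """
    table = {}
    position = 1
    for group in finish_groups:
        table[position] = set(group)
        for empty in range(position + 1, position + len(group)):
            table[empty] = set()  # position left vacant by the tie
        position += len(group)
    return table

results = positions_with_ties(
    [{"Andrew"}, {"Basil", "Carol"}, {"Daphne"}, {"Edmund"}]
)
for place, who in results.items():
    print(place, sorted(who))
```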
Yet more examples come from cyclic ordinals e.g. days of the week, months of the year, individuals sitting around a table.
These examples indicate that the concept of an ordinal structure, as used in everyday language and thought, is more complex than the concept of a cardinal.
That fact is not reflected in the mathematical theory of ordinals, as far as I am aware, since it does not allow duplicates, ties or cycles!
Repeated elements were not allowed in the original theory of ordinals due to
Georg Cantor, summarised here
https://en.wikipedia.org/wiki/Ordinal_number
That theory allowed an ordinal to be associated with any well ordered set
(totally ordered set such that every non-empty subset has a least element). A particular class of
ordinals could be formed by rearranging the natural numbers in the following
way:
Start with all multiples of 2, then all multiples of 3 not already included,
then all multiples of 5 not already included, and so on, adding new
multiples of successive prime numbers. This will produce the following structure, which has
no repetitions:
Example Order1:
   2,4,6,8,...
   3,9,15,21,...
   5,25,35,55,...
   7,49,77,91,...
   11,121,143,...
   ...
(Try to work out the pattern.)
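Readers who want to experiment with Order1 can generate initial segments of its rows. The sketch below rests on one assumption of mine: "not already included" is read as "not a multiple of any smaller prime", which amounts to the same thing. All function names are invented for the illustration.

```python
from itertools import count, islice

def primes():
    """Yield primes in increasing order by trial division (fine for small cases)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

def order1_row(p, smaller_primes):
    """Yield multiples of p that are not multiples of any smaller prime."""
    for m in count(p, p):
        if all(m % q for q in smaller_primes):
            yield m

def order1_rows(n_rows, n_items):
    """Return the first n_items of each of the first n_rows rows of Order1."""
    rows, seen = [], []
    for p in islice(primes(), n_rows):
        rows.append(list(islice(order1_row(p, tuple(seen)), n_items)))
        seen.append(p)
    return rows

for row in order1_rows(5, 4):
    print(row)
```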
However, if we start with all multiples of 2, then all multiples of 3,
then all multiples of 5, and so on, adding for each prime all of its
multiples whether they occur earlier or not, then instead of the above, we'll
get a structure that does have many repetitions:
Example Order2:
   2,4,6,8,10,12,14,...
   3,6,9,12,15,...
   5,10,15,20,25,...
   7,14,21,28,...
   11,22,33,44,...
   ...
If this is done for each prime number (2,3,5,7,11,13,17, etc.) the result is a
well defined infinite ordered structure composed of an infinite sequence of
ordinals that are less sparse than the previous sequence, because numbers that
are products of two or more different prime factors (e.g. 6, 10, 12, 14, etc.)
will be repeated instead of occurring only once.
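A finite portion of Order2 can be generated, and its repetitions listed, with a few lines of code. This is only a sketch: the list of primes is supplied by hand, and both function names are invented for the illustration.

```python
from itertools import count, islice

def order2_rows(prime_list, n_items):
    """Return the first n_items multiples of each prime, repetitions allowed."""
    return [list(islice(count(p, p), n_items)) for p in prime_list]

def repeated_entries(rows):
    """Return, in increasing order, the numbers occurring in more than one row."""
    seen, repeated = set(), set()
    for row in rows:
        for m in row:
            if m in seen:
                repeated.add(m)
            seen.add(m)
    return sorted(repeated)

rows = order2_rows([2, 3, 5, 7, 11], 7)
for row in rows:
    print(row)
print(repeated_entries(rows))
```

Only repetitions that fall inside the generated prefixes are reported, so lengthening the rows will reveal more of them.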
A different well-ordered infinite structure with repetitions could be defined as
follows:
Example Order3:
all the positive integers in order
followed by all the powers of 2 in order
followed by all the powers of 3 "
followed by all the powers of 4 "
followed by all the powers of 5 "
followed by all the powers of 6 "
....
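Every power already occurs among the positive integers at the start, so
repetitions begin with the second row. Because the later rows are not
restricted to powers of prime numbers there will be further repetitions among
them (e.g. every power of 4 is also a power of 2).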
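Initial segments of Order3 can be generated in the same spirit. The sketch assumes that "powers" start at exponent 1 (the description above does not say whether exponent 0 is intended), and the function name is invented for the illustration.

```python
def order3_rows(n_bases, n_items):
    """First n_items of the positive integers, then of the powers of 2, 3, 4, ..."""
    rows = [list(range(1, n_items + 1))]
    for base in range(2, 2 + n_bases):
        # Assumption: powers start at exponent 1.
        rows.append([base ** k for k in range(1, n_items + 1)])
    return rows

rows = order3_rows(4, 5)
for row in rows:
    print(row)

# Every listed power of 4 (rows[3]) also occurs among the powers of 2.
powers_of_2 = {2 ** k for k in range(1, 11)}
print(all(m in powers_of_2 for m in rows[3]))
```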
I have no idea whether Cantor, or anyone else, has ever previously thought of extending the transfinite ordinals to allow the same kinds of repetition as the common sense ordinals do as described above in connection with the ordering of visitors to a room.
It's possible that there are no interesting new facts to be derived from such structures, though they could provide interesting exercises for students, as there are well defined mathematical questions whose answers are different for the different cases presented above. For example, what can be said about the first two items of each new sub-ordinal in Order1, above?
The relation of equinumerosity has many practical uses, and one does not need to know anything about names for numbers, or even to have the concept of a number as an entity that can be referred to, added to other numbers etc. in order to make use of equinumerosity.
In that sense, as I think David Hume and others have pointed out, the concept of equinumerosity is more fundamental than the concept of number. (It is arguable there are many other examples of concepts like this, e.g. if the concepts of sameness of height, area, weight, etc. are presupposed by concepts of height, area, weight, etc.)
Our intelligent ancestors might have discovered ways of streamlining that cumbersome process: e.g. instead of bringing each fish-eater to the river, ask each one to pick up a bowl and place it on the angler's bowl. Then the bowls could be taken instead of the people, and the angler could give each bowl a fish, until there are no more empty bowls, then carry the laden bowls back.
What sort of brain mechanism would enable the first person who thought of doing that to realise, by thinking about it, that it must produce the same end result as taking all the people to the river? A non-mathematical individual would need to be convinced by repetition that the likelihood of success is high. A mathematical mind would see the necessary truth, and understand that repeated experiments to establish the validity of the method were a waste of time. How?
(There could be other reasons for doing experiments, e.g. to see whether fish could be made to remain in their bowls, or whether one person could carry all the fish-laden bowls, etc.)
What brain mechanisms make possible that sort of understanding using transitivity of 1-1 correspondence?
Of course, we also find it obvious that there's no need to take a collection of bowls or other physical objects to represent individual fish-eaters. We could have a number of blocks with marks on them, a block with one mark, a block with two marks, etc., and any one of a variety of procedures for matching people to marks could be used to select a block with the right number of marks to be used for matching against fish.
Intelligent anglers or hunter-gatherers could understand that a collection of fish (or plums, etc.) matching the marks would also match the people. How?
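The step from "the bowls match the people" and "the fish match the bowls" to "the fish match the people" amounts to chaining two 1-1 correspondences. The sketch below shows one instance of that chaining, with made-up names for people, bowls and fish. It illustrates the structure of the inference; it does not explain how anyone sees that the result is necessary.

```python
def compose(first, second):
    """Chain two 1-1 correspondences given as dicts: a -> b and b -> c."""
    return {a: second[b] for a, b in first.items()}

def is_one_to_one(mapping, targets):
    """True if every target is the partner of exactly one key in the mapping."""
    return sorted(mapping.values()) == sorted(targets)

# Hypothetical example data.
people_to_bowls = {"Ann": "bowl1", "Ben": "bowl2", "Cat": "bowl3"}
bowls_to_fish = {"bowl1": "fishA", "bowl2": "fishB", "bowl3": "fishC"}

people_to_fish = compose(people_to_bowls, bowls_to_fish)
print(people_to_fish)
print(is_one_to_one(people_to_fish, ["fishA", "fishB", "fishC"]))
```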
We also know that it is not necessary to carry around a material numerosity indicator: we can memorise a sequence of names and use each name as a label for the numerosity of the sub-sequence up to that name, as demonstrated in Chapter 8 of Sloman (1978).
It was also discovered long ago that instead of memorising a fixed collection of names to use for that purpose, we can adopt a generative procedure for producing arbitrarily long sequences of names (though some procedures -- e.g. for Arabic numerals -- were more elegant and powerful than others -- e.g. for Roman numerals).
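A generative procedure of this kind can be sketched for decimal (Arabic) numerals: given any numeral, a fixed rule produces the next one, so no list of names needs to be memorised beyond the ten digits. The function names below are invented for the illustration, and the rule works on the written symbols alone, without converting them to numbers.

```python
DIGITS = "0123456789"

def next_numeral(numeral: str) -> str:
    """Produce the decimal numeral that follows the given one, by rule alone."""
    digits = list(numeral)
    i = len(digits) - 1
    # Turn trailing 9s into 0s, moving left (the "carry").
    while i >= 0 and digits[i] == "9":
        digits[i] = "0"
        i -= 1
    if i < 0:
        return "1" + "".join(digits)  # every digit was 9: add a leading 1
    digits[i] = DIGITS[DIGITS.index(digits[i]) + 1]
    return "".join(digits)

def numerals(first: str, how_many: int):
    """List how_many consecutive numerals starting from first."""
    out = [first]
    while len(out) < how_many:
        out.append(next_numeral(out[-1]))
    return out

print(numerals("1", 12))
print(next_numeral("999"))
```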
A human-like intelligent machine would also have to be able to discover such strategies, and understand why they work. This is a deeper cognitive challenge than simply learning to assign number names to groups of objects, events, etc.
It is also totally different from achievements of systems that do pattern recognition, treating number labels as labels for patterns. Perhaps studying intermediate competences in other animals will help us understand what evolution had to do to produce human mathematicians.
Piaget 1952
Piaget's work showed that five- and six-year old children have trouble
understanding consequences of transforming 1-1 correlations, e.g. by stretching
one of two matched rows of objects Piaget(1952).
When they do finally grasp the necessary transitivity of a 1-1 correspondence relation between collections of objects, have they found a way to derive it from some set of logical axioms using explicit definitions? Or is there another way of grasping that if two collections A and B are in a 1-1 correspondence and B and C are, then A and C must also be, even if C is stretched out more in space?
I suspect that for most people this is more like an obvious topological theorem about patterns of connectivity in a graph than something proved by logic. Moreover, insofar as this insight does not seem to be present in much younger infants, any claim that they have this concept of number from birth or soon after is clearly false. A full grasp of the natural numbers is a product of products of evolution combined with cultural evolution and various kinds of individual creativity, but not a product of natural selection built directly into the genome.
In the terminology of Chappell & Sloman (2007), the competences are "Meta-configured", i.e. specific instantiations of a more generic (but still unidentified) genetically specified meta-competence. See also 'The Meta-Configured Genome'
The key ideas are explained (perhaps more clearly) in this 6 minute video:
https://youtu.be/G8jNdBCAxVQ
I think this notion is closely related to Kant's ideas about knowledge that is neither empirical nor innate, but rather something like an empirically triggered instantiation of a genetically specified meta-competence.
Linguistic capabilities are similar, but more complex. The human genome does not specify any one of several thousand human languages. But without mechanisms specified by the genome none of those languages could have been developed.
But why is the transitivity obvious to adults and not to 5 year olds, or younger children? Anyone who thinks it is merely a probabilistic generalisation that has to be tested in a large number of cases has not understood the problem, or lacks the relevant mechanisms in normal human brains.
Does any neuroscientist understand what brain mechanisms support discovery of such mathematical properties, or why they seem not to have developed before children are five or six years old (unless Piaget asked his subjects the wrong questions)?
NOTE: Much empirical research on number competences grossly oversimplifies what needs to be explained, omitting the role of reasoning about 1-1 correspondences.
It would be possible to use logic to encode the transitivity theorem in a usable form in the mind of a robot, but it's not clear what would be required to mirror the developmental processes in a child, or our adult ancestors who first discovered these properties of 1-1 correspondences. They may have used a more general and powerful form of relational reasoning of which this theorem is a special case.
The answer is not statistical (e.g. neural-net based) learning. Intelligent human-like machines would have to discover deep non-statistical structures of the sorts that Euclid and his precursors discovered.
The machines might not know what they are doing, like young children who make and use mathematical or grammatical discoveries. But they should have the ability to become self-reflective and later make philosophical and mathematical discoveries.
I suspect human mathematical understanding requires at least four layers of meta-cognition, each adding new capabilities, but will not defend that here. Perhaps robots with such abilities in a future century will discover how evolution produced brains with these capabilities \cite{sloman-tt}.
Close observation of human toddlers shows that before they can talk they are
often able to reason about consequences of spatial processes, including a 17.5
month pre-verbal child apparently testing a sophisticated hypothesis about 3-D
topology, namely: if a pencil can be pushed point-first through a hole in paper
from one side of the sheet then there must be a continuous 3-D trajectory by
which it can be made to go point first through the same hole from the other side
of the sheet:
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/toddler-theorems.html#pencil
(I am not claiming that my words accurately describe her thoughts: but clearly her intention has that sort of complex structure even though she was incapable of saying any such thing in a spoken language. What sort of internal language was she using? How could we implement that in a baby robot?)
Likewise, one does not need to be a professional mathematician to understand why when putting a sweater onto a child one should not start by inserting a hand into a sleeve, even if that is the right sleeve for that arm.
Records showing 100% failure in such attempts do not establish impossibility, since they provide no guarantee that the next experiment will also fail.
Understanding impossibility requires non-statistical reasoning.
Accordingly, many AI/Robotics researchers now design machines that learn to perform tasks, like lifting a cup or catching a ball, by making many attempts and inferring probabilities of success of various actions in various circumstances.
But that kind of statistics-based knowledge cannot provide mathematical understanding of what is impossible, or what the necessary consequences of certain spatial configurations and processes are. It cannot provide understanding of the kind of reasoning capabilities that led up to the great discoveries in geometry (and topology) (e.g. by Euclid and Archimedes) long before the development of modern logic and the axiomatic method.
I suspect these mathematical abilities evolved out of abilities to perceive a variety of positive and negative affordances, abilities that are shared with other organisms (e.g. squirrels, crows, elephants, orangutans) which in humans are supplemented with several layers of metacognition (not all present at birth).
Spelling this out will require a theory of modal semantics that is appropriate to relatively simple concepts of possibility, impossibility and necessary connection, such as a child or intelligent animal may use (and thereby prevent time-wasting failed attempts).
I presented a partial theory in my DPhil thesis Sloman(1962) summarising and defending what I thought Kant was saying (or should have been saying!).
E.g. if two solid rings are linked it is impossible for them to become unlinked through any continuous form of motion or deformation -- despite what seems to be happening on a clever magician's stage.
This form of modal semantics, concerned with possible rearrangements of a portion of the world rather than possible whole worlds was proposed in Sloman(1962). Barbara Vetter (2011) seems to share this viewpoint.
Another type of example is in this figure, derived from Oscar Reutersvard's 1934 drawing of an impossible configuration of cubes:
What sort of visual mechanism is required to tell the difference between the possible and the impossible configurations? How did such mechanisms evolve? Which animals have them? How do they develop in humans? Can we easily give them to robots?
How can a robot detect that what it sees depicted is impossible?
Richard Gregory demonstrated that a 3-D structure can be built that looks exactly like an impossible object, but only from a particular viewpoint, or a particular line of sight.
Maintained by
Aaron Sloman
School of Computer Science
The University of Birmingham