(VERY EARLY DRAFT: Still changing rapidly, so saved
copies will soon be out of date. Save links instead!)
This paper is
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/cup-saucer-challenge.html
A PDF version may be added later.
A closely related document, using these pictures, was written during the EU CoSy
robot project, and made available here:
http://www.cs.bham.ac.uk/research/projects/cogaff/07.html#708
"Perception of structure: Anyone Interested?"
Two other closely related documents written around the same time are
http://www.cs.bham.ac.uk/research/projects/cogaff/07.html#709
Perception of structure 2: Impossible Objects
http://www.cs.bham.ac.uk/research/projects/cosy/photos/crane/
Challenge for Vision: Seeing a Toy Crane -- Crane-episodic-memory
A partial index of discussion notes is in
http://www.cs.bham.ac.uk/research/projects/cogaff/misc/AREADME.html
____________________________________________________________________________
Question:
Using a two-finger gripper, what actions can get from the situation on the left
(or any situation with partly similar initial relationships) to the situation on
the right (or a similar situation), and back again? Think about how you could do
that, before reading on.
Discussion:
Notice that no known vision system, computer-based or human, can determine exact
directions, distances, curvatures, orientations, thicknesses, and other spatial
properties and relations from these two images, partly because they are low
resolution images taken in poor light (late at night in a hotel bedroom with one
ceiling light not working!), and partly because a single 2-D image cannot in
principle provide exact distances and sizes. So anyone who understands the above
question and thinks about a possible answer must be using interpretations of the
images that abstract from precise metrical details.
Many vision researchers assume that the abstraction has to be done by replacing
precise metrical values with probability distributions over such values, but
there is another way: using partial orderings to relate parts of the scene
rather than absolute values. Orderings can be of many kinds: further away,
further apart, wider, more curved, thicker, shallower, sloping more steeply,
changing curvature more rapidly in a certain direction, and so on.
Where processes are involved, again instead of specifying exact directions,
velocities, and accelerations in some coordinate system, for many purposes it
may suffice to use partial orderings based on relations like: moving faster
than, changing speed faster than, changing direction faster than, rotating
faster than, and many more, including comparisons of acceleration (rates of
change of rates of change).
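To make the idea concrete, here is a minimal sketch (the object names and
relations are invented for illustration, not taken from the images) of how a
scene description built from partial orderings can support inference without
any metrical values: stated orderings can be chained by transitivity to derive
new orderings, even though no distance was ever measured.

```python
# Qualitative scene relations as ordered pairs:
# (A, B) in FURTHER means "A is further away than B". No distances needed.
FURTHER = {("saucer", "cup"), ("wall", "saucer")}

def transitive_closure(pairs):
    """Derive all orderings implied by transitivity of the relation."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

relations = transitive_closure(FURTHER)
# ("wall", "cup") follows from the two stated orderings,
# although no distance was ever specified.
print(("wall", "cup") in relations)  # True
```

Because the relation is only a partial order, pairs the perceiver has no
information about (e.g. two objects at indistinguishable depths) simply remain
uncompared, rather than being forced into a numerical estimate.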
In some cases, instead of processes being described in absolute or relative
spatial terms, they can be described at an even higher level of abstraction:
in terms of changes in affordances produced by motion, either of the things
perceived or of the viewer. This can include changes in proto-affordances:
changes in possibilities for motion, or in possible interactions between
things, with no agent's actions or needs being involved.
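A toy sketch of that idea (all names invented for illustration): a process can
be characterised purely by a flip in a qualitative relation that determines
whether some interaction is possible, with no velocities or trajectories
represented at all.

```python
def can_pass(gap_order):
    """gap_order qualitatively compares the gap between two objects to the
    gripper width: one of 'wider', 'narrower', or 'same'."""
    return gap_order == "wider"

# The process is described only by the change in the qualitative relation,
# e.g. lifting the cup makes the gap wider than the gripper.
before, after = "narrower", "wider"
affordance_changed = can_pass(before) != can_pass(after)
print(affordance_changed)  # True: the motion created a new possibility
```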
An extended discussion of opportunities for using partial orderings instead of
probability distributions to deal with uncertainty or poor data can be found in:
http://www.cs.bham.ac.uk/research/projects/cogaff/07.html#718
Predicting Affordance Changes
(Alternative ways to deal with uncertainty)
Further questions
Earlier you were asked to think about how you might rearrange the objects in
order to get from a configuration like the first to a configuration like the
second. Are you able to describe, not the actions, but how you thought about the
actions, including the intermediate stages and linking processes that you
thought about? Did you need to consider any exact distances, widths, directions,
weights, or other geometric or physical properties or relations?
Did you consider which of your body parts you would use, how the appearance of the scene would change, and what information you would use about the changes when selecting and controlling actions?
Normally we can plan actions without considering those details, because we know
we have mastery of the familiar types of sub-task required. This manipulation
task is not a difficult test for a normal adult in our culture, unlike some
puzzles that most people find difficult, such as the fisherman's folly puzzle,
which requires separating the metal ring from the rest of the object without
cutting or breaking anything.
Image from:
Pedro Cabalar and Paulo E. Santos,
Formalising the Fisherman's Folly puzzle,
Artificial Intelligence, 175(1), pp. 346--377, 2011,
Issue on John McCarthy's Legacy,
http://www.sciencedirect.com/science/article/pii/S0004370210000408
(That paper shows (a) how the puzzle can be "translated" into a
logical problem, which most humans can't do, and (b) how an AI
planning program can solve it, in the translated form. They make no
claims or promises about automating the translation of the puzzle
into a logical form.)
Returning to the Crockery challenge
Consider how, prior to the action, the agent (one who has not discovered the
translation in Cabalar and Santos) has to solve several sub-problems.
Could such deliberative premeditation use an action schema (or operator) with approximate, qualitative parameters instead of the more definite actual parameters that would be used (explicitly or implicitly) if the action were performed?
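One way such a schema might look, sketched here with invented predicates and a
deliberately simple STRIPS-like form (this is an illustration of the question,
not a proposal from the source): the preconditions and effects mention only
qualitative relations, leaving precise values to be bound during execution.

```python
from dataclasses import dataclass, field

@dataclass
class ActionSchema:
    """An operator whose parameters are qualitative relations, not numbers."""
    name: str
    preconditions: set = field(default_factory=set)
    effects_add: set = field(default_factory=set)
    effects_del: set = field(default_factory=set)

    def applicable(self, state):
        # All qualitative preconditions must hold in the current state.
        return self.preconditions <= state

    def apply(self, state):
        return (state - self.effects_del) | self.effects_add

# A qualitative state: relations between objects, no coordinates.
state = {("on", "cup", "saucer"), ("graspable", "cup")}

lift = ActionSchema(
    name="lift-cup",
    preconditions={("graspable", "cup")},
    effects_add={("held", "cup")},
    effects_del={("on", "cup", "saucer")},
)

if lift.applicable(state):
    state = lift.apply(state)
print(("held", "cup") in state)  # True
```

Whether human deliberation uses anything like such schemas, and how the
qualitative slots get bound to precise control parameters at execution time,
is exactly the open question posed above.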
NOTE:
There are problems here partly analogous to problems of reference and
identification in language, except that the mode of reference is not linguistic
and what is referred to typically cannot be expressed in language because it is
anchored in non-shared structures and processes.
(Internal 'attention' processes are partly like external pointing processes --
virtual fingers -- in some cases because they exhibit 'causal indexicality',
i.e. they implicitly refer to the results of learning or selective attention
achieved by some internal learning mechanism, as pointed out in:
http://www.cs.bham.ac.uk/research/projects/cogaff/03.html#200302
Aaron Sloman and Ron Chrisley,
Virtual machines and consciousness,
Journal of Consciousness Studies, 10(4-5), 2003, pp. 113--172.
NOTE: A detailed commentary (and tutorial) on this paper by Marcel Kvassay,
comparing and contrasting our ideas with the anti-reductionism of David
Chalmers, was posted on 16 August 2012:
http://marcelkvassay.net/machines.php)
Maintained by
Aaron Sloman
School of Computer Science
The University of Birmingham