Brownian thought space

Cognitive science, mostly, but more a sometimes structured random walk about things.

My Photo
Location: Rochester, United States

Chronically curious...

Saturday, May 27, 2006

English is not Pro-Drop

It's probably living in Italy, but I've become particularly sensitive to the Pro-Drop parameter. Basically, this is a point of variation among the world's languages that has an impact on several structural properties. The Pro-Drop parameter refers to the fact that in some languages, like Italian, one can say merely "Piove" to mean "(It's) raining". In English you need the overt subject even if it doesn't stand for anything. Which, by the way, is a great example of a purely abstract, structural requirement in language. Anyhow, here is a passage from Alice's Adventures in Wonderland by that amazing Lewis Carroll.
`--I proceed. "Edwin and Morcar, the earls of Mercia and Northumbria, declared for him: and even Stigand, the patriotic archbishop of Canterbury, found it advisable--"' `Found WHAT?' said the Duck. `Found IT,' the Mouse replied rather crossly: `of course you know what "it" means.' `I know what "it" means well enough, when I find a thing,' said the Duck: `it's generally a frog or a worm. The question is, what did the archbishop find?' The Mouse did not notice this question, but hurriedly went on...

Thursday, May 25, 2006

Physical Expectation vs Social Expectation

Or, Baillargeon vs. Csibra

Here's the question: at ages when infants both have physical expectations and can use social (pedagogical) cues, is there a period when one cue takes precedence over the other? And here is a way of testing it.

This is a task that infants can do (ref?): there is a rubber ducky on the table, along with an occluder. A cup covers the ducky (at point A), moves/slides behind the occluder, lifts off, goes past the occluder and comes down at point B. At this point, where should the rubber ducky be? Babies think that the rubber ducky must lie behind the occluder, at point C.

Now for the variation: first, there are two occluders on the table. The cup comes down, slides behind occluder 1 at point C, lifts up, goes behind the occluder at point B, and then the cup lifts up and goes away, leaving just the two occluders. That, so far, was the Renee part; now for the Csibra part. Imagine that a human observer gazes excitedly behind the occluder at position B. (OK, so she doesn't look terribly excited... I just got that somewhere from a Google image search.) According to the Gergo line, the infant should expect the object at position B. But according to the Renee line, the infant should expect the object at position C. What might actually happen when the occluders came down and the ducky was at position C vs. position B?

First, given the Baillargeon results, one might expect, on the Gergo theory, that the infants show more surprise if the experimenter looked at position B vs. position C. This would be the complementary experiment to the previous Gergo experiment, and would tie physical expectation in with social expectation. What this would show is that not only do infants expect objects to be where people look, they also expect people to look where objects are. This would be a nice validation of the pedagogical stance.
Next, one can see what happens when the occluders are dropped; there are two possibilities: 1) The object is where the experimenter was looking (position B). 2) The object is where it is supposed to be, given physical constraints (position C). The question is: in which condition would the infant look longer? If it looks longer when the object is at position B, that would imply that the physical rules win. However, if it looks longer when the object is at C, that would mean that, given the social cue, the infant had updated its object-file representation, and now expected an object at B.
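As a toy sketch of the design, here is the prediction table the two lines of theory generate at test. The condition names and the "surprise" scoring are my own invention, not from either lab:

```python
# Hypothetical sketch: which outcomes should surprise the infant, according
# to a purely physical expectation vs. a purely social (gaze-cued) one?

def predicted_surprise(outcome_position, theory):
    """Return True if the theory predicts surprise (longer looking).

    The cup's path physically implies the duck is at C; the experimenter's
    gaze socially cues position B.
    """
    if theory == "physical":       # Baillargeon-style core knowledge
        expected = "C"
    elif theory == "social":       # Csibra/Gergely-style pedagogical cue
        expected = "B"
    else:
        raise ValueError(theory)
    return outcome_position != expected

for outcome in ("B", "C"):
    for theory in ("physical", "social"):
        print(outcome, theory, predicted_surprise(outcome, theory))
```

The interesting cells are the off-diagonal ones: each theory is surprised exactly where the other is satisfied, which is what makes the looking-time contrast diagnostic.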

Sterkens: Trap40

Trap40 is a very, very nice strong pale ale from Belgium. There is something about Belgian beers... I think I like them all! This one has a wonderful fruity-sweet taste, although it tends to get a bit bitter in the aftertaste. Also, at 8% alcohol, it's a good thing it comes in a 33cl bottle ;)


Chandler Burr is a very nice person. Also, by some funny quirks of fate, he's a good friend. Just came across the news (again!) that his story was chosen as a notable story of the year by the Million Writers Award. The reason I'm particularly chuffed is that I read old drafts of the story :) I hope he keeps sending me drafts! He is definitely a fun writer. The story itself can be read at Narrative Magazine, a free online source. He has some secret projects, one of which I'm very, very eagerly looking forward to, but which just seems delayed forever. Get off your ass!!!! He's also the author of two good books: A Separate Creation and The Emperor of Scent (translated into Italian as well..)

Wednesday, May 24, 2006

How do we know we use symbols?

{This is Gary.} In the last post I was wondering about the categorization issues raised by a paper. Halfway along, I spoke about another paper that went bad (Clarke & Thornton, 1997, BBS {C&T}). In one of the responses to the target article by C&T, Gary Marcus explains how Elman networks (backpropagation, hidden units, momentum, etc.) generally cannot do things that are hallmarks of the kinds of stuff that we can do, and the example is worth considering. Gary considered the following: imagine I said to you "a rose is a rose", or "a duck is a duck"; what would you reply to "a dax is a ___"? If you thought of anything besides "dax", your neurons probably look like little points with arrows sticking in and out, reminiscent of Toshirô Mifune towards the end of Kurosawa's Throne of Blood. What Gary found was that, with many, many versions of an Elman network, the network was simply unable to generalize the "a (__) is a (__)" pattern. This is because a network operates over the input set, looking for correlations and the like; usually it cannot abstract well to stuff that is outside the input stimulus set. {Rider: certain patterns it can of course generalize; these typically lie inside the training set, in the sense that they can be arrived at by interpolation.} Which reminds me very much of some of the things that Fodor says are non-negotiable for a proper theory of the mind. One of them is something similar: systematicity. Systematicity means that if I can say "John kisses Mary", then I can as easily say "Mary kisses John". It is as if the verb kisses is surrounded by two slots, which can be filled by the kisser and the kissee.
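To make the contrast concrete, here's a toy sketch (mine, not Gary's, and not an actual Elman network): a purely associative learner that only memorizes trained pairs has nothing to say about a novel token, while a rule that binds the input to a variable generalizes trivially:

```python
# Toy contrast between memorized associations and a variable-based rule.
# The "dax" example is from the Marcus commentary discussed above;
# the two learners here are caricatures for illustration.

training = [("rose", "rose"), ("duck", "duck"), ("cat", "cat")]

# "Associative" learner: stores exactly the input-output pairs it has seen.
memory = dict(training)

def associative_answer(word):
    return memory.get(word)  # None for anything outside the training set

# Rule-based learner: binds the input to a variable X and returns X.
def rule_answer(word):
    return word  # "a X is a X": the identity function over a variable

print(associative_answer("dax"))  # None: no generalization beyond training
print(rule_answer("dax"))         # "dax": generalizes to any novel token
```

The point is not that networks are lookup tables (they interpolate), but that interpolation over the training items is a different beast from an operation defined over a variable.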

Back to categories

Why is the issue of categories interesting? Because, just maybe, the categories that we form might tell us something about how we acquire concepts. Somehow I find concept acquisition just too bothersome. There's just NO nice answer for concept acquisition. As mentioned previously, even the nice Relevance Theory comes in only after the concepts are acquired. But, from the Waldmann & Hagmayer paper, one can see that categories seem to enter into the kinds of propositions that concepts occur in as well. So the participants in those experiments appear to form propositions like "A causes disease", where A is some category formed in the training phase. So, is concept acquisition anywhere in sight, howsoever remotely? No. Just because we categorize things and use category labels in forming propositions doesn't make the categories into concepts. Remember that concepts include those that are phrasal in nature: categories can be seen as simply reflecting those 'concepts' that have a complex internal structure that reflects, via propositions, the statistical structure in the input. So A might be mentally encoded as BRIGHT-VIRUSES-CAUSING-SPLENOMEGALY: a perfectly propositional structure, reflecting the observed correlation between Bright and Splenomegaly. This sounds no different from how, in Relevance Theory, we form essentially phrasal concepts on the fly (like PAIN*); again leaving the problem of concept acquisition essentially a bloody mystery. Coming up: the difference between rules and Gestalts.

Tuesday, May 23, 2006

Categories and causality

Nice paper: Waldmann, M. R. & Hagmayer, Y. (2006). Categories and causality: The neglected direction. Cognitive Psychology, 53, 27-58. Especially nice is the introductory part, in which they discuss some of the ideas about categories and causality. Here is the example: imagine tokens A, B, C & D, and suppose that A & C are always followed by some effect E. The regularities can then be summarized differently depending on how you cut up the world. If you see A and C as tokens of the same category C1, then you will say that C1 causes E. But if you cut up the world so that A and B are the same category C1 (while C & D are category C2), then both C1 and C2 predict E with equal likelihood, so there is no information, and no reason to believe any kind of causal story linking categories and events. Notice that if there are (statistical) regularities that group A & C together (say both are brightly coloured, while B & D are dull coloured), then this is not a problem anymore. {Sidetrack: for a paper which discusses this in a sort-of-OK way but then goes horribly wrong, see the BBS article: Clark, A. & Thornton, C. (1997). Trading spaces: Computation, representation and the limits of uninformed learning. Behav. Brain Sci., 20, 57-90.} One way of looking at the results of this paper is to say that if there are such dimensions, and if you have learnt to classify based on the brightness dimension, then you might use these brightness-based groups to draw causal inferences (the bright ones do well at college, for example ;). But now imagine that A & B are large objects, while C & D are tiny objects. And now someone comes along and says that A & B cause problems. How readily would you conclude that large objects cause problems? Not very readily, find the authors. It looks like you get stuck with the original categories, and so cannot quite 'see' the link between a different dimension of categorization and its causal inference.
It turns out that if you believe that the original dimension of classification was a 'natural kind', something that existed out there, so to say (like colours and brightnesses and sizes), as opposed to something someone made up, then you are much less willing to give up the natural categorization than the made-up one. What strikes me here is that nowhere is there any mention of the Wisconsin Card Sorting task. Essentially, card sorting in the Wisconsin task requires users to sort according to a certain rule (say colour), and then, at some point, to switch over to, say, shape. In 'normal' adults, this is supposed to be do-able. So we can set-shift along different dimensions. Could it be that this is due to the artificial nature of the categorization schemes? That is, does the adult simply think that these are not natural categories, so they can be switched easily? Would it be harder with natural categories? And here is something else: what about patients who do sort correctly, but then perseverate? Could it be that, at least in some cases, this is due not to executive dysfunction, but to the fact that the patient cannot separate natural from artificial categories? And why is the whole thing interesting anyway? Hmm. I think that would need a separate post. Starting with Gary.

Badger: Tanglefoot

Actually I got this beer for its name. And here is the history of the name from the bottle label:
Many years ago the head brewer hosted a tasting to coin a name for his new ale. Several tankards were consumed and on rising to go he experienced a sudden loss of steering and so unwittingly fell on the perfect name for this legendary ale.
The beer itself is a bit heavy, but rounded-bitter, so right up my alley :) Golden coloured, despite the flavour; alcohol: 5%. The Badger brewery's website even has some nice recipes that use the beer! The most promising-sounding is the Meaty Casserole.

Monday, May 22, 2006

Homology in the mind

Here is an attempt at spelling out the logic of mental modules, and what they might mean, through something more apparent: anatomy. OK, so those of us who believe in evolution are quite fond of homologous organs. Essentially, several people (including Darwin) saw certain common forms in the anatomy of animals. In this pic, homologous bones are painted in like colours. And you don't need a terrible lot of imagination to see that there is a similar (even the same) number of bones, arranged in similar ways, with obvious differences. The point is that these differences reflect the different functions of the organs (hands, paws, flippers, batwings).

Lesson 1:

Although different animals might share similar underlying elements, this alone cannot explain the organs' functions in the different animals. This point, translated into Cognitive Science, would run something like this: simply finding some basic computational mechanism in two animals does not entail that it is used in like ways and for like functions. This is essentially the crux of some of the arguments in Pinker & Jackendoff, part I (amongst many others). But there is something else. If you think about the human hand you see above, there is another organ that looks way more similar; here is a pic of the human hand and the human foot: just looking at the picture of the bones, it's quite plain that there are a great deal of similarities. In fact, at the structural level, one might say that, by and large, the hand and the foot are made up of pretty similar things. But if, from this, you were to draw the conclusion that the hand and the foot did very similar things, you would be in error. So:

Lesson 2:

Even within a single organism, structural similarities between organs do not readily translate into functional equivalences for what the organs do. In CogSci terms: just because two domains (e.g. vision and audition) share underlying computations does not mean that they do functionally similar things. This view has been put forth by various people, including (partial list): Randy Gallistel, Pinker & Jackendoff part II, Gary Marcus (and me!). In computer science terms: just because a PC and a Mac both use silicon-based transistor chips doesn't make them equivalent ;)

New Watch :)

I really tested the patience of the Swatch salesgirl yesterday, but the result was most satisfactory :)

Theresianer - Premium Lager

Here's another from the Theresianer Alte Brauerei (that's the proper name). This is the

Theresianer: Premium Lager

It's a pretty drinkable (4.8% alcohol), crisp lager. Standard (good) stuff.

Theresianer - Vienna

{Interesting beer labels..}

Theresianer: Vienna

The Theresianer brewery, as you can see, is an 'Antica Birreria' (ancient brewery), Trieste, 1766. The name Theresianer recalls the city's Habsburg (Theresian) past. This beer is special because it won the 2006 gold medal of the Deutsche Landwirtschaftliche Gesellschaft (DLG) [German Agricultural Society]. If I remember right, this brewery was the first Italian brewery to win the DLG gold. The beer is a copper-coloured, sweetish lager in the Habsburg tradition, made from smoked Vienna malts. Alcohol content: 5.3%.

Saturday, May 20, 2006

Socially Stable Strategies?

Am reading Freakonomics... and of course, I'm not convinced :) The essential point of Freakonomics so far (3 chapters down) seems to be that economics somehow has the tools to answer pretty strange questions. Like: what's common between sumo wrestlers and high-school teachers? And then there is some reasoning about this and that, and some clever way of getting actual data and doing some clever analysis. But why economics?? Some of the things, like selling bagels or selling crack in Chicago a few decades ago, do have a clear economic angle. Others, like the sumo wrestlers, have a less clear (at first pass) economic angle, and some (dating) have very little. So what's the correct generalization here?

Socially Stable Strategies - SSS

A truism:
Nothing in Biology makes sense except in the light of evolution -(Theodosius Dobzhansky [bah! the Dictionary cannot spellcheck this]).
Some of the recent literature has started looking towards biology for ideas about why we make the 'economic' choices we do.

Hypothesis: 'Economics' is a special case of 'Resource Management' in Eco-Evo*

(*Eco-Evo being Ecologically situated Evolutionary theory... [does this actually exist??]) I think a better way of seeing everything that Freakonomics examines, and much more, and in a better light, is to consider economics as a special case of the resource management that all creatures must figure out in some way or another. Take the curious fact that we do not behave in a 'rational' manner under certain circumstances (e.g. Cosmides & Tooby). I'm not sure that this can be explained easily in an economic theory. Put differently: without knowing the biology of the animal and factoring that in, its 'economic' behaviours will not make sense. So why SSS? Simple: it comes from the ESS of John Maynard Smith (and others). Instead of Evolutionarily Stable Strategies, the SSS is supposed to reflect the fact that at different time-scales (evolutionary vs. social), there might be differences in strategies.
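For reference, here is the classic Hawk-Dove calculation behind Maynard Smith's ESS idea (the payoff numbers are my own example): when the resource V is worth less than the cost of fighting C, the ESS is a mixed strategy that plays Hawk with probability V/C, at which point both strategies earn the same expected payoff, so neither can invade.

```python
# Hawk-Dove game: V = value of the contested resource, C = cost of losing
# a fight. Example numbers chosen so that V < C (mixed-strategy ESS).

V, C = 2.0, 4.0

def payoff(strategy, opponent):
    """Payoff to `strategy` when meeting `opponent` (standard Hawk-Dove)."""
    if strategy == "H" and opponent == "H":
        return (V - C) / 2   # escalated fight: win or lose with equal chance
    if strategy == "H" and opponent == "D":
        return V             # Dove retreats, Hawk takes everything
    if strategy == "D" and opponent == "H":
        return 0.0           # retreat: no gain, no injury
    return V / 2             # two Doves share the resource

def expected_payoff(strategy, p_hawk):
    """Expected payoff against a population playing Hawk with prob p_hawk."""
    return p_hawk * payoff(strategy, "H") + (1 - p_hawk) * payoff(strategy, "D")

p_star = V / C  # the mixed ESS
print(expected_payoff("H", p_star), expected_payoff("D", p_star))  # equal
```

The blog's SSS proposal would amount to asking whether the strategy frequencies that are stable on social time-scales coincide with the ones stable on evolutionary time-scales.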

Monday, May 15, 2006

More Learning in Relevance Theory

Really, what is there to be learnt in Relevance Theory? If the base concepts are in place, and (mental) syntax is in place, what else remains? Maybe nothing? In which case, language variation is just a trivial sociological issue? Needs more pondering.

Friday, May 12, 2006

easy post

Easy post for dear Pinak


Thursday, May 11, 2006

Jigsaw, Part 3: Baillargeon

Renee Baillargeon has the most amazing memory of anyone I've met. She remembered me from a brief meeting nearly 5 years ago, when I was just starting my whole Ph.D. thing with Jacques. This Cognitive Jigsaw is about how some ideas that I'd heard about can be put together. So far, I've gotten up to [[Sperber+Csibra]+Aslin]. Now for Renee.

Anyhow, she gave a very clear picture of how she sees physical reasoning develop in infancy. The basic idea is that there is core knowledge, which is used to interpret a scene. The scene is built out of Basic Information (is that a tube or a cylinder? Is something inside something or behind it?) and Variable Information (the height of an occluder, the colour of an occluded object). Variable Information is the tricky bit. Essentially, it seems that there are innate biases (Basic Information), and then the baby has to learn that some other sources of information (like height, width, colour, transparency) also need to be taken into account.

The really funny thing is this: imagine a kid has figured out that a tall object cannot be out of sight when it is placed inside a shorter cylinder (this happens around 7.5 months of age). At this stage, the kid will NOT have figured out that the same is true for tubes! {In the figure, you see a tall-container and a short-container event, where the container is a cylinder. Click the image for the article.} Renee et al. interpret these findings to say that different events are encoded differently, and figuring one out generalizes to all instances of the same event, but not to different events. So, if the kid figures out containment in cylinders, it will know containment of any kind of object in any kind of cylinder, thus generalizing over cylinders, but will NOT generalize to tubes, which it represents as a separate event in its wee mind. So the representation in the kid's mind is something like:
{Contains(Cylinder_i, Object_j)} ...1
..abstracting over different cylinders and different objects in different conditions. Under these conditions (and not necessarily any other), the child will make an inference of the kind
[If (IsShorter(Cylinder_i, Object_j)) Then IsVisible(Object_j) Else IsInvisible(Object_j)] ...2
...if you get my drift :) The point is that {Contains(Cylinder_i, Object_j)} is not the same as {Contains(Tube_i, Object_j)}. So, while the infant might apply rules like 2 to the former, it might not to the latter. So what does this have to do with anything? Well, to get to that, one more piece is needed: at an age when an infant will NOT attend to height in a tube, it can be made to do so by somehow indicating that height is the variable of importance: so please, dear baby, apply core knowledge to the height variable as well. Well, from what we know from the previous posts, adult-infant interactions themselves carry an assumption of their own relevance; it's entirely plausible that under the appropriate conditions, the baby can be made to share the point of view of the adult. How?
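Rules 1 and 2 can be caricatured in a few lines of code. The point is just that the rule is indexed by event category, so possessing it for cylinders licenses no prediction for tubes (all names here are my own, illustrative ones, not Baillargeon's):

```python
# Sketch of event-category-indexed rules: the containment/height rule is
# stored per event type, so knowing it for cylinders says nothing about tubes.

learned_rules = {"contains-in-cylinder"}  # a hypothetical 7.5-month-old's state

def predict_visibility(event_type, container_height, object_height):
    rule = f"contains-in-{event_type}"
    if rule not in learned_rules:
        return "no prediction"  # height not yet a relevant variable for this event
    if object_height > container_height:
        return "visible"        # a tall object must protrude from a short container
    return "hidden"

print(predict_visibility("cylinder", 5, 10))  # "visible": rule applies
print(predict_visibility("tube", 5, 10))      # "no prediction": separate event
```

On this picture, pedagogical cueing would amount to adding `"contains-in-tube"` (or rather, flagging height as relevant for tubes) without the infant having to relearn the rule from scratch.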

SciFi experiment

If the experimenter shows surprise.. would the kid pick that up?


Saw a serious-looking accident. A car turned quickly, and a youngster on a motorbike rammed it head-on; the bike turned a double somersault over the car, the second one on the kid's head! Bloody hell. I guess it's the first time I've seen anything of the sort... the cops even have my number as a witness!

Firefox blues

Firefox hasn't got decent scrolling for the fucking Mac. Naturally, I'm all for Open Source and stuff, so I must just wait in silence. I'm beginning to get pissed off with Macs. As soon as you have something that works, they switch stuff around. Now it's the whole bloody chip! I wish they'd at least make their fucking browser more in line with the whole Mozilla stuff.

Wednesday, May 10, 2006

Jigsaw, Part 2: Aslin

Just got a mail from Gergo Csibra; he is indeed collaborating with Dan Sperber on the new stuff! No wonder his talk and Dan's talk had so much in common! Anyhow, this post is to tie Dick Aslin in. [Reminder: the idea is to look at common ground across the works of Sperber, Csibra, Aslin & Baillargeon.] Starting from Saffran, Aslin & Newport, it's pretty much clear that even very young infants are able to extract statistical regularities from their input; not just for speech but for pretty much anything. In his recent talk (link coming soon!) Dick talked about how such a computational system might be constrained. Since there are a very large number of statistical regularities in a given sequence, are there any constraints that limit the extraction of regularities? Dick showed evidence that this might be the case. However, others and I have a paper in press where we show that adults appear to have a constraint on the statistical extraction of words: sequences that span prosodic boundaries are not considered good word candidates. BUT! What we show in this paper is not that prosody blocks the computation of statistical regularities, but that prosody acts as a filter. This means that the statistical engine tries to extract word candidates based on their distributional properties, but the output of this system is weighed by other factors; in this case, whether or not the word candidates are properly aligned with prosodic boundaries.
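Here's a toy sketch of the statistics-then-filter idea: compute transition probabilities (TPs) over a syllable stream, propose high-TP pairs as word candidates, then let a prosodic filter veto candidates that straddle a boundary. The syllable stream, TP threshold, and boundary location are all invented for illustration; this is not the in-press paper's method:

```python
# Statistics-then-filter: TPs propose word candidates, prosody vetoes some.
from collections import Counter

syllables = ["ba", "di", "ku", "ba", "di", "ku", "go", "la", "tu", "go", "la", "tu"]
prosodic_boundary_after = 5  # a boundary between the 6th and 7th syllables

pairs = Counter(zip(syllables, syllables[1:]))
firsts = Counter(syllables[:-1])

def tp(a, b):
    """Forward transition probability P(b | a) estimated from the stream."""
    return pairs[(a, b)] / firsts[a]

# Step 1: the statistical engine proposes bisyllabic candidates with high TP.
candidates = {(a, b) for (a, b) in pairs if tp(a, b) >= 0.5}

# Step 2: the prosodic filter vetoes candidates spanning the boundary.
spanning = {(syllables[prosodic_boundary_after], syllables[prosodic_boundary_after + 1])}
words = candidates - spanning

print(sorted(words))
```

The crucial design feature is that step 1 runs blindly over the whole stream; prosody never stops the TPs from being computed, it only weighs the output, which is exactly the filter-not-blocker claim above.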

Constraints on statistical regularities: Updated

Here, then, is the updated story of constraints on statistical learning. Remember the problem: there are lots of statistical regularities, and only some are really useful or whatever. So how do you constrain the statistical-regularity-extraction engine? Here is an updated answer, from Dick's work, our paper and other stuff randomly thrown in:
1) Statistics are preferentially computed over some units and not others. This was suggested by Bonatti et al. for speech, and might be general.
2) The statistical engine is itself constrained: even with appropriate units, it computes statistics over only some tokens and not others. Dick showed this for tones: interleaved tone sequences in different octaves are perceptually streamed, and transition probabilities (TPs) are computed over the two streams independently.
3) The output of the statistical system is passed through other filters. This seems to be the case for adults in my experiments.
4) The statistical engine might have constraints on what models it chooses. This last point is not yet clear enough in my mind, but the general thrust is something like this: at least in Linguistics, we have all known since Gold that inference is a nasty beast. One possibility is suggested by some empirical work (help! who by?!) which shows (if I remember correctly) that at any point, human adults are predisposed to project the simplest hypothesis compatible with the evidence presented till then. What is not clear is whether, when there are multiple complex possibilities (statistical models), there is a hardwired bias for some and not others. [[Sorry... can't do better than that! Maybe there's nothing in this whole paragraph.]]

[[Csibra+Sperber] + Aslin]

So the updated jigsaw looks like this: what if the constraints on statistical learning themselves come out of the inferences that we make based on social interactions, as in the Csibra+Sperber ideas? Methinks it's worth exploring :)

Concept acquisition in Relevance Theory

[This is parenthetical to the previous] One thing I don't really get is how concepts are acquired in Rel-T. If I understand correctly, Sperber distinguishes (like many, many others, most famously Jerry Fodor) concepts that can be expressed as lexical items (like DOG) from concepts that require phrases of some kind (like SMALL-YAPPY-TYPE-DOGS (cf. Eddie Izzard, Definite Article)). In Rel-T, when someone says something to me, they use 'words', prosody, gestures and all kinds of shared social cues to convey something (call it SPKR-MESG) that is typically something like a phrasal concept, which I should concoct on the fly, given the evidence and my (supposed) inferential skills. BUT, this means that the only way to recover SPKR-MESG is that I already have (a) all the necessary base concepts and (b) all the rules of mentalese syntax. --> It seems pretty clear that there must be some basic concepts: you cannot build phrasal concepts out of nothing; rules of mentalese syntax need something to rule over. --> Rule acquisition is notoriously thorny; the safest bet seems to be that the rules of mental syntax are hardwired as well.

The possibilities

1) The base concepts and mental syntax are hardwired.
2) Base concepts and mental syntax are ontogenic developments.
3) A separate Concept Acquisition Device feeds new concepts into a common store.

Cognitive Jigsaw

Over the last month I heard four people, all of them 'famous'; and the strange thing was that all of them had something in common in the way they described their views of certain aspects of Cognition. Let's see (in order of appearance):
Dan Sperber: Relevance Theory.
Renee Baillargeon: Physical Reasoning in Infancy.
Richard Aslin: Constraints on Statistical Learning.
Gergely Csibra: Pedagogy: A Human-Specific Adaptation.
OK. Starting from the easiest links: Csibra's Pedagogy & Sperber's Relevance. Pedagogy: human communication has some unique characteristics. It is:
Ostensive: makes the communication explicit, e.g. gaze holding ...1
Referential: can be about something that is not commonly available (input-wise) to one or both of the interlocutors ...2
Inferential: makes assumptions about relevant context. ...3
Relevant context! ---> Sperber; Relevance Theory, wherein the very act of communication is seen as an ostensive communicative cue; the assumption is that the listener believes that (in general) the communication will be meaningful in the given context. This is the Communicative Principle of Relevance (see, e.g., this paper). ...4
Relevance Theory is in fact like other Inferential Theories (like Gricean pragmatics), wherein (quote from the paper above):
"... all a communicator has to do in order to convey a thought is to give her audience appropriate evidence of her intention to convey it. More generally, a mental state may be revealed by a behaviour (or by the trace a behaviour leaves in the environment). Behaviour capable of revealing the content of a mental state may also succeed in communicating this content to an audience. For this to happen, it must be used ostensively: that is, it must be displayed so as to make manifest an intention to inform the audience of this content." ...5
Which is pretty much the same thing as in 2 and 3 above. Coming Next: Aslin & Baillargeon