Brownian thought space

Cognitive science, mostly, but more a sometimes structured random walk about things.

Location: Rochester, United States

Chronically curious.

Monday, June 12, 2006

Return of the Subset Principle

This was a talk by Theresa Biberauer & Ian Roberts (T&I) from the University of Cambridge at the DiGS meeting here in Trieste. I SO wish I'd attended more talks there! Anyhow, this was about the Subset Principle:
"the learner must guess the smallest possible language compatible with the input at each stage of the learning procedure" (Clark & Roberts (1993), Linguistic Inquiry 24, 299-345)
The idea: imagine a child learning the relation between some variable x and some variable y. It is well known that the observed (x, y) pairs will vastly under-determine the possible underlying generating mechanisms. For example, {(0,0), (1,1)} is compatible with just about any function you choose. The Subset Principle says the child should stick with the smallest grammar until positive evidence indicates the contrary.

One problem that T&I raise is that, in many cases, there don't seem to be subset relations at all. For example, if you saw only "John(S) walks(V)", you would not know what the Verb-Object order was. But the moment you saw "John(S) Mary(O) kisses(V)" or "John(S) kisses(V) Mary(O)", you would know it was OV (former) or VO (latter). The bottom line, the way I see it: with binary parameters you cannot get a hierarchical nesting of the languages.

But, and this is kind of the point of the paper: if certain parameters are logically necessary for other parameters to operate at all, then you CAN have nesting. Here's how. Imagine (binary) parameters P1, P2, P3, P4, and suppose P2 through P4 are irrelevant if P1 is set to 'No'. Now imagine this is recursive: if P2 is set to 'No', P3 and P4 are irrelevant, and so on. A learner that assumes P1 is 'No' is then left with a small subset of the possible languages. It is a genuine subset: the full set includes all the languages that would have been specified if P1 were 'Yes'. Learning happens when positive evidence is encountered that P1 is actually 'Yes'. That opens up the P1-'Yes' tree, so to speak. Now the learner can assume that P2 is 'No', again considering only a subset of the languages. And so forth.

Here's the nice thing. Imagine there are just a few parameters like P1: super-parameters, if you want. You start off with the default setting of all these super-parameters.
Clearly the assumption is that these default settings make the sub-parameters irrelevant. Whenever you find positive evidence against a default, you suddenly open up the possibility that its sub-parameters are relevant after all, and you then need to look only for the (positive) evidence that sets each of those sub-parameters. And so on. As far as I can tell (and I cannot tell very far, not having a carrying voice), this seems to be something like what that very nice man Pino Longobardi (University of Trieste) is saying. If it's exactly the same, apologies; it took me a while to get it :)
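The dependency-chain idea above can be sketched as a toy simulation. This is my own illustration with made-up function names, not T&I's actual formalism: parameters form a chain in which each one is relevant only if the one above it is 'Yes', the learner starts from the all-'No' default, and it only ever moves to a larger language on positive evidence.

```python
def learn(evidence, n_params=4):
    """Conservative learner over a chain of dependent binary parameters.

    Start with every parameter at the restrictive default 'No'. A parameter
    can be flipped to 'Yes' only if its parent is already 'Yes' (or it is
    the top-level super-parameter) AND positive evidence for it arrives.
    `evidence` is a list of parameter indices witnessed as 'Yes'.
    """
    settings = ['No'] * n_params
    for p in evidence:
        if p == 0 or settings[p - 1] == 'Yes':
            settings[p] = 'Yes'
    return settings

def reachable(settings):
    """How many languages the learner can still end up with.

    Since the learner may only extend its current run of 'Yes' settings,
    the reachable languages form a nested chain: one per possible final
    'Yes'-prefix length. This number shrinks with each parameter opened,
    which is the subset structure doing its work.
    """
    k = 0
    while k < len(settings) and settings[k] == 'Yes':
        k += 1
    return len(settings) - k + 1
```

Note that evidence for a deep parameter is simply inert until its super-parameter has been opened: `learn([2])` leaves everything at 'No', whereas `learn([0, 1])` opens the tree two levels down.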

