Leo Martins' Daily Rant

On Chomsky and the Two Cultures of Statistical Learning — Norvig

I take Chomsky’s points to be the following:

  1. Statistical language models have had engineering success, but that is irrelevant to science.
  2. Accurately modeling linguistic facts is just butterfly collecting; what matters in science (and specifically linguistics) is the underlying principles.
  3. Statistical models are incomprehensible; they provide no insight. (…)

These are my answers:

  1. I agree that engineering success is not the goal or the measure of science. But I observe that science and engineering develop together, and that engineering success shows that something is working right, and so is evidence (but not proof) of a scientifically successful model.
  2. Science is a combination of gathering facts and making theories; neither can progress on its own. I think Chomsky is wrong to push the needle so far towards theory over facts; in the history of science, the laborious accumulation of facts is the dominant mode, not a novelty. The science of understanding language is no different than other sciences in this respect.
  3. I agree that it can be difficult to make sense of a model containing billions of parameters. Certainly a human can’t understand such a model by inspecting the values of each parameter individually. But one can gain insight by examining the properties of the model: where it succeeds and fails, how well it learns as a function of data, etc.

(via http://gosset.wharton.upenn.edu/~foster/rants/)