Download Beyond grammar: an experience-based theory of language by Rens Bod PDF

By Rens Bod

Over the last few years, a new approach to linguistic research has begun to emerge. This approach, which has become known under various labels such as 'data-oriented parsing', 'corpus-based interpretation' and 'treebank grammar', assumes that human language comprehension and production works with representations of concrete past language experiences rather than with abstract grammatical rules. It operates by decomposing the given representations into fragments and recomposing those pieces to analyze (infinitely many) new utterances. This book shows how this general approach can apply to various kinds of linguistic representations. Experiments with this approach suggest that the productive units of natural language cannot be defined by a minimal set of rules or principles, but need to be defined by a large, redundant set of previously experienced structures. Bod argues that this result has important consequences for linguistic theory, leading to an entirely new view of the nature of linguistic competence.


Read Online or Download Beyond grammar: an experience-based theory of language PDF

Similar semantics books

Indefinites (Linguistic Inquiry Monographs)

Indefinites investigates the relationship between the syntactic and semantic representations of sentences within the framework of generative grammar. It proposes a way of relating government-binding theory, which is primarily syntactic, to the semantic theory of noun phrase interpretation developed by Kamp and Heim, and introduces a novel mapping algorithm that describes the relation between syntactic configurations and logical representations.

It's Been Said Before: A Guide to the Use and Abuse of Cliches

Careful writers and speakers agree that clichés are generally to be avoided. However, most of us continue to use them. Why do they persist in our language? In It's Been Said Before, lexicographer Orin Hargraves examines the peculiar idea and power of the cliché. He helps readers understand why certain phrases became clichés and why they should be avoided -- or why they still have life left in them.

Semantics: From meaning to text. Volume 3

This book presents an innovative and novel approach to linguistic semantics, starting from the idea that language can be described as a mechanism for the expression of linguistic Meanings as particular surface forms, or Texts. Semantics is precisely that system of rules that ensures a transition from a Semantic Representation of the meaning of a family of synonymous sentences to the Deep-Syntactic Representation of a particular sentence.

Additional resources for Beyond grammar: an experience-based theory of language

Example text

P(error) ≤ Σ_{i≠0} (1 − (√p_0 − √p_i)²)^N, where the different values of i are indices corresponding to the different parses, 0 is the index of the most probable parse, p_i is the probability of parse i, and N is the number of derivations that was sampled (cf. Hammersley & Handscomb 1964; Deming 1966). This upper bound on the probability of error becomes small if we increase N, but if there is an i with p_i close to p_0 (i.e., if there are different parses at the top of the sampling distribution that are almost equally likely), we must make N very large to achieve this effect. If there is no unique most probable parse, the sampling process will of course not converge on one outcome. In that case, we are interested in all of the parses that outrank all the other ones. But also when the probabilities of the most likely parses are very close together without being exactly equal, we may be interested not in the most probable parse but in the set of all these almost equally highly probable parses. This reflects the situation in which there is an ambiguity that cannot be resolved by probabilistic syntactic considerations. We conclude, therefore, that the task of a syntactic disambiguation component is the calculation of the probability distribution of the various possible parses (and only in the case of a forced-choice experiment may we choose the parse with the highest probability from this distribution). When we estimate this probability distribution by statistical methods, we must establish the reliability of this estimate. This reliability is characterized by the probability of significant errors in the estimates of the probabilities of the various parses.

If a parse has probability p_i, and we try to estimate the probability of this parse by its frequency in a sequence of N independent samples, the variance in the estimated probability is p_i(1 − p_i)/N. Since 0 ≤ p_i(1 − p_i) ≤ 1/4, this variance is at most 1/(4N).
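A minimal sketch of this estimation step in Python, assuming a sampler that returns one random parse per call; the toy three-parse distribution, the function names, and the probability values are illustrative stand-ins, not taken from the book:

    import math
    import random
    from collections import Counter

    def estimate_parse_distribution(sample_parse, n_samples):
        # Estimate each parse's probability by its frequency in N independent
        # samples; the variance of each estimate is p_i(1 - p_i)/N.
        counts = Counter(sample_parse() for _ in range(n_samples))
        return {parse: c / n_samples for parse, c in counts.items()}

    def error_upper_bound(probs, n_samples):
        # Upper bound on the probability of picking a non-most-probable parse:
        # sum over i != 0 of (1 - (sqrt(p_0) - sqrt(p_i))^2)^N.
        p0, *rest = sorted(probs, reverse=True)
        return sum((1 - (math.sqrt(p0) - math.sqrt(p)) ** 2) ** n_samples
                   for p in rest)

    # Toy stand-in for sampling random derivations from a derivation forest:
    # three parses with hypothetical true probabilities 0.5, 0.3, 0.2.
    true_dist = {"parse_A": 0.5, "parse_B": 0.3, "parse_C": 0.2}
    sampler = lambda: random.choices(list(true_dist),
                                     weights=list(true_dist.values()))[0]

    for n in (10, 100, 1000):
        print(n, estimate_parse_distribution(sampler, n),
              "error bound:", error_upper_bound(true_dist.values(), n))

With these illustrative numbers the bound shrinks as N grows, but only slowly when two parse probabilities are close, mirroring the point that near-ties force N to be very large.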

Thus the only (proper) SCFG G' which is strongly equivalent with G consists of the following productions: S → Sb (1) (2). This STSG is also interesting because it can be projected from a DOP1 model whose corpus of sentence-analyses consists only of tree t_1. An STSG G' consisting of three elementary trees is strongly stochastically equivalent with G iff it assigns the same probabilities to the parse trees in the tree language as assigned by G. The tree represented by t_3 has exactly one derivation, which consists of the elementary tree t_3.

But the algorithms that were found either turned out to be still exponential (Sima'an et al.).

2. Monte Carlo disambiguation: estimating the most probable parse by sampling random derivations

Although there is no deterministic polynomial algorithm for finding the most probable parse, there may be an algorithm that estimates a most probable parse with an error that can be made arbitrarily small. We now consider this possibility. We have seen that a best-first search, as accomplished by Viterbi, can be used for finding the most probable derivation in STSG, but not for finding the most probable parse. If we apply, instead of a best-first search, a random-first search, we can generate a random derivation from the derivation forest -- provided that the random choices are based on the probabilities of the subderivations. By iteratively generating a large number of random derivations, we can estimate the most probable parse as the parse which results most often from these random derivations (since the probability of a parse is the probability that any of its derivations occurs). The most probable parse can be estimated as accurately as desired by making the number of random samples sufficiently large. According to the Law of Large Numbers, the most often generated parse converges to the most probable parse. Methods that estimate the probability of an event by taking random samples are known as Monte Carlo methods (Meyer 1956; Hammersley & Handscomb 1964; Motwani & Raghavan 1995).

The selection of a random derivation can be accomplished in a bottom-up fashion analogous to Viterbi. Instead of selecting the most probable subderivation at each node-sharing in the chart, a random subderivation is selected at each node-sharing (in such a way that a subderivation that has m times as large a probability as another subderivation also has m times as large a chance to be chosen as this other subderivation). Once arrived at the S-node, the random derivation of the whole sentence can be retrieved by tracing back the choices made at each node-sharing. We may of course postpone sampling until the S-node, such that we sample directly from the distribution of all S-derivations. But this would take exponential time, since there may be exponentially many derivations for the whole sentence. By sampling bottom-up at every node where ambiguity appears, the maximum number of different subderivations at each node-sharing is bounded by a constant (the total number of rules of that node), and therefore the time complexity of generating a random derivation of an input sentence is equal to the time complexity of finding the most probable derivation, i.e., linear in grammar size G and cubic in sentence length n: O(Gn³). This is exemplified by the following algorithm, which was originally published in Bod (1995b).

Algorithm 1: Sampling a random derivation in O(Gn³) time

Given a derivation forest of a sentence of n words, consisting of labeled entries (i, j) that span the words between the ith and the jth position of the sentence. Every entry is labeled with elementary trees, together with their probabilities, that define the last step in a set of subderivations of the underlying subsentence. Sampling a derivation from the chart consists of choosing at random one of the elementary trees for every root-node at every labeled entry (bottom-up, breadth-first):

for length = 1 to n do
  for start = 0 to n - length do
    for chart-entry (start, start + length) do
      for each root node X do
        select at random an elementary tree with root node X;
        eliminate the other elementary trees with root node X;
        let {(e_1, p_1), (e_2, p_2), ...
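A runnable sketch of this sampling procedure, assuming a toy chart in which each entry maps a root label to its list of (elementary tree, probability) alternatives; the data layout and the two-word example forest are hypothetical stand-ins for a real STSG derivation forest, not Bod's implementation:

    import random

    def sample_random_derivation(chart, n):
        # Random-first search, bottom-up and breadth-first: at every labeled
        # chart entry, keep one randomly chosen elementary tree per root node,
        # so that a subderivation with m times the probability of another
        # also has m times the chance of being chosen.
        choices = {}
        for length in range(1, n + 1):
            for start in range(n - length + 1):
                entry = chart.get((start, start + length), {})
                for root, alternatives in entry.items():
                    trees, probs = zip(*alternatives)
                    choices[(start, start + length, root)] = \
                        random.choices(trees, weights=probs)[0]
        # The derivation of the whole sentence is retrieved by tracing back
        # from the S entry spanning (0, n).
        return choices

    # Hypothetical chart for a 2-word sentence: the leaf entries are
    # unambiguous, while the S entry spanning (0, 2) has two competing trees.
    chart = {
        (0, 1): {"NP": [("t_np", 1.0)]},
        (1, 2): {"VP": [("t_vp", 1.0)]},
        (0, 2): {"S": [("t_s1", 0.75), ("t_s2", 0.25)]},
    }
    print(sample_random_derivation(chart, 2)[(0, 2, "S")])

Repeating this sampler many times and counting which parse each sampled derivation yields gives exactly the frequency estimate used in the Monte Carlo procedure above.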

Download PDF sample

Rated 4.33 of 5 – based on 47 votes