mar table constitutes a set of input data to the particular algorithm, in a similar way in which the sentences to be parsed constitute input data. In this author’s opinion, this is again an oversimplification.

First of all, it is to be noted that, in the view of many programmers, only those data are considered input that are designed to be actually processed. Since the grammar rules are not intended to be subject to processing, but rather to constitute the parameters for processing, they are not input data in any way comparable to the sentences that are to be parsed.

If, on the other hand, the question of processing is to be ignored in deciding what is to be viewed as input data, then another consideration must be taken into account. It is the following: the question as to what constitutes input can not be answered in the absolute, but only relatively. That is, the question is not simply “Is it input?” but “What is it input to?” This means that the answer depends, at least in part, on what portions of the program are previously present in the work space and what additional portions are inputted subsequently. In a bipartite program in which the grammar is written into the algorithm, such as is the case in the approach this author has taken, the question of whether the grammar constitutes input data can then be viewed as follows: while the grammar does not constitute a separate set of input data, it nevertheless will use separate sets of grammatical input data in the form of a grammar-coded dictionary that is fed into the program from a separate source.

Likewise, it is possible to view the executive routine of the algorithm which contains the grammar as the actual parsing algorithm and to view the remaining portions as forms of input data.

6. As can be seen, the argument in favor of the separation of grammar and algorithm is considered far from convincing. It does raise a related question, however: If the major separation is not to be that between grammar and algorithm, what then are the major components of a parsing program?

The answer which this author has found satisfactory is the well-known one of structuring the parsing program as an executive main routine with appropriate subroutines. This raises the further question of the functions and design of the executive routine and subroutines.

In this type of parsing program, the function of the executive routine will be to determine what units to look for and where to look for them. The aim of the subroutines will be to provide the means for carrying out the necessary searches.

The design principle for such a parsing program will be the well-known one of functional subroutinization: the program will contain a set of self-contained and interchangeable subroutines designed to perform individual functions.

The subroutines will be of two kinds: analytic subroutines, the purpose of which will be to perform tasks of linguistic analysis such as the determination of the internal structure and external functioning of the different constructions that are to be recognized, and housekeeping subroutines, which are to insure that the program is at all times aware of where it stands. The latter means the following: the program has to know what word it is dealing with; the program has to know at each step how far a given search is allowed to go and what points it is not allowed to go beyond; the program has to be informed at all times of the necessary location information, such as sentence boundaries,