Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
link-grammar
Bindings for the CMU link-grammar parser system.
TOC »
- Outdated egg!
- link-grammar
- Link Grammar
- Author
- Upstream
- Egg Source Code
- Example Usage
- Simple Use
- Sentences
- Dictionary
- Linkages
- create-linkage
- delete-linkage!
- num-words
- num-links
- link-length
- get-lword
- get-rword
- link-label
- link-llabel
- link-rlabel
- num-domains
- link-domain-names
- get-words
- get-word
- disjunct-str
- disjunct-cost
- disjunct-corpus-score
- get-constituents
- get-diagram
- get-postscript
- get-disjuncts
- get-links-domains
- unused-word-cost
- disjunct-cost
- link-cost
- corpus-cost
- linkage->eps-file
- get-version
- get-dictionary-version
- get-dictionary-locale
- display-off
- display-multi-line
- display-bracket-tree
- display-single-line
- display-max-styles
- set-display-morphology!
- get-display-morphology
- Parse Options
- License
- About this egg
Link Grammar
The link grammar parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words. The parser also produces a 'constituent' representation of a sentence (showing noun phrases, verb phrases, etc.).
Author
David Ireland (djireland79 at gmail dot com)
Upstream
https://www.abisource.com/projects/link-grammar/
Egg Source Code
https://gitlab.com/maxwell79/chicken-link-grammar
link-grammar
[module] link-grammar
Documentation
- parse-with-default
- parse-sentence
- display-off
- display-multi-line
- display-bracket-tree
- display-single-line
- display-max-styles
- create-default-dictionary
- create-dictionary-with-language
- get-verbosity
- get-version
- get-dictionary-version
- get-dictionary-locale
- get-dictionary-language
- get-dictionary-data-dir
- set-dictionary-data-dir!
- delete-dictionary!
- create-sentence
- split-sentence
- sentence-length
- sentence-null-count
- sentence-disjunct-cost
- sentence-link-cost
- linkages-found
- linkages-post-processed
- linkages-violated
- valid-linkages
- delete-sentence!
- create-linkage
- corpus-cost
- get-lword
- get-rword
- get-words
- get-word
- get-constituents
- get-diagram
- get-postscript
- get-disjuncts
- get-links-domains
- get-violation-name
- link-length
- link-label
- link-llabel
- link-rlabel
- link-cost
- link-domain-names
- num-words
- num-links
- num-domains
- unused-word-cost
- delete-linkage!
- init-opts
- set-max-parse-time!
- set-linkage-limit!
- set-short-length!
- set-disjunct-cost!
- set-min-null-count!
- set-max-null-count!
- set-islands-ok!
- set-verbosity!
- resources-exhausted?
- memory-exhausted?
- timer-expired?
- reset-resources!
- delete-parse-options!
Example Usage
(import scheme)
(cond-expand
  (chicken-4 (use (prefix link-grammar lg:)))
  (chicken-5 (import (prefix link-grammar lg:))))

;; Print the constituents and the diagram of every linkage from index onward.
(define (display-linkage sentence opts index)
  (let* ((links-found (lg:linkages-found sentence))
         (linkage (lg:create-linkage index sentence opts)))
    (when linkage
      (let ((constituents (lg:get-constituents linkage lg:display-multi-line))
            (diagram (lg:get-diagram linkage #t 80)))
        (print constituents)
        (print diagram)
        (lg:delete-linkage! linkage)))
    ;; Linkage indices start at 0, so stop before reaching links-found.
    (when (< (+ index 1) links-found)
      (display-linkage sentence opts (+ index 1)))))

;; Parse text; if no linkage is found, retry with null links allowed.
(define (parse text dictionary opts)
  (let* ((sentence (lg:create-sentence text dictionary))
         (num-linkages (lg:parse-sentence sentence opts)))
    (when (= num-linkages 0)
      (lg:set-min-null-count! opts 1)
      (lg:set-max-null-count! opts (lg:sentence-length sentence))
      (set! num-linkages (lg:parse-sentence sentence opts)))
    (display-linkage sentence opts 0)
    (lg:delete-sentence! sentence)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))

(lg:set-verbosity! opts 1)
(lg:set-max-parse-time! opts 30)
(lg:set-linkage-limit! opts 1000)
(lg:set-min-null-count! opts 0)
(lg:set-max-null-count! opts 0)
(lg:set-short-length! opts 16)
(lg:set-islands-ok! opts #f)

(parse "The black fox ran from the hunters" dictionary opts)

(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
(S (NP the black.a fox.n) (VP ran.v-d (PP from (NP the hunters.n))))
    +------------------------Xp------------------------+
    +----------->WV----------->+                       |
    +---------Wd--------+      |                       |
    |      +----Ds**x---+      |      +----Jp----+     |
    |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
    |      |     |      |      |      |   |      |     |      |
LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL
(S (NP the black.a fox.n) (VP ran.v-d (PP from (NP the hunters.n))))
    +------------------------Xp------------------------+
    +---------Wd--------+                              |
    |      +----Ds**x---+             +----Jp----+     |
    |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
    |      |     |      |      |      |   |      |     |      |
LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL
Simple Use
Parse a text using default values for the dictionary and parser
parse-with-default
- [procedure] parse-with-default text
Parses text using default values for the dictionary and parser
- text
- string to parse
Sentences
A sentence is the API's representation of an input string, tokenized and interpreted according to a specific Dictionary. After a Sentence is created and parsed, various attributes of the resulting set of linkages can be obtained.
create-sentence
- [procedure] create-sentence input dictionary
Creates a sentence object from the input string, using a previously created Dictionary to tokenize the input and define its words
- input
- Input string (string)
- dictionary
- dictionary to use
delete-sentence!
- [procedure] delete-sentence! sentence
Deletes the specified sentence
- sentence
- Sentence to be deleted (sentence)
split-sentence
- [procedure] split-sentence sentence parse-options
Splits (tokenizes) the sentence into its component words and punctuation. This includes splitting certain run-on expressions, such as '12ft.', which is split into '12' and 'ft.'. If spell-guessing is enabled in the parse options, the tokenizer will also separate most run-on words, i.e. pairs of words without an intervening space. This routine returns zero if successful, or a non-zero value if an error occurred.
- sentence
- Sentence to split (sentence)
- parse-options
parse-sentence
- [procedure] parse-sentence sentence parse-options
This routine represents the heart of the program. Several things are done when a sentence is parsed:
1. Word expressions are extracted from the dictionary and pruned.
2. Disjuncts are built.
3. A series of pruning operations is carried out.
4. The linkages having the minimal number of null links are counted.
5. A 'parse set' of linkages is built.
6. The linkages are post-processed.
The 'parse set' is attached to the sentence, and this is one of the key reasons that the API is flexible and modular. All of the necessary information for building linkages is stored in the parse set. This means that other sentences can be parsed, possibly using different dictionaries and other parameters, without disturbing the information obtained from a call to parse-sentence. If another call to parse-sentence is made on the same sentence, the parsing information for the previous call is deleted. Like almost all of the other routines, this call is thread-safe: that is, sentences can be parsed concurrently in multiple threads.
- sentence
- parse-options
sentence-length
- [procedure] sentence-length sentence
Returns the length of the sentence, that is, the number of words in the tokenized sentence
- sentence
sentence-null-count
- [procedure] sentence-null-count
Returns the number of words that failed to be linked into the rest of the sentence during parsing. This number is greater than zero whenever a word doesn't seem to fit anywhere in the parse, either due to poor grammar or due to a shortcoming of the dictionary.
linkages-found
- [procedure] linkages-found
Returns the number of linkages that the search found
valid-linkages
- [procedure] valid-linkages
Returns the number of linkages that had no post-processing violations
linkages-post-processed
- [procedure] linkages-post-processed
Returns the number of linkages that were actually post-processed
linkages-violated
- [procedure] linkages-violated
Returns the number of post-processing violations that the i-th linkage had during the last call to parse-sentence.
sentence-disjunct-cost
- [procedure] sentence-disjunct-cost sentence index
Returns the sum of the costs of all of the disjuncts used in the i-th linkage of the sentence. The higher the cost, the less likely it is that the parse is correct. Very roughly, this can be interpreted as (minus) the log-likelihood of a parse being correct.
- sentence
- index
sentence-link-cost
- [procedure] sentence-link-cost sentence index
Returns the sum of the lengths of the links in the i-th parse. The ratio of this sum to the total length of the sentence gives a rough measure of the complexity of the sentence. That is, long-range links between distant words indicate that the sentence may be hard to understand; alternatively, they may indicate that the parse is not very accurate.
- sentence
- index
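The two sentence-level costs can be inspected for each linkage in turn. The following is a minimal sketch, assuming the egg is loaded under CHICKEN 4 with the lg: prefix from Example Usage, that a default dictionary is installed, and that the value returned by parse-sentence is the number of linkages (as Example Usage treats it). print-linkage-costs is a hypothetical helper, not part of the egg.

```scheme
(use (prefix link-grammar lg:))

;; Hypothetical helper: print both costs, and the link cost per word,
;; for the first count linkages of an already parsed sentence.
(define (print-linkage-costs sentence count)
  (let ((len (lg:sentence-length sentence)))
    (do ((i 0 (+ i 1)))
        ((>= i count))
      (let ((dcost (lg:sentence-disjunct-cost sentence i))
            (lcost (lg:sentence-link-cost sentence i)))
        (print "linkage " i
               ": disjunct cost " dcost
               ", link cost " lcost
               ", link cost per word " (/ lcost len))))))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

;; Assumption: parse-sentence returns the number of linkages.
(print-linkage-costs sentence (lg:parse-sentence sentence opts))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```

A lower disjunct cost suggests a more plausible parse, so sorting linkages by that value is one way to pick a preferred reading.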
Dictionary
A Dictionary is the programmer's handle on the set of word definitions that defines the grammar. A user creates a Dictionary from a grammar file and post-process knowledge file, and then passes it to the various parsing routines.
create-dictionary-with-language
- [procedure] create-dictionary-with-language language
Creates a dictionary with the specified language
- language
- Language to use (string)
create-default-dictionary
- [procedure] create-default-dictionary
Looks for a dictionary in the same language as the current environment, and if one is found, creates a dictionary object.
get-dictionary-language
- [procedure] get-dictionary-language dictionary
Returns the language of the specified dictionary
- dictionary
- specified dictionary (dictionary)
delete-dictionary!
- [procedure] delete-dictionary! dictionary
Deletes the specified dictionary
- dictionary
- specified dictionary (dictionary)
set-dictionary-data-dir!
- [procedure] set-dictionary-data-dir! path
Specify the file path to the dictionaries to use; to be effective, this routine must be called before the dictionaries are opened.
- path
- Filename with path
get-dictionary-data-dir
- [procedure] get-dictionary-data-dir
Returns the file path to the dictionaries
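The dictionary procedures above can be combined as in the following sketch. It assumes CHICKEN 4 with the egg installed, that "en" names an installed language, and that create-dictionary-with-language yields a false value when no dictionary is found; none of these is stated by this page. describe-dictionary is a hypothetical helper.

```scheme
(use (prefix link-grammar lg:))

;; Hypothetical helper: open the dictionary for a language, report its
;; properties and release it again.
(define (describe-dictionary language)
  (let ((dictionary (lg:create-dictionary-with-language language)))
    ;; Assumption: a false value signals that no dictionary was found.
    (if dictionary
        (begin
          (print "language: " (lg:get-dictionary-language dictionary))
          (print "version:  " (lg:get-dictionary-version dictionary))
          (lg:delete-dictionary! dictionary))
        (print "no dictionary found for " language))))

;; "en" is an assumed language name; substitute an installed language.
(describe-dictionary "en")
```

If the dictionaries live in a non-default location, set-dictionary-data-dir! has to be called before the dictionary is created, as noted in its entry.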
Linkages
create-linkage
- [procedure] create-linkage
Creates the index-th linkage from the (parsed) sentence. Several operations can be carried out on the resulting linkage; for example, it can be printed, post-processed with a different post-processor, or information on individual links can be extracted. If the parse has a conjunction, then the linkage will be made up of two or more sublinkages.
delete-linkage!
- [procedure] delete-linkage! linkage
Deletes the given linkage
- linkage
num-words
- [procedure] num-words linkage
The number of words in the sentence for which this is a linkage.
- linkage
num-links
- [procedure] num-links linkage
The number of links used in the linkage.
- linkage
link-length
- [procedure] link-length linkage index
The value returned is the number of words spanned by the index-th link of the linkage.
- linkage
- index
- (number)
get-lword
- [procedure] get-lword
The value returned is the number of the word on the left end of the index-th link of the current sublinkage.
get-rword
- [procedure] get-rword
The value returned is the number of the word on the right end of the index-th link of the current sublinkage.
link-label
- [procedure] link-label linkage index
The label on a link in a diagram is constructed by taking the 'intersection' of the left and right connectors that comprise the link. For example, in 'I.p eat, therefore I.p think.v', the Sp*i label on the link between the words I.p and eat is constructed from the Sp*i connector on its left word and the Sp connector on its right word. So, for this example, both link-label and link-llabel return 'Sp*i', while link-rlabel returns 'Sp' for this link.
- linkage
- index
link-llabel
- [procedure] link-llabel linkage index
See link-label
- linkage
- index
link-rlabel
- [procedure] link-rlabel linkage index
See link-label
- linkage
- index
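The per-link accessors above can be used to walk over a linkage. The following is a minimal sketch, assuming CHICKEN 4 with the egg and a default dictionary installed, that link indices start at 0, and that create-linkage takes its arguments in the order shown in Example Usage. print-links is a hypothetical helper.

```scheme
(use (prefix link-grammar lg:))

;; Hypothetical helper: print one line per link of a linkage.
;; Assumption: link indices run from 0 to (num-links linkage) - 1.
(define (print-links linkage)
  (let ((n (lg:num-links linkage)))
    (do ((i 0 (+ i 1)))
        ((>= i n))
      (print "link " i ": " (lg:link-label linkage i)
             " (left " (lg:link-llabel linkage i)
             ", right " (lg:link-rlabel linkage i)
             "), length " (lg:link-length linkage i)))))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(when (> (lg:parse-sentence sentence opts) 0)
  ;; Argument order of create-linkage follows Example Usage.
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      (print-links linkage)
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```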
num-domains
- [procedure] num-domains linkage index
num-domains and link-domain-names allow access to most of the domain structure extracted during post-processing. The index parameter in both calls specifies which link in the linkage to extract the information for. In the 'I eat therefore I think' example above, the link between the words therefore and I.p belongs to two 'm' domains. If the linkage violated any post-processing rules, the name of the violated rule in the post-process knowledge file can be determined by a call to get-violation-name.
- linkage
- index
link-domain-names
- [procedure] link-domain-names linkage word-index
Gets domain structure extracted during the post-processing
- linkage
- word-index
- Specifies which link in the linkage to extract the information for.
get-words
- [procedure] get-words linkage
Returns the array of word spellings for the linkage. These are the subscripted spellings, such as 'dog.n'. The original spellings can be obtained by calls to sentence-get-word.
- linkage
get-word
- [procedure] get-word linkage word-number
Returns the word spelling of an individual word
- linkage
- word-number
- The specific word
disjunct-str
- [procedure] disjunct-str linkage word-number
Returns a string showing the disjuncts that were actually used in association with the specified word in the current linkage. The string shows the disjuncts in proper order; that is, left-to-right, in the order in which they link to other words. The returned string can be thought of as a very precise part-of-speech-like label for the word, indicating how it was used in the given sentence; this can be useful for corpus statistics.
- linkage
- The specific linkage
- word-number
- The specific word
disjunct-cost
- [procedure] disjunct-cost
Returns the cost of a word as used in a particular linkage, based on the dictionary.
disjunct-corpus-score
- [procedure] disjunct-corpus-score
Returns the cost based on the corpus-statistics database.
get-constituents
- [procedure] get-constituents linkage display-style
Returns the constituents for a particular linkage
- linkage
- display-style
- (number)
get-diagram
- [procedure] get-diagram linkage display-walls? screen-width
Returns the linkage diagram
- linkage
- display-walls?
- A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
- screen-width
- The screen-width is an integer, indicating the number of columns that should be used during printing; long sentences that are wider than the number of columns will be automatically wrapped so that they always fit.
get-postscript
- [procedure] get-postscript linkage display-walls? print-ps-header?
Returns the macros needed to print out the linkage in a postscript file.
- linkage
- display-walls?
- A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
- print-ps-header?
- A boolean that indicates whether or not postscript header boilerplate should be included.
get-disjuncts
- [procedure] get-disjuncts linkage
Returns a string that shows all of the disjuncts, and their costs, that were used to create the linkage.
- linkage
get-links-domains
- [procedure] get-links-domains linkage
Returns a string that lists all of the links and domain names for the linkage.
- linkage
unused-word-cost
- [procedure] unused-word-cost linkage
Should return the same value as sentence-null-count.
- linkage
disjunct-cost
- [procedure] disjunct-cost linkage
Should return the same value as sentence-disjunct-cost.
- linkage
link-cost
- [procedure] link-cost linkage
Should return the same value as sentence-link-cost.
- linkage
corpus-cost
- [procedure] corpus-cost linkage
Returns the total cost of this particular linkage, based on the cost of disjuncts stored in the corpus-statistics database.
- linkage
linkage->eps-file
- [procedure] linkage->eps-file filename postscript
Saves a linkage to a postscript file
- path
- filename
- postscript
- Postscript string
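get-postscript and linkage->eps-file are naturally used together. The following is a minimal sketch, assuming CHICKEN 4 with the egg and a default dictionary installed, and the create-linkage argument order from Example Usage. save-first-linkage and the output name "fox.eps" are hypothetical.

```scheme
(use (prefix link-grammar lg:))

;; Hypothetical helper: parse text and write its first linkage, if any,
;; to filename as PostScript.
(define (save-first-linkage text filename)
  (let* ((dictionary (lg:create-default-dictionary))
         (opts (lg:init-opts))
         (sentence (lg:create-sentence text dictionary)))
    (when (> (lg:parse-sentence sentence opts) 0)
      (let ((linkage (lg:create-linkage 0 sentence opts)))
        (when linkage
          ;; #t #t: draw the wall words and include the PostScript header.
          (lg:linkage->eps-file filename (lg:get-postscript linkage #t #t))
          (lg:delete-linkage! linkage))))
    (lg:delete-sentence! sentence)
    (lg:delete-parse-options! opts)
    (lg:delete-dictionary! dictionary)))

;; "fox.eps" is an arbitrary output name chosen for this sketch.
(save-first-linkage "The black fox ran from the hunters" "fox.eps")
```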
get-version
- [procedure] get-version
Gets link-grammar version
get-dictionary-version
- [procedure] get-dictionary-version dictionary
Gets dictionary version
- dictionary
- Dictionary
get-dictionary-locale
- [procedure] get-dictionary-locale
Gets dictionary locale
display-off
- [constant] display-off
Turn off display
display-multi-line
- [constant] display-multi-line
Print diagram across multiple lines
display-bracket-tree
- [constant] display-bracket-tree
Use brackets when printing diagram
display-single-line
- [constant] display-single-line
Print diagram on single line
display-max-styles
- [constant] display-max-styles
Print diagram on single line
set-display-morphology!
- [procedure] set-display-morphology! parse-options value
Sets display morphology in parse-options
- parse-options
- value
- (number)
get-display-morphology
- [procedure] get-display-morphology parse-options
Gets display morphology value
- parse-options
Parse Options
Parse-options specify the different parameters that are used to parse sentences. Examples of the kinds of things that are controlled by parse-options include maximum parsing time and memory, whether to use null-links, and whether or not to use 'panic' mode. This data structure is passed in to the various parsing and printing routines along with the sentence.
Default values for parse-option members are:
verbosity → 0
linkage-limit → 10000
min-null-count → 0
max-null-count → 0
null-block → 1
islands-ok → #f
short-length → 6
all-short → #f
display-short → #t
display-word-subscripts → #t
display-link-subscripts → #t
display-walls → #f
allow-null → #t
echo-on → #f
batch-mode → #f
panic-mode → #f
screen-width → 79
display-on → #t
display-postscript → #f
display-bad → #f
display-links → #f
init-opts
- [procedure] init-opts
Initialise parse-options to default values
set-max-parse-time!
- [procedure] set-max-parse-time! parse-options value
Set maximum parse time
- parse-options
- value
- (number)
set-linkage-limit!
- [procedure] set-linkage-limit! parse-options linkage-limit
Set linkage limit
- parse-options
- linkage-limit
- (number)
set-short-length!
- [procedure] set-short-length! parse-options short-length
The short-length parameter determines how long the links are allowed to be. It is intended to speed up parsing by not considering very long links for most connectors, since they are very rarely used in a correct parse. An entry for UNLIMITED-CONNECTORS in the dictionary specifies which connectors are exempt from the length limit.
- parse-options
- short-length
- (number)
set-disjunct-cost!
- [procedure] set-disjunct-cost! parse-options disjunct-cost
Determines the maximum disjunct cost used during parsing, where the cost of a disjunct is equal to the maximum cost of all of its connectors. The default is that only disjuncts up to a cost of 2.9 are considered.
- parse-options
- disjunct-cost
set-min-null-count!
- [procedure] set-min-null-count! parse-options null-count
When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.
- parse-options
- null-count
set-max-null-count!
- [procedure] set-max-null-count! parse-options null-count
When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.
- parse-options
- null-count
reset-resources!
- [procedure] reset-resources! parse-options
Reset acquired resources
- parse-options
resources-exhausted?
- [procedure] resources-exhausted? parse-options
Checks whether resources were exhausted during parsing, that is, whether memory-exhausted? or timer-expired? holds
- parse-options
memory-exhausted?
- [procedure] memory-exhausted? parse-options
Checks whether the memory was exhausted during parsing
- parse-options
timer-expired?
- [procedure] timer-expired? parse-options
Checks whether the timer was exceeded during parsing.
- parse-options
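The resource predicates are typically consulted right after a parse. The following is a minimal sketch, assuming CHICKEN 4 with the egg and a default dictionary installed, and that set-max-parse-time! takes a value in seconds (as in the upstream C API; this page does not state the unit). parse-with-time-limit is a hypothetical helper.

```scheme
(use (prefix link-grammar lg:))

;; Hypothetical helper: parse sentence under a time limit and report why
;; parsing stopped early, if it did. Returns the value of parse-sentence.
(define (parse-with-time-limit sentence opts limit)
  ;; Assumption: limit is given in seconds.
  (lg:set-max-parse-time! opts limit)
  (let ((count (lg:parse-sentence sentence opts)))
    (when (lg:resources-exhausted? opts)
      (print (if (lg:timer-expired? opts)
                 "parse stopped early: timer expired"
                 "parse stopped early: memory exhausted"))
      ;; Reset the resource state so that opts can be used for another parse.
      (lg:reset-resources! opts))
    count))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(print "result of parse-sentence: " (parse-with-time-limit sentence opts 5))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```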
set-islands-ok!
- [procedure] set-islands-ok! parse-options islands-ok?
This option determines whether or not 'islands' of links are allowed.
- parse-options
- islands-ok?
- A boolean to indicate whether islands are allowed
set-verbosity!
- [procedure] set-verbosity! parse-options verbosity-level
Sets the level of description printed to stderr/stdout about the parsing process.
- parse-options
- verbosity-level
get-verbosity
- [procedure] get-verbosity parse-options
Get the verbosity level
- parse-options
delete-parse-options!
- [procedure] delete-parse-options! parse-options
Delete a parse-option object
- parse-options
License
This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
About this egg
Author
Repository
https://gitlab.com/maxwell79/chicken-link-grammar
License
LGPL-2.1
Dependencies
Versions
Colophon
Documented by hahn.