link-grammar

Outdated egg!

This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.

If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.

Bindings for the CMU link-grammar parser system.

The link grammar parser is a syntactic parser of English, based on link grammar, an original theory of English syntax. Given a sentence, the system assigns to it a syntactic structure, which consists of a set of labeled links connecting pairs of words. The parser also produces a 'constituent' representation of a sentence (showing noun phrases, verb phrases, etc.).

Author

David Ireland (djireland79 at gmail dot com)

Upstream

https://www.abisource.com/projects/link-grammar/

Egg Source Code

https://gitlab.com/maxwell79/chicken-link-grammar

[module] link-grammar

Documentation

Example Usage

(import scheme)
(cond-expand
 (chicken-4
   (use (prefix link-grammar lg:)))
 (chicken-5
   (import (prefix link-grammar lg:))))

;; Print the constituent tree and link diagram of every linkage from index on.
(define (display-linkage sentence opts index)
  (let* ((links-found (lg:linkages-found sentence))
         (linkage (lg:create-linkage index sentence opts)))
    (when linkage
      (print (lg:get-constituents linkage lg:display-multi-line))
      (print (lg:get-diagram linkage #t 80))
      (lg:delete-linkage! linkage))
    ;; Linkage indices run from 0 to links-found - 1.
    (when (< (+ index 1) links-found)
      (display-linkage sentence opts (+ index 1)))))

;; Parse text and print its linkages; retry with null links if none are found.
(define (parse text dictionary opts)
  (let* ((sentence (lg:create-sentence text dictionary))
         (num-linkages (lg:parse-sentence sentence opts)))
    (when (= num-linkages 0)
      (lg:set-min-null-count! opts 1)
      (lg:set-max-null-count! opts (lg:sentence-length sentence))
      (set! num-linkages (lg:parse-sentence sentence opts)))
    (display-linkage sentence opts 0)
    (lg:delete-sentence! sentence)))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(lg:set-verbosity! opts 1)
(lg:set-max-parse-time! opts 30)
(lg:set-linkage-limit! opts 1000)
(lg:set-min-null-count! opts 0)
(lg:set-max-null-count! opts 0)
(lg:set-short-length! opts 16)
(lg:set-islands-ok! opts #f)

(parse "The black fox ran from the hunters" dictionary opts)

(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)

Output:

 (S (NP the black.a fox.n)
            (VP ran.v-d
                (PP from
                    (NP the hunters.n))))
 +------------------------Xp------------------------+       
 +----------->WV----------->+                       |       
 +---------Wd--------+      |                       |       
 |      +----Ds**x---+      |      +----Jp----+     |       
 |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
 |      |     |      |      |      |   |      |     |      |
 LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL
 (S (NP the black.a fox.n)
    (VP ran.v-d
        (PP from
            (NP the hunters.n))))
 +------------------------Xp------------------------+       
 +---------Wd--------+                              |       
 |      +----Ds**x---+             +----Jp----+     |       
 |      |     +---A--+--Ss--+--MVp-+   +--Dmc-+     +--RW--+
 |      |     |      |      |      |   |      |     |      |
 LEFT-WALL the black.a fox.n ran.v-d from the hunters.n . RIGHT-WALL

Simple Use

Parse a text using default values for the dictionary and parser

parse-with-default

[procedure] (parse-with-default text)

Parse text using default values

text
string to parse

Sentences

A sentence is the API's representation of an input string, tokenized and interpreted according to a specific Dictionary. After a Sentence is created and parsed, various attributes of the resulting set of linkages can be obtained.

create-sentence

[procedure] (create-sentence input dictionary)

Creates a sentence object from the input string, using the dictionary that was created earlier to tokenize and define words.

input
Input string (string)
dictionary
dictionary to use

delete-sentence!

[procedure] (delete-sentence! sentence)

Deletes the specified sentence

sentence
Sentence to be deleted (sentence)

split-sentence

[procedure] (split-sentence sentence parse-options)

Splits (tokenizes) the sentence up into its component words and punctuation. This includes splitting up certain run-on expressions, such as '12ft.' which is split into '12' and 'ft.'. If spell-guessing is enabled in the parse options, the tokenizer will also separate most run-on words, i.e. pairs of words without an intervening space. This routine returns zero if successful, or a non-zero value if an error occurred.

sentence
Sentence to split (sentence)
parse-options

parse-sentence

[procedure] (parse-sentence sentence parse-options)

This routine represents the heart of the program. Several things are done when a sentence is parsed:

1. Word expressions are extracted from the dictionary and pruned.
2. Disjuncts are built.
3. A series of pruning operations is carried out.
4. The linkages having the minimal number of null links are counted.
5. A 'parse set' of linkages is built.
6. The linkages are post-processed.

The 'parse set' is attached to the sentence, and this is one of the key reasons that the API is flexible and modular. All of the necessary information for building linkages is stored in the parse set. This means that other sentences can be parsed, possibly using different dictionaries and other parameters, without disturbing the information obtained from a call to parse-sentence. If another call to parse-sentence is made on the same sentence, the parsing information for the previous call is deleted. Like almost all of the other routines, this call is thread-safe: that is, sentences can be parsed concurrently in multiple threads.

sentence
parse-options

sentence-length

[procedure] (sentence-length sentence)

Returns the length of the sentence

sentence

sentence-null-count

[procedure] sentence-null-count

Returns the number of words that failed to be linked into the rest of the sentence during parsing. This number is greater than zero whenever a word doesn't seem to fit anywhere in the parse, either due to poor grammar or due to a shortcoming of the dictionary.

linkages-found

[procedure] (linkages-found sentence)

Returns the number of linkages that the search found

valid-linkages

[procedure] valid-linkages

Returns the number of linkages that had no post-processing violations

linkages-post-processed

[procedure] linkages-post-processed

Returns the number of linkages that were actually post-processed

linkages-violated

[procedure] linkages-violated

Returns the number of post-processing violations that the i-th linkage had during the last call to parse-sentence.

sentence-disjunct-cost

[procedure] (sentence-disjunct-cost sentence index)

Returns the sum total of all of the costs of all of the disjuncts used in the i-th linkage of the sentence. The higher the cost, the less likely that the parse is correct. Very roughly, this can be interpreted as (minus) the log-likelihood of a parse being correct.

sentence
index
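
As a rough illustration of sentence-disjunct-cost, the sketch below lists the disjunct cost of every linkage of a parsed sentence. linkage-costs is a hypothetical helper, not part of the egg, and the sketch assumes that linkage indices start at 0 and that linkages-found takes the sentence as its only argument, as in the example at the top of this page.

```scheme
;; CHICKEN 4 syntax; on CHICKEN 5 write (import (prefix link-grammar lg:)).
(use (prefix link-grammar lg:))

;; Hypothetical helper: return a list of (index . cost) pairs,
;; one per linkage found for an already parsed sentence.
(define (linkage-costs sentence)
  (let ((n (lg:linkages-found sentence)))
    (let loop ((i 0) (acc '()))
      (if (< i n)
          (loop (+ i 1)
                (cons (cons i (lg:sentence-disjunct-cost sentence i)) acc))
          (reverse acc)))))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(lg:parse-sentence sentence opts)

(for-each
 (lambda (entry)
   (print "linkage " (car entry) ": disjunct cost " (cdr entry)))
 (linkage-costs sentence))

;; Release the objects in the reverse order of their creation.
(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```

Lower costs come first in importance: according to the description above, a linkage with a lower disjunct cost is more likely to be a correct parse.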

sentence-link-cost

[procedure] (sentence-link-cost sentence index)

Returns the sum of the lengths of the links in the i-th parse. The ratio of this length to the total length of the sentence gives a rough measure of the complexity of the sentence. That is, long-range links between distant words indicate that the sentence may be hard to understand; alternatively, they may indicate that the parse is not very accurate.

sentence
index

Dictionary

A Dictionary is the programmer's handle on the set of word definitions that defines the grammar. A user creates a Dictionary from a grammar file and post-process knowledge file, and then passes it to the various parsing routines.

create-dictionary-with-language

[procedure] (create-dictionary-with-language language)

Creates a dictionary with the specified language

language
Language to use (string)

create-default-dictionary

[procedure] (create-default-dictionary)

Looks for a dictionary in the same language as the current environment, and if one is found, creates a dictionary object.

get-dictionary-language

[procedure] (get-dictionary-language dictionary)

Returns the language of the specified dictionary

dictionary
specified dictionary (dictionary)

delete-dictionary!

[procedure] (delete-dictionary! dictionary)

Deletes the specified dictionary

dictionary
specified dictionary (dictionary)

set-dictionary-data-dir!

[procedure] (set-dictionary-data-dir! path)

Specify the file path to the dictionaries to use; to be effective, this routine must be called before the dictionaries are opened.

path
Filename with path

get-dictionary-data-dir

[procedure] get-dictionary-data-dir

Returns the file path to the dictionaries
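
The dictionary procedures above might be combined as in the following sketch. The language code "en" and the commented-out directory path are assumptions for illustration, and get-dictionary-data-dir is assumed to take no arguments, since none are documented.

```scheme
;; CHICKEN 4 syntax; on CHICKEN 5 write (import (prefix link-grammar lg:)).
(use (prefix link-grammar lg:))

;; Hypothetical path. If used, this must run before any dictionary is opened.
;; (lg:set-dictionary-data-dir! "/usr/local/share/link-grammar")

;; "en" is assumed to be the code of an installed dictionary.
(define dictionary (lg:create-dictionary-with-language "en"))

(print "language: " (lg:get-dictionary-language dictionary))
;; Assumption: get-dictionary-data-dir is called without arguments.
(print "data dir: " (lg:get-dictionary-data-dir))

(lg:delete-dictionary! dictionary)
```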

Linkages

create-linkage

[procedure] (create-linkage index sentence parse-options)

This function creates the index-th linkage from the (parsed) sentence. Several operations can be carried out on the resulting linkage; for example, it can be printed, post-processed with a different post-processor, or information on individual links can be extracted. If the parse has a conjunction, then the linkage will be made up of two or more sublinkages.

delete-linkage!

[procedure] (delete-linkage! linkage)

Delete the given linkage

linkage

num-words

[procedure] (num-words linkage)

The number of words in the sentence for which this is a linkage.

linkage

num-links

[procedure] (num-links linkage)

The number of links used in the linkage.

linkage

The value returned is the number of words spanned by the index-th link of the linkage.

linkage
index
(number)

get-lword

[procedure] get-lword

The value returned is the number of the word on the left end of the index-th link of the current sublinkage.

get-rword

[procedure] get-rword

The value returned is the number of the word on the right end of the index-th link of the current sublinkage.

link-label

[procedure] (link-label linkage index)

The label on a link in a diagram is constructed by taking the 'intersection' of the left and right connectors that comprise the link. For example, in 'I.p eat, therefore I.p think.v', the Sp*i label on the link between the words I.p and eat is constructed from the Sp*i connector on its left word and the Sp connector on its right word. So, for this example, both link-label and link-llabel return 'Sp*i' while link-rlabel returns 'Sp' for this link.

linkage
index

link-llabel

[procedure] (link-llabel linkage index)

See link-label

linkage
index

link-rlabel

[procedure] (link-rlabel linkage index)

See link-label

linkage
index

num-domains

[procedure] (num-domains linkage index)

num-domains and link-domain-names allow access to most of the domain structure extracted during post-processing. The index parameter in these calls specifies which link in the linkage to extract the information for. In the 'I eat therefore I think' example above, the link between the words therefore and I.p belongs to two 'm' domains. If the linkage violated any post-processing rules, the name of the violated rule in the post-process knowledge file can be determined by a call to get-violation-name.

linkage
index

link-domain-names

[procedure] (link-domain-names linkage word-index)

Gets the domain structure extracted during post-processing

linkage
word-index
Specifies which link in the linkage to extract the information for.

get-words

[procedure] (get-words linkage)

Returns the array of word spellings or individual word spelling for the linkage. These are the subscripted spellings, such as 'dog.n'. The original spellings can be obtained by calls to sentence-get-word.

linkage

get-word

[procedure] (get-word linkage word-number)

Returns the word spelling of an individual word

linkage
word-number
The specific word
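
A minimal sketch of num-words and get-word, printing each subscripted word of the first linkage on its own line. It assumes that word numbers are zero-based and that create-linkage returns #f when no linkage is available, as the guard in the example at the top of this page suggests.

```scheme
;; CHICKEN 4 syntax; on CHICKEN 5 write (import (prefix link-grammar lg:)).
(use (prefix link-grammar lg:))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      ;; Assumption: word numbers run from 0 to (num-words linkage) - 1.
      (let loop ((i 0))
        (when (< i (lg:num-words linkage))
          (print i ": " (lg:get-word linkage i))
          (loop (+ i 1))))
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```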

disjunct-str

[procedure] (disjunct-str linkage word-number)

Return a string showing the disjuncts that were actually used in association with the specified word in the current linkage. The string shows the disjuncts in proper order; that is, left-to-right, in the order in which they link to other words. The returned string can be thought of as a very precise part-of-speech-like label for the word, indicating how it was used in the given sentence; this can be useful for corpus statistics.

linkage
The specific linkage
word-number
The specific word

disjunct-cost

[procedure] disjunct-cost

Return the cost of a word as used in a particular linkage, based on the dictionary.

disjunct-corpus-score

[procedure] disjunct-corpus-score

Returns the cost based on the corpus-statistics database.

get-constituents

[procedure] (get-constituents linkage display-style)

Returns the constituents for a particular linkage

linkage
display-style
One of the display constants, such as display-multi-line (number)

get-diagram

[procedure] (get-diagram linkage display-walls? screen-width)

Returns the linkage diagram

linkage
display-walls?
A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
screen-width
The screen-width is an integer, indicating the number of columns that should be used during printing; long sentences that are wider than the number of columns will be automatically wrapped so that they always fit.

get-postscript

[procedure] (get-postscript linkage display-walls? print-ps-header?)

Returns the macros needed to print out the linkage in a postscript file.

linkage
display-walls?
A boolean that indicates whether or not the wall-words, and the connectors to them, should be printed
print-ps-header?
A boolean that indicates whether or not postscript header boilerplate should be included.

get-disjuncts

[procedure] (get-disjuncts linkage)

Returns a string that shows all of the disjuncts, and their costs, that were used to create the linkage.

linkage

Returns a string that lists all of the links and domain names for the linkage.

linkage

unused-word-cost

[procedure] (unused-word-cost linkage)

Should return the same value as sentence-null-count.

linkage

disjunct-cost

[procedure] (disjunct-cost linkage)

Should return the same value as sentence-disjunct-cost.

linkage

Should return the same value as sentence-link-cost.

linkage

corpus-cost

[procedure] (corpus-cost linkage)

Returns the total cost of this particular linkage, based on the cost of disjuncts stored in the corpus-statistics database.

linkage

linkage->eps-file

[procedure] (linkage->eps-file filename postscript)

Saves a linkage to a postscript file

path
filename
postscript
Postscript string
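
One way get-postscript and linkage->eps-file might fit together is sketched below. The file name "fox.eps" is hypothetical, and passing #t for print-ps-header? is an assumption: this page does not say whether linkage->eps-file expects the header to be present in the string it receives.

```scheme
;; CHICKEN 4 syntax; on CHICKEN 5 write (import (prefix link-grammar lg:)).
(use (prefix link-grammar lg:))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))

(when (> (lg:parse-sentence sentence opts) 0)
  (let ((linkage (lg:create-linkage 0 sentence opts)))
    (when linkage
      ;; First #t: draw the wall words.
      ;; Second #t: include the PostScript header (an assumption, see above).
      (lg:linkage->eps-file "fox.eps" (lg:get-postscript linkage #t #t))
      (lg:delete-linkage! linkage))))

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```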

get-version

[procedure] get-version

Gets link-grammar version

get-dictionary-version

[procedure] (get-dictionary-version dictionary)

Gets dictionary version

dictionary
Dictionary

get-dictionary-locale

[procedure] get-dictionary-locale

Gets dictionary locale

display-off

[constant] display-off

Turn off display

display-multi-line

[constant] display-multi-line

Print the constituent tree across multiple lines

display-bracket-tree

[constant] display-bracket-tree

Use brackets when printing the constituent tree

display-single-line

[constant] display-single-line

Print the constituent tree on a single line

display-max-styles

[constant] display-max-styles

The largest display-style value (MAX_STYLES in the upstream C API)

set-display-morphology!

[procedure] (set-display-morphology! parse-options value)

Sets display morphology in parse-options

parse-options
value
(number)

get-display-morphology

[procedure] (get-display-morphology parse-options)

Gets display morphology value

parse-options

Parse Options

Parse-options specify the different parameters that are used to parse sentences. Examples of the kinds of things that are controlled by parse-options include maximum parsing time and memory, whether to use null-links, and whether or not to use 'panic' mode. This data structure is passed in to the various parsing and printing routines along with the sentence.

The default values for the parse-options members are:

verbosity → 0

linkage-limit → 10000

min-null-count → 0

max-null-count → 0

null-block → 1

islands-ok → #f

short-length → 6

all-short → #f

display-short → #t

display-word-subscripts → #t

display-link-subscripts → #t

display-walls → #f

allow-null → #t

echo-on → #f

batch-mode → #f

panic-mode → #f

screen-width → 79

display-on → #t

display-postscript → #f

display-bad → #f

display-links → #f

init-opts

[procedure] (init-opts)

Initialise parse-options to default values

set-max-parse-time!

[procedure] (set-max-parse-time! parse-options value)

Set maximum parse time

parse-options
value
(number)

set-linkage-limit!

[procedure] (set-linkage-limit! parse-options linkage-limit)

Set linkage limit

parse-options
linkage-limit
(number)

set-short-length!

[procedure] (set-short-length! parse-options short-length)

The short-length parameter determines how long the links are allowed to be. The intended use of this is to speed up parsing by not considering very long links for most connectors, since they are very rarely used in a correct parse. An entry for UNLIMITED-CONNECTORS in the dictionary will specify which connectors are exempt from the length limit.

parse-options
short-length
(number)

set-disjunct-cost!

[procedure] (set-disjunct-cost! parse-options disjunct-cost)

Determines the maximum disjunct cost used during parsing, where the cost of a disjunct is equal to the maximum cost of all of its connectors. The default is that only disjuncts up to a cost of 2.9 are considered.

parse-options
disjunct-cost

set-min-null-count!

[procedure] (set-min-null-count! parse-options null-count)

When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.

parse-options
null-count

set-max-null-count!

[procedure] (set-max-null-count! parse-options null-count)

When parsing a sentence, the parser will find all solutions having the minimum number of null links. It carries out its search in the range of null link counts between min-null-count and max-null-count. By default, the minimum and maximum number of null links is 0, so null links are not used.

parse-options
null-count

reset-resources!

[procedure] (reset-resources! parse-options)

Reset acquired resources

parse-options

resources-exhausted?

[procedure] (resources-exhausted? parse-options)

Returns true if either memory-exhausted? or timer-expired? is true.

parse-options

memory-exhausted?

[procedure] (memory-exhausted? parse-options)

Checks whether the memory was exhausted during parsing

parse-options

timer-expired?

[procedure] (timer-expired? parse-options)

Checks whether the timer was exceeded during parsing.

parse-options
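
The resource predicates above could be used after a parse roughly as follows. This is a sketch only: the limit of 5 is arbitrary, and the unit of set-max-parse-time! is assumed to be seconds, which this page does not state.

```scheme
;; CHICKEN 4 syntax; on CHICKEN 5 write (import (prefix link-grammar lg:)).
(use (prefix link-grammar lg:))

(define dictionary (lg:create-default-dictionary))
(define opts (lg:init-opts))
;; Assumption: the value is a number of seconds.
(lg:set-max-parse-time! opts 5)

(define sentence
  (lg:create-sentence "The black fox ran from the hunters" dictionary))
(lg:parse-sentence sentence opts)

;; Report why the parse stopped, if a limit was hit.
(cond ((lg:timer-expired? opts)
       (print "parse stopped: time limit reached"))
      ((lg:memory-exhausted? opts)
       (print "parse stopped: memory limit reached"))
      (else
       (print (lg:linkages-found sentence) " linkages found")))

;; Reset the resource state before reusing opts for another sentence.
(lg:reset-resources! opts)

(lg:delete-sentence! sentence)
(lg:delete-parse-options! opts)
(lg:delete-dictionary! dictionary)
```

Checking timer-expired? and memory-exhausted? separately, rather than resources-exhausted? alone, makes it possible to tell which limit was reached.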

set-islands-ok!

[procedure] (set-islands-ok! parse-options islands-ok?)

This option determines whether or not 'islands' of links are allowed.

parse-options
islands-ok?
A boolean to indicate whether islands are allowed

set-verbosity!

[procedure] (set-verbosity! parse-options verbosity-level)

Sets the level of description printed to stderr/stdout about the parsing process.

parse-options
verbosity-level

get-verbosity

[procedure] (get-verbosity parse-options)

Get the verbosity level

parse-options

delete-parse-options!

[procedure] (delete-parse-options! parse-options)

Delete a parse-option object

parse-options

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

About this egg

Author

David Ireland

Repository

https://gitlab.com/maxwell79/chicken-link-grammar

License

LGPL-2.1

Dependencies

Versions

1.6

Colophon

Documented by hahn.
