LkbWishlist

JohnCarroll edited this page Nov 21, 2017 · 19 revisions

This page lists wishlist items for the LKB. Their presence here does not, of course, mean that anyone will implement them for you. In some cases there has been relevant discussion on the developers mailing list (e.g., on non-ASCII encodings).

  • Comments in TDL: it would aid grammar debugging if it were possible to comment out sections within a TDL definition.

  • Head-Daughter shown in Trees: it would be nice if the graphical display marked which daughter is the head daughter in headed constructions, either with a thicker branch to it or perhaps an arrow. An alternative would be to label the arcs (H for head, S for subject, etc.), but that would probably be overdoing things.

  • Some treatment of Capitali(z|s)ation

  • Normalization of Numbers, e.g. PLUS (CARG 20 CARG 3) => CARG 23

  • Robust Generation of Numbers, e.g. CARG 23 => PLUS (CARG 20 CARG 3)
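
The two number items above are inverses of each other. A minimal Python sketch of both directions follows, assuming a hypothetical tuple encoding of terms (`("CARG", n)` and `("PLUS", left, right)`) and integer CARG values; neither is the LKB's or the MRS's actual representation, and the decomposition only covers 0 to 99.

```python
# Hypothetical term encoding (not an LKB data structure):
#   ("CARG", 23)            a single numeric constant
#   ("PLUS", left, right)   the sum of two sub-terms

def _value(term):
    """Compute the integer denoted by a CARG leaf or a PLUS tree."""
    if term[0] == "CARG":
        return term[1]
    if term[0] == "PLUS":
        return _value(term[1]) + _value(term[2])
    raise ValueError(f"unsupported term: {term!r}")

def normalize(term):
    """Normalization: fold PLUS (CARG 20 CARG 3) into CARG 23."""
    return ("CARG", _value(term))

def decompose(term):
    """Generation direction: expand CARG 23 into PLUS (CARG 20 CARG 3).

    Numbers below 20 and exact multiples of ten are assumed to be
    expressible directly and are returned unchanged.
    """
    op, n = term
    if op != "CARG" or not 0 <= n <= 99:
        raise ValueError(f"only CARG terms in 0..99 are handled: {term!r}")
    tens, units = divmod(n, 10)
    if n < 20 or units == 0:
        return term
    return ("PLUS", ("CARG", tens * 10), ("CARG", units))
```

For example, `normalize(("PLUS", ("CARG", 20), ("CARG", 3)))` yields `("CARG", 23)`, and `decompose(("CARG", 23))` yields the PLUS term again.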

  • Redundancy Rule Check: define a conventional syntax in the comments to declare that a type is the combination of another type and a rule (or rules), and create a batch check to verify that they really are the same.

    • e.g.
generic_adj_te_infl-lex := generic-i-adj-lex &
"combine: i-adj-stem-lex + adj-te-t-lexeme-c-stem-infl-rule; I am the generic type for te inflected adjectives, e.g. 美しく"
[RMORPH-BIND-TYPE t-morph,
 SYNSEM.LOCAL.CAT.HEAD i-adj_head & [MARK < [LOCAL.CAT.HEAD.H-TENSE te] > ],
 J-NEEDS-AFFIX +].
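
As a rough illustration of the first half of such a check, the Python sketch below collects the `combine:` annotations from TDL text. It assumes the convention shown in the example above (`combine: A + B;` inside the documentation string) and that each definition ends with a period at the end of a line; the function name is hypothetical. Verifying that the combination really equals the annotated type would need the LKB itself to apply the rule and compare the resulting feature structures, which is not attempted here.

```python
import re

# Assumed convention: "combine: <type> + <rule> [+ <rule> ...];" in the docstring.
COMBINE_RE = re.compile(r"combine:\s*([^;]+);")
# Name of the type being defined, i.e. the identifier before ":=".
DEF_RE = re.compile(r"([\w-]+)\s*:=")

def find_combine_claims(tdl_text):
    """Map each annotated type name to the list [base type, rule, ...]."""
    claims = {}
    # Assumption: a definition ends with "." at the end of a line.
    for definition in re.split(r"\.[ \t]*(?:\n|\Z)", tdl_text):
        name = DEF_RE.search(definition)
        combine = COMBINE_RE.search(definition)
        if name and combine:
            parts = [p.strip() for p in combine.group(1).split("+")]
            claims[name.group(1)] = parts
    return claims

SAMPLE = '''generic_adj_te_infl-lex := generic-i-adj-lex &
"combine: i-adj-stem-lex + adj-te-t-lexeme-c-stem-infl-rule; I am the generic type for te inflected adjectives, e.g. 美しく"
[RMORPH-BIND-TYPE t-morph,
 SYNSEM.LOCAL.CAT.HEAD i-adj_head & [MARK < [LOCAL.CAT.HEAD.H-TENSE te] > ],
 J-NEEDS-AFFIX +].
'''
```

Calling `find_combine_claims(SAMPLE)` gives one entry for `generic_adj_te_infl-lex`, listing the base type and the rule that a batch check would then have to combine and compare.
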
  • Proper support in the display/entry for non-ASCII encodings

    • entering text in the parse window
    • sentence display in the window-name
    • correct display in trees/feature structures
    • currently there is some support on Linux with Trollet, but it doesn't work on Windows for some (most?) encodings

    These points are (mostly) implemented in LKB-FOS, although some work is required to pre-select the Unicode fonts to be used.

  • Support for multibyte encodings in the error messages: errors are currently reported by byte position; character position (or line number) would be more useful. Unfortunately, the LKB code for reading type files is structured in a way that makes it difficult to keep track of character positions. The best approach would probably be to define a wrapper class for character input streams using the Gray streams API.
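
The wrapper idea can be sketched independently of Lisp. The Python class below is only an analogue of what a Gray streams wrapper might do, not LKB code: it wraps a character stream and keeps a running character offset, line and column, so an error message could report those instead of a byte position. The class and attribute names are hypothetical.

```python
import io

class PositionTrackingReader:
    """Wrap a text stream and track the position of the last character read."""

    def __init__(self, stream):
        self._stream = stream
        self.char_offset = 0  # number of characters consumed so far
        self.line = 1         # 1-based line of the next character
        self.column = 0       # characters consumed on the current line

    def read_char(self):
        """Read one character; return "" at end of input."""
        ch = self._stream.read(1)
        if ch == "":
            return ""
        self.char_offset += 1
        if ch == "\n":
            self.line += 1
            self.column = 0
        else:
            self.column += 1
        return ch

# Usage: read five characters of text that starts with multibyte characters.
reader = PositionTrackingReader(io.StringIO("美しく\nab"))
for _ in range(5):
    reader.read_char()
```

After these five reads the reader is at character offset 5 (line 2, column 1), whereas the same point lies 11 bytes into the UTF-8 encoding of the text, which is the kind of mismatch the item describes.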

  • Linear precedence constraints: see LkbLpconstraints for a discussion document.

  • Control over font size in show-gen-output windows (for demo purposes).

  • Filter in the generator that blocks the application of lexical rules which add constraints (e.g., to PNG) that are incompatible with the input. (This should keep edges corresponding to verbs inflected for the "wrong" agreement from overpopulating the generator chart.) This turned out to be particularly problematic in Zulu, which inflects for both objects and subjects and makes a roughly 18-way distinction in each case.
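
A minimal sketch of such a filter, in Python, is given here to make the idea concrete. It assumes a strong simplification: PNG constraints are flat dictionaries of atomic values, and two sets of constraints are compatible when they never assign different values to the same feature. The LKB would instead use unification over typed feature structures, and the rule names and features below are invented for illustration.

```python
def compatible(constraints, target):
    """True if no feature is given different values by the two structures."""
    return all(target.get(feat, val) == val for feat, val in constraints.items())

def filter_lex_rules(rules, input_png):
    """Keep only the rules whose PNG constraints fit the generator input."""
    return [name for name, png in rules.items() if compatible(png, input_png)]

# Hypothetical rule inventory: rule name -> PNG constraints the rule adds.
RULES = {
    "subj-3sg-rule": {"PER": "3", "NUM": "sg"},
    "subj-1pl-rule": {"PER": "1", "NUM": "pl"},
    "uninflected-rule": {},
}

# PNG of the relevant index in the generator input (also hypothetical).
INPUT_PNG = {"PER": "3", "NUM": "sg"}

allowed = filter_lex_rules(RULES, INPUT_PNG)
```

Here `allowed` keeps `subj-3sg-rule` and `uninflected-rule` and drops `subj-1pl-rule`, so no edge for the "wrong" agreement would be built in the first place.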
