Some ideas

  • remove "click here" links;
  • handle more punctuation symbols as features rather than as tokens, so that there are more token-level chains (only the comma is handled this way now);
  • introduce more features (the current ones are quite basic);
  • switch to another CRF implementation (Wapiti is hard to use as a library).
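
The punctuation idea above could be sketched roughly as follows. This is a minimal illustration, not the project's actual tokenizer or feature extraction code: the function name, the `PUNCT_FEATURES` mapping, and the `followed_by_*` feature names are all hypothetical. Selected punctuation marks are not emitted as tokens; instead they set a feature on the preceding token.

```python
import re

# Hypothetical mapping: punctuation marks to fold into features.
PUNCT_FEATURES = {",": "comma", ";": "semicolon", ":": "colon", ".": "period"}

# A word, or a single non-word, non-space character.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokens_with_punct_features(text):
    """Return (token, features) pairs; listed punctuation becomes a feature
    of the preceding token instead of a token of its own."""
    result = []
    for piece in _TOKEN_RE.findall(text):
        name = PUNCT_FEATURES.get(piece)
        if name is None:
            # Ordinary token (or punctuation we do not handle): keep it.
            result.append((piece, {}))
        elif result:
            # Handled punctuation: mark the previous token, emit nothing.
            result[-1][1]["followed_by_" + name] = True
        # Handled punctuation with no preceding token is simply dropped.
    return result


if __name__ == "__main__":
    for token, features in tokens_with_punct_features("Paris, France; 2014"):
        print(token, features)
```

With this approach the token sequence contains only `Paris`, `France`, and `2014`, and the comma and semicolon survive as features on the tokens they follow.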