Java implementation of the Aho-Corasick efficient string matching state machine
Mensa is a generic, flexible, enhanced, and efficient free software Java implementation of a pattern matching state machine, as described in the 1975 paper by Alfred V. Aho and Margaret J. Corasick: _Efficient string matching: An aid to bibliographic search_. This implementation is
- generic in that it can match any type of symbol, as defined by the Java generic type parameter S (for example, it is possible to create a machine to match bytes, characters, integers, gene sequences, bit sequences, etc.);
- flexible in that the architecture allows for granular extension, customization, or replacement of framework components;
- enhanced in that it supports a number of useful extensions not addressed in the original paper, such as whole-word matching, case-sensitivity controls, fuzzy whitespace matching, fuzzy punctuation matching, incremental matching (i.e., iterators), matching event listeners, etc.; and
- efficient in that it performs well in terms of both time and resource usage on very large keyword sets (on the order of a million terms).
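To make the idea of a symbol-generic matching machine concrete, here is a minimal sketch of the Aho-Corasick construction from the 1975 paper, parameterized over a symbol type S. It is an illustration only and does not reproduce Mensa's classes or API: the class name `AhoCorasickSketch` and its methods `add`, `build`, and `match` are hypothetical, and none of the extensions listed above (whole-word matching, case controls, listeners, iterators) are included.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical illustration class; NOT part of the Mensa API. */
final class AhoCorasickSketch<S> {

    /** One machine state: goto edges, a failure link, and the keywords that end here. */
    private static final class Node<S> {
        final Map<S, Node<S>> next = new HashMap<>();
        Node<S> fail;
        final List<List<S>> outputs = new ArrayList<>();
    }

    private final Node<S> root = new Node<>();

    /** Adds one non-empty keyword to the trie (the goto function). Call before build(). */
    public void add(List<S> keyword) {
        Node<S> node = root;
        for (S symbol : keyword) {
            node = node.next.computeIfAbsent(symbol, k -> new Node<>());
        }
        node.outputs.add(keyword);
    }

    /** Computes failure links breadth-first and merges outputs along them. */
    public void build() {
        Queue<Node<S>> queue = new ArrayDeque<>();
        for (Node<S> child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node<S> node = queue.remove();
            for (Map.Entry<S, Node<S>> edge : node.next.entrySet()) {
                Node<S> child = edge.getValue();
                // Follow the parent's failure chain until a state has an edge for this symbol.
                Node<S> f = node.fail;
                while (f != root && !f.next.containsKey(edge.getKey())) {
                    f = f.fail;
                }
                Node<S> target = f.next.get(edge.getKey());
                child.fail = (target != null) ? target : root;
                // Keywords ending at the failure state also end at this state.
                child.outputs.addAll(child.fail.outputs);
                queue.add(child);
            }
        }
    }

    /** Scans the input once; reports "endIndex:keyword" for every occurrence found. */
    public List<String> match(List<S> input) {
        List<String> hits = new ArrayList<>();
        Node<S> state = root;
        for (int i = 0; i < input.size(); i++) {
            S symbol = input.get(i);
            while (state != root && !state.next.containsKey(symbol)) {
                state = state.fail;
            }
            state = state.next.getOrDefault(symbol, root);
            for (List<S> keyword : state.outputs) {
                hits.add(i + ":" + keyword);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Integers as symbols, to show that nothing here is specific to characters.
        AhoCorasickSketch<Integer> machine = new AhoCorasickSketch<>();
        machine.add(Arrays.asList(1, 2));
        machine.add(Arrays.asList(2, 3));
        machine.build();
        System.out.println(machine.match(Arrays.asList(1, 2, 3)));
        // prints [1:[1, 2], 2:[2, 3]]
    }
}
```

Because edges are stored in a `HashMap` keyed by S, any symbol type with sensible `equals` and `hashCode` works in this sketch. A production implementation aimed at million-term keyword sets would likely choose more compact state representations; that is a design assumption here, not a description of how Mensa is built.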
16 October 2015
I noticed that the last line of the Apache 2.0 license is missing from the version of the license you are using.
Leaders and contributors
| F. Andy Seidl | Author |
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.
The copyright and license notices on this page apply only to the text on this page. Any software, copyright licenses, or other similar notices described in this text have their own copyright notices and licenses, which can usually be found in the distribution or license text itself.