Quantcast
Channel: Derek Illchuk - Computations » data structures
Viewing all articles
Browse latest Browse all 7

Aho-Corasick String Matching Dictionary

$
0
0
This is an implementation of the Aho-Corasick String Matching Dictionary (Wikipedia).

This algorithm lets you search for (all instances of) multiple substrings in a single pass through the haystack. This is done by first compiling the substrings into a finite-state machine (tree) which is fed each haystack character in turn, adjusts its (specific) state, and reports any matches. This runs in O(haystack_size + reported_size).

The implementation is in git and below:


Viewing all articles
Browse latest Browse all 7

Trending Articles