Fuzzy Matching Code in T-SQL
Use a dictionary of words or other strings that you would like to fuzzy match.
For this example I used https://github.com/zeisler/scrabble/blob/master/db/dictionary.csv
Run the SQL scripts in this order to set up:
- F_Levenshtein.sql - altered version of https://blog.softwx.net/2014/12/optimizing-levenshtein-algorithm-in-tsql.html
- BK_Tree.sql
- P_BK_Tree_Insert.sql
- BK_Tree_Triggers.sql
- P_BK_Tree_Search.sql
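For readers unfamiliar with the metric behind F_Levenshtein.sql, here is a minimal sketch of plain Levenshtein (edit) distance. It is written in Python rather than T-SQL so it can be run without a database, and it is not the code in F_Levenshtein.sql or the optimized version from the linked blog post; the function name `levenshtein` is only illustrative.

```python
def levenshtein(s: str, t: str) -> int:
    """Number of single-character insertions, deletions and
    substitutions needed to turn s into t."""
    if len(s) < len(t):
        s, t = t, s  # keep the rows as short as possible

    # previous[j] = distance between the processed prefix of s and t[:j]
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]  # distance from s[:i] to the empty string
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(
                previous[j] + 1,         # delete cs
                current[j - 1] + 1,      # insert ct
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the distance matrix are kept at a time, so memory use is proportional to the shorter string.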
To initialize the tree, run a normal insert (the insert trigger takes care of the P_BK_Tree_Insert calls), such as:
INSERT INTO BK_Tree (word) SELECT word FROM Dictionary
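BK-Tree insert: building the tree takes one insert per word, so load time grows at least linearly with the dictionary size N (each insert also walks one path down the tree). The initial load took ~2 hours for the 172k-word dictionary.
BK-Tree search: roughly O(log N) for small distance tolerances, because most subtrees are pruned; larger tolerances visit more of the tree.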
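To remove a word from the tree, deactivate it (a soft delete; the row stays in the table) by running the following, replacing 'word' with the word to remove: UPDATE BK_Tree SET active = 0 WHERE word = 'word'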
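The insert, search and soft-delete behavior described above can be sketched in memory as follows. This is an illustration in Python, not a translation of BK_Tree.sql, P_BK_Tree_Insert.sql or P_BK_Tree_Search.sql: the class and method names are hypothetical, and ignoring duplicate words on insert is an assumption. Only the `word` and `active` fields mirror columns named in this README.

```python
from dataclasses import dataclass, field


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


@dataclass
class Node:
    word: str
    active: bool = True                           # mirrors the active flag
    children: dict = field(default_factory=dict)  # distance -> child Node


class BKTree:
    def __init__(self):
        self.root = None

    def insert(self, word: str) -> None:
        if self.root is None:
            self.root = Node(word)
            return
        node = self.root
        while True:
            d = edit_distance(word, node.word)
            if d == 0:
                return  # assumption: duplicates are ignored
            child = node.children.get(d)
            if child is None:
                node.children[d] = Node(word)  # new leaf on the edge labeled d
                return
            node = child

    def search(self, query: str, tolerance: int) -> list:
        """Return sorted (distance, word) pairs within tolerance of query."""
        results = []
        stack = [self.root] if self.root else []
        while stack:
            node = stack.pop()
            d = edit_distance(query, node.word)
            if d <= tolerance and node.active:
                results.append((d, node.word))
            # Triangle inequality: only edges labeled within
            # [d - tolerance, d + tolerance] can lead to matches.
            for label, child in node.children.items():
                if d - tolerance <= label <= d + tolerance:
                    stack.append(child)
        return sorted(results)

    def deactivate(self, word: str) -> bool:
        """Soft delete: the node stays in place so its subtree is still reachable."""
        node = self.root
        while node is not None:
            d = edit_distance(word, node.word)
            if d == 0:
                node.active = False
                return True
            node = node.children.get(d)
        return False


if __name__ == "__main__":
    tree = BKTree()
    for w in ["book", "books", "cake", "boo", "cape", "cart", "boon", "cook"]:
        tree.insert(w)
    print(tree.search("bok", 1))
    tree.deactivate("boo")
    print(tree.search("bok", 1))
```

The sketch also shows why removal is a flag rather than a row delete: a deactivated node still routes searches to its children, so no part of the tree has to be rebuilt.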
Example outputs and runtimes of P_BK_Tree_Search on the Scrabble words dictionary: