The library(isub) implements a similarity measure
between strings, comparable in purpose to the Levenshtein distance.
The measure is based on the length of the substrings the two strings have in common.
?- isub('E56.Language', 'languange', D, [normalize(true)]).
D = 0.4226950354609929. % [-1,1] range
?- isub('E56.Language', 'languange', D, [normalize(true),zero_to_one(true)]).
D = 0.7113475177304964. % [0,1] range
?- isub('E56.Language', 'languange', D, []). % without normalization
D = 0.19047619047619047. % [-1,1] range
?- isub(aa, aa, D, []). % does not work for short substrings
D = -0.8.
?- isub(aa, aa, D, [substring_threshold(0)]). % works with short substrings
D = 1.0. % but may give unwanted values
% between e.g. 'store' and 'spore'.
?- isub(joe, hoe, D, [substring_threshold(0)]).
D = 0.5315315315315314.
?- isub(joe, hoe, D, []).
D = -1.0.
This version of isub/4 replaces the old version while remaining backwards compatible. It accepts several options to tweak the algorithm.
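The two normalized example queries above return values that are consistent with a simple linear rescaling between the [-1,1] and [0,1] ranges. The sketch below illustrates that relationship. The predicates rescale_similarity/2 and check_rescaling/0 are hypothetical helpers written for this illustration and are not part of library(isub), and the linear mapping is an assumption inferred from the example values rather than a statement about how the library computes the result.

```prolog
% Hypothetical helper, not part of library(isub): map a similarity
% from the [-1,1] range onto the [0,1] range by linear rescaling.
% Assumption: this mirrors the effect of zero_to_one(true), as the
% example values suggest.
rescale_similarity(S, S01) :-
    S01 is (S + 1) / 2.

% Compare the rescaled [-1,1] example value with the [0,1] example
% value, allowing for floating point rounding.
check_rescaling :-
    rescale_similarity(0.4226950354609929, S01),
    abs(S01 - 0.7113475177304964) < 1.0e-12.
```

Under this assumption, a caller that needs both ranges can compute one of them with isub/4 and derive the other arithmetically instead of calling the predicate twice.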
| Text1 | and Text2 are each an atom, a string, or a list of characters or character codes. |
| Similarity | is a float in the range [-1,1], where 1.0 means most similar. The range can be set to [0,1] with the zero_to_one option described below. |
| Options | is a list with elements described
below. Note that the options are processed at compile time using
goal_expansion, which makes calls considerably faster. Supported options are:
|