Class Cosine
java.lang.Object
info.debatty.java.stringsimilarity.ShingleBased
info.debatty.java.stringsimilarity.Cosine
- All Implemented Interfaces:
NormalizedStringDistance, NormalizedStringSimilarity, StringDistance, StringSimilarity, Serializable
@Immutable
public class Cosine
extends ShingleBased
implements NormalizedStringDistance, NormalizedStringSimilarity
The similarity between two strings is the cosine of the angle between
their vector representations (counts of k-shingles). It is computed as V1 . V2 / (|V1| * |V2|).
The cosine distance is computed as 1 - cosine similarity.
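The formula above can be sketched in plain Java as follows. This is a minimal illustration and not the library's source: the class name CosineSketch, the use of Map<String, Integer> as the vector representation, and the choice to return 0 for an empty profile are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the library.
final class CosineSketch {

    // Dot product V1 . V2: only shingles present in both profiles contribute.
    static double dotProduct(Map<String, Integer> p1, Map<String, Integer> p2) {
        // Iterate over the smaller profile to limit lookups.
        Map<String, Integer> small = p1.size() <= p2.size() ? p1 : p2;
        Map<String, Integer> large = (small == p1) ? p2 : p1;
        double sum = 0.0;
        for (Map.Entry<String, Integer> e : small.entrySet()) {
            Integer other = large.get(e.getKey());
            if (other != null) {
                sum += 1.0 * e.getValue() * other;
            }
        }
        return sum;
    }

    // L2 norm: sqrt(Sum_i(v_i^2)).
    static double norm(Map<String, Integer> profile) {
        double sum = 0.0;
        for (int count : profile.values()) {
            sum += 1.0 * count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: V1 . V2 / (|V1| * |V2|).
    static double similarity(Map<String, Integer> p1, Map<String, Integer> p2) {
        double denominator = norm(p1) * norm(p2);
        if (denominator == 0.0) {
            return 0.0; // Assumption: an empty profile is similar to nothing.
        }
        return dotProduct(p1, p2) / denominator;
    }

    // Cosine distance: 1 - cosine similarity.
    static double distance(Map<String, Integer> p1, Map<String, Integer> p2) {
        return 1.0 - similarity(p1, p2);
    }

    public static void main(String[] args) {
        Map<String, Integer> v1 = new HashMap<>();
        v1.put("ab", 1);
        v1.put("bc", 1);
        Map<String, Integer> v2 = new HashMap<>();
        v2.put("ab", 1);
        System.out.println(similarity(v1, v2));
        System.out.println(distance(v1, v2));
    }
}
```

Because shingle counts are never negative, the dot product is never negative either, which is why the similarity stays within [0, 1] rather than [-1, 1].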
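Constructor Summary
Constructors
Constructor    Description
Cosine()       Implements Cosine Similarity between strings.
Cosine(int k)  Implements Cosine Similarity between strings.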
Method Summary
Modifier and Type      Method                            Description
final double           distance(String s1, String s2)    Return 1.0 - similarity.
private static double  dotProduct
private static double  norm                              Compute the L2 norm: sqrt(Sum_i(v_i²)).
final double           similarity(String s1, String s2)  Compute the cosine similarity between strings.
final double           similarity                        Compute similarity between precomputed profiles.
Methods inherited from class ShingleBased
getK, getProfile
-
Constructor Details
-
Cosine
public Cosine(int k)
Implements Cosine Similarity between strings. The strings are first transformed into vectors of occurrences of k-shingles (sequences of k characters). In this n-dimensional space, the similarity between the two strings is the cosine of the angle between their respective vectors.
- Parameters:
k - The length of the shingles (sequences of k characters).
-
Cosine
public Cosine()
Implements Cosine Similarity between strings. The strings are first transformed into vectors of occurrences of k-shingles (sequences of k characters). In this n-dimensional space, the similarity between the two strings is the cosine of the angle between their respective vectors. Default k is 3.
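To illustrate the transformation into k-shingle occurrence vectors described for the constructors, here is a minimal sketch. The class ShingleProfileSketch and its profile method are hypothetical names for this example; the library's own getProfile may preprocess its input differently (for instance, how it treats whitespace is not covered here).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the library.
final class ShingleProfileSketch {

    // Count every overlapping substring of length k in s.
    static Map<String, Integer> profile(String s, int k) {
        if (s == null) {
            throw new NullPointerException("s must not be null");
        }
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= s.length(); i++) {
            // merge() inserts 1 for a new shingle or increments an existing count.
            counts.merge(s.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // With k = 2, "abab" contains the shingles "ab", "ba", "ab".
        System.out.println(profile("abab", 2));
    }
}
```

In this sketch a string shorter than k yields an empty profile, so a larger k makes short inputs harder to compare.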
-
-
Method Details
-
similarity
Compute the cosine similarity between strings.
- Specified by:
similarity in interface StringSimilarity
- Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
- Returns:
The cosine similarity in the range [0, 1]
- Throws:
NullPointerException - if s1 or s2 is null.
-
norm
-
dotProduct
-
distance
Return 1.0 - similarity.
- Specified by:
distance in interface StringDistance
- Parameters:
s1 - The first string to compare.
s2 - The second string to compare.
- Returns:
1.0 minus the cosine similarity; the result is in the range [0, 1]
- Throws:
NullPointerException - if s1 or s2 is null.
-
similarity
-