edu.umass.cs.mallet.base.types
Class StringKernel

java.lang.Object
  extended by java.util.AbstractMap
      extended by java.util.HashMap
          extended by java.util.LinkedHashMap
              extended by edu.umass.cs.mallet.base.types.StringKernel
All Implemented Interfaces:
java.lang.Cloneable, java.util.Map, java.io.Serializable

public class StringKernel
extends java.util.LinkedHashMap

Computes a similarity metric between two strings, based on counts of common subsequences of characters. See Lodhi et al., "Text Classification using String Kernels." Optionally caches previous kernel computations.
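
The subsequence counting described above can be illustrated with a minimal sketch of the dynamic program from Lodhi et al. This is not the MALLET source: the class name SubsequenceKernelSketch and the method kernel are hypothetical, and the sketch assumes a single fixed subsequence length n, which may differ from how this class treats its length parameter.

```java
import java.util.Arrays;

/** Illustrative sketch only; this is not the MALLET source. */
final class SubsequenceKernelSketch {

    /**
     * Unnormalized kernel over common subsequences of exactly n characters.
     * Each common subsequence is weighted by lam raised to the total span
     * it covers in s and in t, so gappy matches count for less.
     */
    static double kernel(String s, String t, int n, double lam) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        int ls = s.length();
        int lt = t.length();
        if (Math.min(ls, lt) < n) {
            return 0.0; // no subsequence of length n fits in the shorter string
        }

        // kp[l][i][j] is the auxiliary value K'_l for prefixes s[0..i) and t[0..j).
        double[][][] kp = new double[n][ls + 1][lt + 1];
        for (double[] row : kp[0]) {
            Arrays.fill(row, 1.0); // K'_0 is 1 for every pair of prefixes
        }

        for (int l = 1; l < n; l++) {
            for (int i = 1; i <= ls; i++) {
                double kpp = 0.0; // running K''_l along t for the current prefix of s
                for (int j = 1; j <= lt; j++) {
                    kpp = lam * kpp;
                    if (s.charAt(i - 1) == t.charAt(j - 1)) {
                        kpp += lam * lam * kp[l - 1][i - 1][j - 1];
                    }
                    kp[l][i][j] = lam * kp[l][i - 1][j] + kpp;
                }
            }
        }

        // Close each subsequence with a final matching character pair.
        double k = 0.0;
        for (int i = 1; i <= ls; i++) {
            for (int j = 1; j <= lt; j++) {
                if (s.charAt(i - 1) == t.charAt(j - 1)) {
                    k += lam * lam * kp[n - 1][i - 1][j - 1];
                }
            }
        }
        return k;
    }

    public static void main(String[] args) {
        System.out.println(kernel("cat", "car", 2, 0.5)); // prints 0.0625
    }
}
```

With lam = 0.5 and n = 2, "cat" and "car" share only the subsequence "ca", which spans two characters in each string, so the value is lam^4 = 0.0625.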

See Also:
Serialized Form

Constructor Summary
StringKernel()
           
StringKernel(boolean norm, double lam, int length)
           
StringKernel(boolean norm, double lam, int length, boolean cache)
           
 
Method Summary
 double K(java.lang.String s, java.lang.String t)
          Computes the normalized string kernel between two strings.
static void main(java.lang.String[] args)
          Command-line entry point; computes the string kernel between two strings.
 
Methods inherited from class java.util.LinkedHashMap
clear, containsValue, get, removeEldestEntry
 
Methods inherited from class java.util.HashMap
clone, containsKey, entrySet, isEmpty, keySet, put, putAll, remove, size, values
 
Methods inherited from class java.util.AbstractMap
equals, hashCode, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.Map
equals, hashCode
 

Constructor Detail

StringKernel

public StringKernel(boolean norm,
                    double lam,
                    int length,
                    boolean cache)
Parameters:
norm - true to lowercase all strings before comparing them
lam - penalty between 0 and 1 for gaps between matches
length - maximum length of the subsequences to compare
cache - true to cache previous kernel computations (recommended)
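
To show what the norm and cache options amount to, here is a hedged sketch of a memoizing wrapper around an arbitrary kernel function. The class CachedKernelSketch, its key scheme, and the misses counter are assumptions made for illustration; this page does not document how StringKernel stores its cached values.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical caching wrapper; the key scheme is an assumption, not MALLET's. */
final class CachedKernelSketch {
    private final Map<String, Double> cache = new LinkedHashMap<>();
    private final ToDoubleBiFunction<String, String> kernel;
    private final boolean lowercase;
    int misses = 0; // counts how often the underlying kernel was evaluated

    CachedKernelSketch(ToDoubleBiFunction<String, String> kernel, boolean lowercase) {
        this.kernel = kernel;
        this.lowercase = lowercase;
    }

    double value(String s, String t) {
        if (lowercase) {
            s = s.toLowerCase(Locale.ROOT);
            t = t.toLowerCase(Locale.ROOT);
        }
        // A kernel is symmetric, so order the pair to share one cache entry.
        String first = s.compareTo(t) <= 0 ? s : t;
        String second = (first == s) ? t : s;
        // The length prefix keeps the concatenated key unambiguous.
        String key = first.length() + ":" + first + second;

        Double hit = cache.get(key);
        if (hit != null) {
            return hit;
        }
        misses++;
        double v = kernel.applyAsDouble(s, t);
        cache.put(key, v);
        return v;
    }
}
```

Caching pays off when the same pairs recur, for example when a kernel matrix over a fixed set of strings is filled in more than once.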

StringKernel

public StringKernel()

StringKernel

public StringKernel(boolean norm,
                    double lam,
                    int length)
Method Detail

K

public double K(java.lang.String s,
                java.lang.String t)
Computes the normalized string kernel between two strings.

Parameters:
s - the first string
t - the second string
Returns:
a value between 0 and 1, where 1 indicates an exact match.
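
A value in this range is what the usual kernel normalization K(s,t) / sqrt(K(s,s) * K(t,t)) produces, as used by Lodhi et al. The sketch below assumes that formula; this page does not confirm it is exactly what K computes. The names NormalizedKernelSketch, normalize, and matchingPairs are hypothetical, and matchingPairs is a toy kernel standing in for the string kernel.

```java
import java.util.function.ToDoubleBiFunction;

/** Illustrative sketch of kernel normalization; not the MALLET source. */
final class NormalizedKernelSketch {

    /** Returns k(s,t) / sqrt(k(s,s) * k(t,t)), or 0 when either self-kernel is 0. */
    static double normalize(ToDoubleBiFunction<String, String> k, String s, String t) {
        double kst = k.applyAsDouble(s, t);
        double kss = k.applyAsDouble(s, s);
        double ktt = k.applyAsDouble(t, t);
        if (kss == 0.0 || ktt == 0.0) {
            return 0.0; // avoid dividing by zero, e.g. for empty strings
        }
        return kst / Math.sqrt(kss * ktt);
    }

    /** Toy kernel for the demo: number of index pairs (i, j) with s[i] == t[j]. */
    static double matchingPairs(String s, String t) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            for (int j = 0; j < t.length(); j++) {
                if (s.charAt(i) == t.charAt(j)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        ToDoubleBiFunction<String, String> k = NormalizedKernelSketch::matchingPairs;
        System.out.println(normalize(k, "abc", "abc")); // prints 1.0
        System.out.println(normalize(k, "ab", "ac"));   // prints 0.5
    }
}
```

Dividing by the two self-kernel values removes the effect of string length, which is why identical strings always score 1.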

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Command-line entry point; computes the string kernel between two strings.

Throws:
java.lang.Exception