edu.umass.cs.mallet.base.pipe
Class AddClassifierTokenPredictions

java.lang.Object
  extended by edu.umass.cs.mallet.base.pipe.Pipe
      extended by edu.umass.cs.mallet.base.pipe.AddClassifierTokenPredictions
All Implemented Interfaces:
java.io.Serializable

public class AddClassifierTokenPredictions
extends Pipe
implements java.io.Serializable

This pipe uses a Classifier to label each token independently (i.e., under a zeroth-order Markov assumption), then adds the predictions to each token as features. It assumes that the input Instance's data is a FeatureVectorSequence whose elements are augmentable feature vectors. Example usage:

 		1) Create and serialize a featurePipe that converts raw input to FeatureVectorSequences
 		2) Pipe input data through featurePipe, train TokenClassifiers via cross-validation, then serialize the classifiers
 		3) Pipe input data through featurePipe and this pipe (using the saved classifiers), and train a Transducer
 		4) Serialize the trained Transducer
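
The core idea of this pipe, labeling each token on its own and appending the prediction as an extra feature, can be illustrated with a minimal, self-contained sketch. It does not use MALLET: the class TokenPredictionSketch, the TokenClassifier interface, and the "TOK_PRED=" feature-name prefix are all hypothetical stand-ins chosen for illustration, and tokens are modeled as plain sets of feature names rather than feature vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative stand-in only; none of these types are part of MALLET. */
public class TokenPredictionSketch {

    /** Hypothetical per-token classifier: maps one token's features to a label. */
    public interface TokenClassifier {
        String classify(Set<String> features);
    }

    /**
     * Labels every token independently of its neighbours (zeroth-order),
     * then returns copies of the tokens with the predicted label added
     * as one more feature. The input sequence is left unmodified.
     */
    public static List<Set<String>> addPredictions(List<Set<String>> sequence,
                                                   TokenClassifier classifier) {
        List<Set<String>> augmented = new ArrayList<>();
        for (Set<String> tokenFeatures : sequence) {
            String label = classifier.classify(tokenFeatures);
            Set<String> copy = new TreeSet<>(tokenFeatures);
            // The feature-name format is an assumption made for this sketch.
            copy.add("TOK_PRED=" + label);
            augmented.add(copy);
        }
        return augmented;
    }

    public static void main(String[] args) {
        // Toy classifier: capitalized tokens are names, everything else is "O".
        TokenClassifier toy =
            features -> features.contains("CAPITALIZED") ? "NAME" : "O";

        List<Set<String>> sentence = new ArrayList<>();
        sentence.add(new TreeSet<>(Arrays.asList("W=john", "CAPITALIZED")));
        sentence.add(new TreeSet<>(Arrays.asList("W=runs")));

        System.out.println(addPredictions(sentence, toy));
    }
}
```

A sequence model trained on the augmented tokens can then use the per-token prediction as one more piece of evidence, which is the role the Transducer plays in the usage steps above.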
 

See Also:
Serialized Form

Nested Class Summary
static class AddClassifierTokenPredictions.TokenClassifiers
          This inner class represents the trained token classifiers.
 
Constructor Summary
AddClassifierTokenPredictions(AddClassifierTokenPredictions.TokenClassifiers tokenClassifiers, int[] predRanks2add, boolean binary, InstanceList testList)
           
AddClassifierTokenPredictions(InstanceList trainList)
           
AddClassifierTokenPredictions(InstanceList trainList, InstanceList testList)
           
 
Method Summary
static InstanceList convert(InstanceList ilist, Noop alphabetsPipe)
          Converts each instance containing a FeatureVectorSequence to multiple instances, each containing an AugmentableFeatureVector as data.
static InstanceList convert(Instance inst, Noop alphabetsPipe)
           
 Alphabet getDataAlphabet()
           
 boolean getInProduction()
           
 Instance pipe(Instance carrier)
          Add the token classifier's predictions as features to the instance.
 void setInProduction(boolean inProduction)
           
static void setInProduction(Pipe p, boolean value)
           
 
Methods inherited from class edu.umass.cs.mallet.base.pipe.Pipe
getInstanceId, getParent, getParentRoot, getTargetAlphabet, isDataAlphabetSet, isTargetProcessing, pipe, readResolve, resolveDataAlphabet, resolveTargetAlphabet, setDataAlphabet, setParent, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AddClassifierTokenPredictions

public AddClassifierTokenPredictions(InstanceList trainList)

AddClassifierTokenPredictions

public AddClassifierTokenPredictions(InstanceList trainList,
                                     InstanceList testList)

AddClassifierTokenPredictions

public AddClassifierTokenPredictions(AddClassifierTokenPredictions.TokenClassifiers tokenClassifiers,
                                     int[] predRanks2add,
                                     boolean binary,
                                     InstanceList testList)
Method Detail

setInProduction

public void setInProduction(boolean inProduction)

getInProduction

public boolean getInProduction()

setInProduction

public static void setInProduction(Pipe p,
                                   boolean value)

getDataAlphabet

public Alphabet getDataAlphabet()
Overrides:
getDataAlphabet in class Pipe

pipe

public Instance pipe(Instance carrier)
Add the token classifier's predictions as features to the instance. This method assumes the input instance contains a FeatureVectorSequence as its data.

Specified by:
pipe in class Pipe
Parameters:
carrier - Instance to be processed.

convert

public static InstanceList convert(InstanceList ilist,
                                   Noop alphabetsPipe)
Converts each instance containing a FeatureVectorSequence to multiple instances, each containing an AugmentableFeatureVector as data.

Parameters:
ilist - Instances with FeatureVectorSequence as data field
alphabetsPipe - a Noop pipe containing the data and target alphabets for the resulting InstanceList
Returns:
an InstanceList where each Instance contains one Token's AugmentableFeatureVector as data
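
The flattening that this method describes, one sequence-level instance becoming one instance per token, can be sketched without MALLET as follows. The classes SequenceInstance and TokenInstance are hypothetical stand-ins, feature vectors are modeled as plain double arrays, and the sketch assumes every token carries its own label; how the real method handles targets and alphabets is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in only; none of these types are part of MALLET. */
public class FlattenSequencesSketch {

    /** Hypothetical sequence-level instance: one feature array and one label per token. */
    public static final class SequenceInstance {
        public final List<double[]> tokenFeatures;
        public final List<String> tokenLabels;

        public SequenceInstance(List<double[]> tokenFeatures, List<String> tokenLabels) {
            if (tokenFeatures.size() != tokenLabels.size()) {
                throw new IllegalArgumentException("one label is required per token");
            }
            this.tokenFeatures = tokenFeatures;
            this.tokenLabels = tokenLabels;
        }
    }

    /** Hypothetical token-level instance: a single token's features and label. */
    public static final class TokenInstance {
        public final double[] features;
        public final String label;

        public TokenInstance(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    /** Emits one TokenInstance per token, preserving sequence and token order. */
    public static List<TokenInstance> flatten(List<SequenceInstance> sequences) {
        List<TokenInstance> tokens = new ArrayList<>();
        for (SequenceInstance seq : sequences) {
            for (int i = 0; i < seq.tokenFeatures.size(); i++) {
                tokens.add(new TokenInstance(seq.tokenFeatures.get(i),
                                             seq.tokenLabels.get(i)));
            }
        }
        return tokens;
    }
}
```

Flattening of this kind lets an ordinary per-token classifier be trained on data that was originally prepared for sequence labeling.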

convert

public static InstanceList convert(Instance inst,
                                   Noop alphabetsPipe)
Parameters:
inst - input instance, with FeatureVectorSequence as data.
alphabetsPipe - a Noop pipe containing the data and target alphabets for the resulting InstanceList and AugmentableFeatureVectors
Returns:
list of instances, each with one AugmentableFeatureVector as data