edu.umass.cs.mallet.base.pipe
Class CharSequence2TokenSequence
java.lang.Object
edu.umass.cs.mallet.base.pipe.Pipe
edu.umass.cs.mallet.base.pipe.CharSequence2TokenSequence
- All Implemented Interfaces:
- java.io.Serializable
- public class CharSequence2TokenSequence extends Pipe implements java.io.Serializable
Pipe that tokenizes a character sequence. Expects a CharSequence
in the Instance data and converts it into a token sequence
using the given regex or CharSequenceLexer.
The regex or lexer defines what counts as a token, not what separates tokens.
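The idea that the regex describes the tokens themselves can be illustrated with plain java.util.regex. The following is a minimal sketch, not MALLET code: the class name RegexTokenizerSketch and its tokenize method are hypothetical, and the result is a list of strings rather than a MALLET token sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: uses java.util.regex directly and is NOT part of MALLET.
class RegexTokenizerSketch {

    // Collects every non-overlapping match of tokenPattern in text, in order.
    // The pattern describes the tokens themselves, not the delimiters.
    static List<String> tokenize(CharSequence text, Pattern tokenPattern) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = tokenPattern.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }
}
```

Under these assumptions, tokenizing "Hello, world! 42" with the pattern \p{Alpha}+ keeps only the alphabetic runs, while \w+ also keeps the digits, so the choice of pattern decides what survives as a token.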
- See Also:
- Serialized Form
Methods inherited from class edu.umass.cs.mallet.base.pipe.Pipe:
getDataAlphabet, getInstanceId, getParent, getParentRoot, getTargetAlphabet, isDataAlphabetSet, isTargetProcessing, pipe, readResolve, resolveDataAlphabet, resolveTargetAlphabet, setDataAlphabet, setParent, setTargetAlphabet, setTargetProcessing
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CharSequence2TokenSequence
public CharSequence2TokenSequence(CharSequenceLexer lexer)
CharSequence2TokenSequence
public CharSequence2TokenSequence(java.lang.String regex)
CharSequence2TokenSequence
public CharSequence2TokenSequence(java.util.regex.Pattern regex)
CharSequence2TokenSequence
public CharSequence2TokenSequence()
pipe
public Instance pipe(Instance carrier)
- Description copied from class: Pipe
- Process an Instance. This method takes an input Instance,
destructively modifies it in some way, and returns it.
This is the method by which all pipes are eventually run.
One can create a new concrete subclass of Pipe simply by
implementing this method.
- Specified by: pipe in class Pipe
- Parameters:
carrier - Instance to be processed.
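The contract described above (take an instance, modify it in place, return the same object) might look like the following sketch. SketchInstance, SketchPipe and WhitespaceSplitPipe are hypothetical stand-ins written for this example. They are not the MALLET Instance and Pipe classes, and the whitespace splitting is only a placeholder for real tokenization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an instance that carries a mutable data field.
class SketchInstance {
    private Object data;
    SketchInstance(Object data) { this.data = data; }
    Object getData() { return data; }
    void setData(Object data) { this.data = data; }
}

// Hypothetical stand-in for a pipe: subclasses implement a single method.
abstract class SketchPipe {
    // Contract mirrored from the description: modify the carrier, then return it.
    abstract SketchInstance pipe(SketchInstance carrier);
}

// Replaces the CharSequence held by the carrier with a list of tokens.
class WhitespaceSplitPipe extends SketchPipe {
    @Override
    SketchInstance pipe(SketchInstance carrier) {
        CharSequence text = (CharSequence) carrier.getData();
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toString().trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        carrier.setData(tokens); // destructive: the original text is overwritten
        return carrier;          // the same object that was passed in
    }
}
```

Returning the carrier itself rather than a copy is what lets such pipes be chained, since each stage receives the object the previous stage modified.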
main
public static void main(java.lang.String[] args)