edu.umass.cs.mallet.base.pipe
Class SGML2TokenSequence

java.lang.Object
  extended by edu.umass.cs.mallet.base.pipe.Pipe
      extended by edu.umass.cs.mallet.base.pipe.SGML2TokenSequence
All Implemented Interfaces:
java.io.Serializable

public class SGML2TokenSequence
extends Pipe
implements java.io.Serializable

Converts a string containing simple SGML tags into a data TokenSequence of words, paired with a target TokenSequence containing the SGML tag in effect for each word. It does not handle nested SGML tags, nor does it handle malformed SGML gracefully.
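
The word/tag pairing described above can be illustrated with a small standalone sketch. It does not use MALLET and is not the class's actual implementation: the class name SgmlTaggingSketch, the Tagged holder, the tag regex, and the background label "O" are all assumptions made for illustration. Like the class it illustrates, the sketch ignores nesting (any closing tag returns to the background tag) and makes no attempt to recover from malformed markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical stand-in, not part of MALLET.
public class SgmlTaggingSketch {

    /** Parallel lists: tags.get(i) is the tag in effect for words.get(i). */
    public static final class Tagged {
        public final List<String> words = new ArrayList<>();
        public final List<String> tags = new ArrayList<>();
    }

    // Either a simple opening/closing tag, or a run of characters
    // containing no whitespace and no '<' (treated as one word).
    private static final Pattern PIECE =
        Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)>|[^\\s<]+");

    public static Tagged tag(String text, String backgroundTag) {
        Tagged out = new Tagged();
        String current = backgroundTag;
        Matcher m = PIECE.matcher(text);
        while (m.find()) {
            if (m.group(2) != null) {
                // A tag: an opening tag sets the current label,
                // a closing tag falls back to the background label.
                current = m.group(1).isEmpty() ? m.group(2) : backgroundTag;
            } else {
                // A word: record it with the label currently in effect.
                out.words.add(m.group());
                out.tags.add(current);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Tagged t = tag("Call <name>Jane Doe</name> at <time>noon</time> today", "O");
        for (int i = 0; i < t.words.size(); i++) {
            System.out.println(t.words.get(i) + "\t" + t.tags.get(i));
        }
    }
}
```

In this sketch, words outside any tag receive the background label, which corresponds to the role of the backgroundTag constructor argument listed below.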

See Also:
Serialized Form

Constructor Summary
SGML2TokenSequence()
           
SGML2TokenSequence(CharSequenceLexer lexer, java.lang.String backgroundTag)
           
SGML2TokenSequence(CharSequenceLexer lexer, java.lang.String backgroundTag, boolean saveSource)
           
SGML2TokenSequence(java.lang.String regex, java.lang.String backgroundTag)
           
 
Method Summary
static void main(java.lang.String[] args)
           
 Instance pipe(Instance carrier)
          Process an Instance.
 
Methods inherited from class edu.umass.cs.mallet.base.pipe.Pipe
getDataAlphabet, getInstanceId, getParent, getParentRoot, getTargetAlphabet, isDataAlphabetSet, isTargetProcessing, pipe, readResolve, resolveDataAlphabet, resolveTargetAlphabet, setDataAlphabet, setParent, setTargetAlphabet, setTargetProcessing
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SGML2TokenSequence

public SGML2TokenSequence(CharSequenceLexer lexer,
                          java.lang.String backgroundTag,
                          boolean saveSource)

SGML2TokenSequence

public SGML2TokenSequence(CharSequenceLexer lexer,
                          java.lang.String backgroundTag)

SGML2TokenSequence

public SGML2TokenSequence(java.lang.String regex,
                          java.lang.String backgroundTag)

SGML2TokenSequence

public SGML2TokenSequence()

Method Detail

pipe

public Instance pipe(Instance carrier)
Description copied from class: Pipe
Process an Instance. This method takes an input Instance, destructively modifies it in some way, and returns it. This is the method by which all pipes are eventually run.

One can create a new concrete subclass of Pipe simply by implementing this method.

Specified by:
pipe in class Pipe
Parameters:
carrier - Instance to be processed.
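
The contract described above (take an instance, modify it in place, return the same object) can be sketched without MALLET on the classpath. SketchInstance, SketchPipe, and LowercasePipe below are hypothetical stand-ins invented for this example; they are not the MALLET Instance and Pipe classes, which have a richer API.

```java
import java.util.Locale;

// Illustrative only: hypothetical stand-ins, not the MALLET classes.
public class PipeSketch {

    /** Stand-in for an instance: a carrier with one mutable data field. */
    public static final class SketchInstance {
        private Object data;
        public SketchInstance(Object data) { this.data = data; }
        public Object getData() { return data; }
        public void setData(Object data) { this.data = data; }
    }

    /** Stand-in for the pipe contract: one abstract method to implement. */
    public abstract static class SketchPipe {
        public abstract SketchInstance pipe(SketchInstance carrier);
    }

    /** A concrete pipe, defined by implementing pipe() alone. */
    public static final class LowercasePipe extends SketchPipe {
        @Override
        public SketchInstance pipe(SketchInstance carrier) {
            // Destructively replace the data, then return the same carrier.
            String text = carrier.getData().toString();
            carrier.setData(text.toLowerCase(Locale.ROOT));
            return carrier;
        }
    }

    public static void main(String[] args) {
        SketchInstance in = new SketchInstance("Hello SGML");
        SketchInstance out = new LowercasePipe().pipe(in);
        System.out.println(out.getData());
        System.out.println(in == out); // same object, modified in place
    }
}
```

Returning the carrier itself rather than a copy is what makes the operation destructive: callers that keep a reference to the input see the modified data.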

main

public static void main(java.lang.String[] args)