edu.umass.cs.mallet.base.pipe.iterator
Class FileListIterator

java.lang.Object
  extended by edu.umass.cs.mallet.base.pipe.iterator.AbstractPipeInputIterator
      extended by edu.umass.cs.mallet.base.pipe.iterator.FileListIterator
All Implemented Interfaces:
java.util.Iterator, PipeInputIterator

public class FileListIterator
extends AbstractPipeInputIterator

An iterator that generates instances for a pipe from a list of filenames. Each file is treated as a text file whose target is determined by applying a user-specified regular expression pattern to the filename.


Field Summary
static java.util.regex.Pattern ALL_DIRECTORIES
          Use as label names all the directory names in the filename.
static java.util.regex.Pattern FIRST_DIRECTORY
          Use as label name the first directory in the filename.
static java.util.regex.Pattern LAST_DIRECTORY
          Use as label name the last directory in the filename.
static java.util.regex.Pattern STARTING_DIRECTORIES
          Use as label names the directories of the given files, optionally removing the common prefix of all starting directories.
 
Fields inherited from class edu.umass.cs.mallet.base.pipe.iterator.AbstractPipeInputIterator
parentInstance
 
Constructor Summary
FileListIterator(java.io.File[] files, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
          Construct an iterator over the given array of Files. The instances constructed from the files are returned in the same order as they appear in the given array.
FileListIterator(java.io.File filelist, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
          Construct a FileListIterator from a file that lists the input files, one filename per line.
FileListIterator(java.lang.String[] filenames, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
           
FileListIterator(java.lang.String filelistName, java.io.FileFilter fileFilter, java.util.regex.Pattern targetPattern, boolean removeCommonPrefix)
           
FileListIterator(java.lang.String filelistName, java.util.regex.Pattern targetPattern)
           
 
Method Summary
 java.util.ArrayList getFileArray()
           
 boolean hasNext()
           
 java.io.File nextFile()
           
 Instance nextInstance()
           
 
Methods inherited from class edu.umass.cs.mallet.base.pipe.iterator.AbstractPipeInputIterator
next, remove, setParentInstance
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STARTING_DIRECTORIES

public static final java.util.regex.Pattern STARTING_DIRECTORIES
Use as label names the directories of the given files, optionally removing the common prefix of all starting directories.
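
The effect of removing a common prefix can be sketched with the standard library alone. This is a minimal illustration, not the MALLET implementation: the class CommonPrefixSketch and its methods are hypothetical, and comparing whole path components (rather than individual characters) is an assumption made here for clarity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommonPrefixSketch {
    // Count the leading path components shared by every directory.
    // Assumption: prefixes are compared component by component.
    static int commonComponents(List<String> dirs) {
        if (dirs.isEmpty()) {
            return 0;
        }
        String[] first = dirs.get(0).split("/");
        int shared = first.length;
        for (String dir : dirs) {
            String[] parts = dir.split("/");
            int i = 0;
            while (i < shared && i < parts.length && parts[i].equals(first[i])) {
                i++;
            }
            shared = i;
        }
        return shared;
    }

    // Drop the shared leading components; the remainder serves as the label.
    static List<String> stripCommonPrefix(List<String> dirs) {
        int shared = commonComponents(dirs);
        List<String> labels = new ArrayList<>();
        for (String dir : dirs) {
            String[] parts = dir.split("/");
            labels.add(String.join("/", Arrays.copyOfRange(parts, shared, parts.length)));
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("corpus/train/sports", "corpus/train/politics");
        System.out.println(stripCommonPrefix(dirs));
    }
}
```

With the two directories in main, the shared prefix is corpus/train, so the remaining labels are sports and politics.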


FIRST_DIRECTORY

public static final java.util.regex.Pattern FIRST_DIRECTORY
Use as label name the first directory in the filename.


LAST_DIRECTORY

public static final java.util.regex.Pattern LAST_DIRECTORY
Use as label name the last directory in the filename.


ALL_DIRECTORIES

public static final java.util.regex.Pattern ALL_DIRECTORIES
Use as label names all the directory names in the filename.

Constructor Detail

FileListIterator

public FileListIterator(java.io.File[] files,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
Construct an iterator over the given array of Files. The instances constructed from the files are returned in the same order as they appear in the given array.

Parameters:
files - Array of files from which to construct instances
fileFilter - class implementing interface FileFilter that will decide which names to accept. May be null.
targetPattern - regex Pattern applied to the filename. On a match, its first parenthesized group is taken to be the target value of the generated instance. The pattern is applied with the matcher.find() method, so it may match any part of the filename rather than the whole string.
removeCommonPrefix - boolean that modifies the behavior of the STARTING_DIRECTORIES pattern, removing the common prefix of all initially specified directories, leaving the remainder of each filename as the target value.
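
The targetPattern behavior described above can be reproduced with java.util.regex alone. The following is a minimal sketch, not code from MALLET: the class TargetPatternSketch, the method targetFor, and the pattern LAST_DIR_LIKE are hypothetical names introduced for illustration, and the pattern is not necessarily the one behind LAST_DIRECTORY. Returning null when nothing matches is also an assumption of the sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TargetPatternSketch {
    // Hypothetical pattern: captures the directory that directly contains the file.
    static final Pattern LAST_DIR_LIKE = Pattern.compile("([^/]+)/[^/]+$");

    // Apply the pattern with find() and return its first parenthesized group,
    // or null when the pattern does not occur in the filename (an assumption).
    static String targetFor(String filename, Pattern targetPattern) {
        Matcher matcher = targetPattern.matcher(filename);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(targetFor("data/sports/article1.txt", LAST_DIR_LIKE));
    }
}
```

Because find() searches for the pattern anywhere in the input, the sketch extracts sports from data/sports/article1.txt even though the pattern does not describe the leading data/ component.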

FileListIterator

public FileListIterator(java.lang.String[] filenames,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)

FileListIterator

public FileListIterator(java.io.File filelist,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
                 throws java.io.FileNotFoundException,
                        java.io.IOException
Construct a FileListIterator from a file that lists the input files, one filename per line. The instances constructed from the filelist are returned in the same order as they are listed.
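
A file list in the one-filename-per-line format could be read along the following lines. This is a sketch under stated assumptions, not the constructor's actual code: the class FileListSketch and its methods are hypothetical, and trimming whitespace and skipping blank lines are choices made here that the documentation does not specify.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class FileListSketch {
    // Turn the text of a file list into File objects, preserving line order.
    // Assumption: surrounding whitespace is trimmed and blank lines are ignored.
    static List<File> parseFileList(String contents) {
        List<File> files = new ArrayList<>();
        for (String line : contents.split("\\R")) {
            String name = line.trim();
            if (!name.isEmpty()) {
                files.add(new File(name));
            }
        }
        return files;
    }

    // Read the list file from disk; an IOException is thrown if it cannot be read.
    static List<File> readFileList(File filelist) throws IOException {
        return parseFileList(new String(Files.readAllBytes(filelist.toPath())));
    }

    public static void main(String[] args) {
        for (File f : parseFileList("a/one.txt\nb/two.txt\n")) {
            System.out.println(f.getName());
        }
    }
}
```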


FileListIterator

public FileListIterator(java.lang.String filelistName,
                        java.io.FileFilter fileFilter,
                        java.util.regex.Pattern targetPattern,
                        boolean removeCommonPrefix)
                 throws java.io.FileNotFoundException,
                        java.io.IOException

FileListIterator

public FileListIterator(java.lang.String filelistName,
                        java.util.regex.Pattern targetPattern)
                 throws java.io.FileNotFoundException,
                        java.io.IOException
Method Detail

nextInstance

public Instance nextInstance()
Specified by:
nextInstance in interface PipeInputIterator
Specified by:
nextInstance in class AbstractPipeInputIterator

nextFile

public java.io.File nextFile()

hasNext

public boolean hasNext()
Specified by:
hasNext in interface java.util.Iterator
Specified by:
hasNext in class AbstractPipeInputIterator

getFileArray

public java.util.ArrayList getFileArray()