MALLET

MALLET: A Machine Learning for Language Toolkit

page last updated 16 November 2005

MALLET is an integrated collection of Java code useful for statistical natural language processing, document classification, clustering, information extraction, and other machine learning applications to text.

Documentation for MALLET is available at the MALLET web page. In particular, there are tutorials on using MALLET as well as on using the Jython language to write scripts for MALLET.

MALLET was written by Andrew McCallum, with contributions from several graduate students and staff, including Aron Culotta, Al Hough, Wei Li, David Pinto, Charles Sutton, and Jerod Weinman, at the University of Massachusetts Amherst, as well as contributions from Fernando Pereira, Ryan McDonald, and others at the University of Pennsylvania.

This work was supported in part by the Center for Intelligent Information Retrieval, in part by SPAWARSYSCEN-SD grant number N66001-02-1-8903, in part by Advanced Research and Development Activity under contract number MDA904-01-C-0984, in part by the Central Intelligence Agency, the National Security Agency and the National Science Foundation under NSF grant #IIS-0326249, and in part by the Defense Advanced Research Projects Agency (DARPA), through the Department of the Interior, NBC, Acquisition Services Division, under contract number NBCHD030010. Work on and using MALLET at the University of Pennsylvania is funded by NSF grants EIA-0205448 and EIA-0205456 as well as CALO.

Citation

You are welcome to use the code under the terms of the licence for research or commercial purposes; however, please acknowledge its use with a citation:
   McCallum, Andrew Kachites.  "MALLET: A Machine Learning for
   Language Toolkit."  http://mallet.cs.umass.edu
   2002.
Here is a BibTeX entry:
   @unpublished{McCallumMALLET,
      author = "Andrew Kachites McCallum",
      title = "MALLET: A Machine Learning for Language Toolkit",
      note = "http://mallet.cs.umass.edu",
      year = 2002}