Software Similar to MALLET
There are numerous other software packages relevant to machine learning
and text that in various ways are related to MALLET:
- NLTK
(http://nltk.sourceforge.net)
also has "Classifier" and "ClassifierTrainer" classes for plug-and-play
classifiers, as well as implementations of Naive Bayes, MaxEnt,
feature selection, a "Token" class, and finite state transducers
with iterators over transitions. In addition, it has
facilities for tagging, parsing and information extraction.
- OpenNLP
(http://opennlp.sourceforge.net)
is also a Java package intended for text processing.
It also has rich pipelines: chains of individual "Foo2Bar" pipe
components that can be arbitrarily configured and
plugged together.
- JavaNLP (from
Chris Manning's group at Stanford).
- BioJava. The BioJava
Project is an open-source project dedicated to providing Java tools
for processing biological data. This will include objects for
manipulating sequences, file parsers, CORBA interoperability, DAS,
access to ACeDB, dynamic programming, and simple statistical routines
to name just a few things.
- There is information about various finite state machine software at
http://www.cs.jhu.edu/~jason/405/software.html,
including pointers to the AT&T Finite State package, which also has
very general finite state transducers with iterators over
transitions, arbitrary transition costs, and generalized implementations
of Viterbi and Forward-Backward. In
addition, it has epsilon transitions, composition, and much more.
- Weka: Plug-and-play machine learning components in Java
http://www.cs.waikato.ac.nz/~ml/weka,
including classes for "Classifier", "NaiveBayes", "DecisionStump",
"LogisticRegression", etc. It also has methods for splitting
training sets, nice evaluation tools, and GUI components to boot.
- Orange: component-based data mining software in C++. It includes SVM,
logistic regression, clustering, and lots more.
http://magix.fri.uni-lj.si/orange/.
- Libbow
http://www.cs.cmu.edu/~mccallum/bow
also has mechanisms for feature extraction pipelines, plug-and-play classifiers,
feature selection, and clustering. It is written in C.
- COLT: High Performance Scientific Computing in Java:
http://tilde-hoschek.home.cern.ch/~hoschek/colt.
- There are Java applets for various machine learning algorithms at
http://www.cse.unsw.edu.au/~cs9417.
- Bayesian Networks in Java:
http://www-2.cs.cmu.edu/~javabayes.
- Fei Sha and Fernando Pereira have written a CRF implementation
in Java.
- Jonathan Baxter wrote a good deal of
machine learning code in the late 1990s, including a rich "optimization" package.
- Ray Mooney has
also been writing machine learning code in Java lately.