Saturday, September 29, 2012

Mallet and LibSVM

Mallet and LibSVM are the two machine learning libraries I use the most, and I wanted a way to use LibSVM directly from Mallet. As I mentioned in another post, I made a lightly refactored version of the Java implementation of LibSVM, mainly to make it easy to plug in custom kernel functions. Doing that gave me a better understanding of how LibSVM works, which in turn helped me integrate it with Mallet.
For classification tasks, a Mallet instance pipe creates a FeatureVector out of an instance, so it is quite straightforward to transform it into a format suitable for LibSVM. However, custom kernel functions that work on data structures other than vectors need to be handled differently. The current version has no option for passing an arbitrary data structure from the Mallet end, but the code can easily be tweaked to allow that.
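To illustrate the vector conversion described above, here is a minimal, self-contained sketch. It is not the code from the repository: `SparseVectorConversion`, `Node`, `toNodes` and `toLibSvmLine` are hypothetical names, and `Node` only stands in for LibSVM's `svm_node` (an index/value pair). The sketch assumes the sparse vector is available as parallel index and value arrays, that a missing value array means a binary vector, and that indices are shifted by one to follow the 1-based convention of LibSVM's data files.

```java
import java.util.Arrays;

final class SparseVectorConversion {

    /** Hypothetical stand-in for LibSVM's svm_node: one (index, value) pair. */
    static final class Node {
        final int index;
        final double value;

        Node(int index, double value) {
            this.index = index;
            this.value = value;
        }
    }

    /**
     * Converts parallel index/value arrays into nodes sorted by ascending index.
     * Assumption: input indices are 0-based and are shifted to 1-based here.
     * Assumption: a null values array denotes a binary vector (every value is 1.0).
     */
    static Node[] toNodes(int[] indices, double[] values) {
        if (values != null && values.length != indices.length) {
            throw new IllegalArgumentException("indices and values differ in length");
        }
        Node[] nodes = new Node[indices.length];
        for (int i = 0; i < indices.length; i++) {
            double value = (values == null) ? 1.0 : values[i];
            nodes[i] = new Node(indices[i] + 1, value);
        }
        // Sparse nodes are kept in ascending index order.
        Arrays.sort(nodes, (a, b) -> Integer.compare(a.index, b.index));
        return nodes;
    }

    /** Renders one instance as a "label index:value ..." line. */
    static String toLibSvmLine(int label, Node[] nodes) {
        StringBuilder sb = new StringBuilder().append(label);
        for (Node n : nodes) {
            sb.append(' ').append(n.index).append(':').append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] nodes = toNodes(new int[] {3, 0}, new double[] {0.5, 2.0});
        System.out.println(toLibSvmLine(1, nodes));
    }
}
```

In a real integration the two arrays would come from the Mallet FeatureVector and the nodes would be actual `svm_node` objects; a kernel that works on non-vector data would need a different carrier than this index/value representation.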
Being separate libraries, Mallet and LibSVM handle class labels differently. All I had to do in SVMClassifier was align the class labels and scores coming from the two libraries. I have kept an option to tell LibSVM whether or not to predict probabilities, which is required if you need not only the best class but also the scores given to the other classes.
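The alignment step can be sketched as follows. This is an illustration under stated assumptions, not the actual SVMClassifier code: `LabelAlignment`, `alignScores` and `bestLabel` are hypothetical names. It assumes that LibSVM was trained with Mallet's label indices as its integer class labels, and that LibSVM reports per-class scores in its own label order (the order returned by `svm_get_labels`), which need not match Mallet's.

```java
final class LabelAlignment {

    /**
     * Reorders scores from LibSVM's label order into Mallet's label-index order.
     *
     * @param svmLabels class labels in LibSVM's order; assumed to be Mallet label indices
     * @param svmScores scores in the same order as svmLabels
     * @param numLabels size of the Mallet label alphabet
     */
    static double[] alignScores(int[] svmLabels, double[] svmScores, int numLabels) {
        if (svmLabels.length != svmScores.length) {
            throw new IllegalArgumentException("labels and scores differ in length");
        }
        // Labels the SVM never saw keep a score of 0.0.
        double[] aligned = new double[numLabels];
        for (int i = 0; i < svmLabels.length; i++) {
            int malletIndex = svmLabels[i];
            if (malletIndex < 0 || malletIndex >= numLabels) {
                throw new IllegalArgumentException("label out of range: " + malletIndex);
            }
            aligned[malletIndex] = svmScores[i];
        }
        return aligned;
    }

    /** Returns the index of the highest score (the first one on ties). */
    static int bestLabel(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] svmLabels = {2, 0, 1};          // order in which LibSVM reports classes
        double[] svmScores = {0.7, 0.1, 0.2}; // scores in that same order
        double[] aligned = alignScores(svmLabels, svmScores, 3);
        System.out.println("best Mallet label index: " + bestLabel(aligned));
    }
}
```

Without probability estimates only the predicted class is known, so the full score array above is only available when the probability option is switched on.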
If you are interested, get it from GitHub. Let me know if you have any suggestions.
