Wednesday, September 12, 2012

Writing Custom Kernel Functions in Java for LibSVM

For my research on protein-protein interaction extraction I had to experiment with several different custom kernel functions. For that I looked into two most prevalent support vector machine libraries - SVMLight and LibSVM. In SVMLight one can plug in a custom kernel function through the kernel.h header file. LibSVM on the other hand does not allow custom kernel functions directly; however, one can pre-compute the kernel matrix (or Gram matrix) beforehand and feed it as input to the SVM. To me it seemed SVMLight would be the way to go. But then I found that LibSVM comes with an official Java implementation. I looked for a library that modifies that Java port to allow direct integration of kernel functions. I found jlibsvm which might have worked if I had found a little documentation in it. Then I decided to write a lightly refactored LibSVM on my own. Without much effort I have done that and am using it ever since. If you prefer to write your custom kernel functions in Java you can give it a try:
https://github.com/syeedibnfaiz/libsvm-java-kernel.git 

Writing a kernel function can not be easier. All you have to do is to implement the CustomKernel interface. Here is how you can write a linear kernel:
 /**  
  * <code>LinearKernel</code> implements a linear kernel function.  
  * @author Syeed Ibn Faiz  
  */  
 public class LinearKernel implements CustomKernel {  
   @Override  
   public double evaluate(svm_node x, svm_node y) {              
     if (!(x.data instanceof SparseVector) || !(y.data instanceof SparseVector)) {  
       throw new RuntimeException("Could not find sparse vectors in svm_nodes");  
     }      
     SparseVector v1 = (SparseVector) x.data;  
     SparseVector v2 = (SparseVector) y.data;  
     return v1.dot(v2);  
   }    
 }  

The kernel function you want to use should then be registered with the KernelManager. The following code snippet may give you a better idea of the whole work flow:
 public static void testLinearKernel(String[] args) throws IOException, ClassNotFoundException {  
     String trainFileName = args[0];  
     String testFileName = args[1];  
     String outputFileName = args[2];  
       
     //Read training file  
     Instance[] trainingInstances = DataFileReader.readDataFile(trainFileName);      
       
     //Register kernel function  
     KernelManager.setCustomKernel(new LinearKernel());      
       
     //Setup parameters  
     svm_parameter param = new svm_parameter();          
       
     //Train the model  
     System.out.println("Training started...");  
     svm_model model = SVMTrainer.train(trainingInstances, param);  
     System.out.println("Training completed.");              
       
     //Read test file  
     Instance[] testingInstances = DataFileReader.readDataFile(testFileName);  
     //Predict results  
     double[] predictions = SVMPredictor.predict(testingInstances, model, true);    
   }  

2 comments:

  1. Hi I am trying to use svm for text classificaton of resumees. I am not getting any clue on how i should proceed. Please suggest some pinters

    ReplyDelete
  2. Hi, I am trying to plug custom kernel to SVM in SVM Light. I have pre-computed gram matrix. I am not getting where do I plug this gram matrix in the code. Also how do I use computed kernel for training data.

    ReplyDelete