PmSVM: Power Mean SVM

This page provides a manual for using the PmSVM software and reproducing the experiments in our CVPR 2012 paper. PmSVM is designed to solve large-scale SVM problems using the power mean kernel.

The power mean kernel is the kernel to use when features are histograms, which are common in computer vision. It can also be used as an alternative for fast and accurate classification, in place of linear SVM classifiers.

PmSVM can solve a binary classification problem with millions of examples and tens of thousands of dense features in a few seconds (excluding the time to read the input files).

This is a mirror page for those who cannot visit Google Sites. Note that all downloads contained in this page are now in this single archive.

Background and licensing

The Power Mean SVM (PmSVM) software aims at solving large-scale classification problems, especially those in computer vision. Nowadays, a typical large-scale computer vision classification problem may have 1,000+ classes and more than 1 million examples. Efficient training and testing, together with accurate classification results, are key to the success of many vision problems.

Also, histograms are arguably the most popular visual representation. When the features are histograms, additive kernels are the most effective in SVM models. An additive kernel is a valid kernel function of the form:

PmSVM provides efficient classification with a special family of additive kernels, namely the power mean kernels, in the form (for any p<0):

Note that this family includes as special cases the most popular additive kernels: when p = -1, it is the χ² kernel; and when p = -∞, it is the histogram intersection kernel (HIK).
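For reference, the two forms mentioned above can be written out as follows. This is a reconstruction from the standard definition of the power mean and from the special cases named above, so the notation (κ, M_p, d for the number of dimensions) is an assumption and may differ from the notation in the paper.

```latex
% Additive kernel on d-dimensional vectors x and y:
% a sum of one scalar kernel value per dimension.
\kappa(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{d} \kappa(x_i, y_i)

% Power mean kernel for p < 0 and non-negative features.
M_p(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{d} \left( \frac{x_i^{p} + y_i^{p}}{2} \right)^{1/p}

% Special cases for scalars x, y > 0.
M_{-1}(x, y) = \frac{2xy}{x + y}, \qquad
\lim_{p \to -\infty} M_p(x, y) = \min(x, y)
```

The first special case is the harmonic mean, which is the usual per-dimension form of the χ² kernel; the second is the per-dimension form of HIK.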

PmSVM is most effective when features are histograms; but it can also be used as an alternative for large scale linear SVM classification.

PmSVM is distributed under the BSD license.

Copyright (c) 2012, Jianxin Wu
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met: 

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer. 
2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution. 

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Technical papers

For technical details, please refer to:

Power Mean SVM for Large Scale Visual Classification
Jianxin Wu
In: Proc. The IEEE Int'l Conference on Computer Vision and Pattern Recognition (CVPR 2012), Providence, USA, June 2012, pp. xx-xx.

If you use PmSVM, please cite the above paper.

Install

PmSVM is written in C++. It can be used in both Linux and Windows. The first step is to download the PmSVM.zip file from this page.

Install in Linux

Make sure that gcc / g++ are installed. Unzip PmSVM.zip to a directory. Then, run the following commands from a command prompt:

cd PmSVM # Replace ‘PmSVM’ with your own directory name

g++ -O3 PmSVM.cpp -o pmsvm

The output is an executable named pmsvm.

Install in Windows

If you prefer MSVC++, you may manually create a VC++ solution project, and add PmSVM.cpp (inside PmSVM.zip) to this project. If using MSVC++, you may need to modify the following line at the beginning of PmSVM.cpp:

const float BIGVALUE = HUGE_VALF; // Replace ‘HUGE_VALF’ with ‘FLT_MAX’ (defined in <cfloat>)

An easier alternative is to use a gcc / g++ build in Windows, following these steps:

• Install TDM-GCC from http://tdm-gcc.tdragon.net/. Choose ‘tdm-gcc’ for creating a 32 bit executable; and ‘tdm64-gcc’ for the 64 bit version.

• Open the ‘MinGW Command Prompt’ from the Windows Start Menu.

• Run the following commands in the MinGW command prompt:

cd PmSVM # Replace ‘PmSVM’ with your own directory name

g++ -O3 PmSVM.cpp -o pmsvm

The output is an executable named pmsvm.exe.

Usage

Data Format and Requirement

PmSVM requires that training and testing data are stored in the libsvm format, as described in the libsvm documentation:

The format of training and testing data file is:

<label> <index1>:<value1> <index2>:<value2> ...

Each line contains an instance and is ended by a '\n' character.

For classification, <label> is an integer indicating the class label (multi-class is supported).

<index> is an integer starting from 1 and <value> is a real number. Indices must be in ASCENDING order. Labels in the testing file are only used to calculate accuracy or errors. If they are unknown, just fill the first column with any numbers.

PmSVM requires <value> to be within the range [0, 1]. It does accept feature values that are negative or greater than 1, but such values may reduce the accuracy.
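The format rules above can be checked before training. The following is a minimal sketch of such a check, not code taken from PmSVM; the names Example and ParseLibsvmLine are hypothetical, and the label is assumed to be an integer as in the classification case.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One parsed example: a label plus sparse (index, value) pairs.
struct Example {
    int label = 0;
    std::vector<int> indexes;
    std::vector<float> values;
};

// Parses one line of the form "<label> <index>:<value> ...".
// Returns false if the line is malformed, an index is smaller than 1,
// or the indices are not strictly ascending.
// in_unit_range is set to false if any value lies outside [0, 1].
bool ParseLibsvmLine(const std::string& line, Example& ex, bool& in_unit_range) {
    std::istringstream in(line);
    ex = Example();
    in_unit_range = true;
    if (!(in >> ex.label)) return false;

    std::string token;
    int previous = 0;  // indices start from 1, so 0 is a safe initial value
    while (in >> token) {
        const std::size_t colon = token.find(':');
        if (colon == std::string::npos) return false;
        int index = 0;
        float value = 0.0f;
        try {
            index = std::stoi(token.substr(0, colon));
            value = std::stof(token.substr(colon + 1));
        } catch (const std::exception&) {
            return false;  // non-numeric index or value
        }
        if (index < 1 || index <= previous) return false;
        if (value < 0.0f || value > 1.0f) in_unit_range = false;
        ex.indexes.push_back(index);
        ex.values.push_back(value);
        previous = index;
    }
    return true;
}
```

A line such as "1 2:0.1 5:0.3" passes the check, while "1 5:0.3 2:0.1" is rejected because its indices are not ascending. An out-of-range value only clears in_unit_range, since PmSVM accepts such values.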

In order to convert your data to the range [0, 1], you may use the svm-scale tool in the libsvm software package (http://www.csie.ntu.edu.tw/~cjlin/libsvm/):

./svm-scale -l 0 -u 1 -s range.txt trainset > trainset.scale # Remove ‘./’ in Windows; and replace ‘trainset’ with your own training data file name

./svm-scale -r range.txt testset > testset.scale # Replace 'testset' with your own testing data file name

Command Prompt Usage

We assume that the training data is stored in the file trainset.svm and testing data in testset.svm; the executable has been compiled and put in the current directory.

The following command trains an SVM model from the training data and then tests it on the testing set:

./pmsvm trainset.svm testset.svm #Replace ‘./pmsvm’ with ‘pmsvm’ in Windows

If you only have 1 dataset stored in trainset.svm, 5-fold cross-validation is performed using:

./pmsvm trainset.svm #Replace ‘./pmsvm’ with ‘pmsvm’ in Windows

Two types of accuracy are reported by PmSVM. Suppose the dataset has m classes, with n₁, n₂, ⋯, n_m examples, respectively; n = n₁ + n₂ + ⋯ + n_m is the dataset size; and c_i is the number of examples in class i that are correctly predicted. PmSVM reports the overall accuracy as:

(c₁ + c₂ + ⋯ + c_m) / n

and the average accuracy (average of the confusion matrix diagonals) as:

(c₁/n₁ + c₂/n₂ + ⋯ + c_m/n_m) / m

Overall accuracy is often used in machine learning papers; while average accuracy is more popular in computer vision. Thus, PmSVM reports both.

Changing Parameters

PmSVM is designed mainly for source code level reuse. To change its parameters, modify the source code and then recompile the software. The following parameters can be adjusted in PmSVM:

p, which indexes a specific kernel in the power mean kernel family M_p. Default value is -1 (the chi-square kernel). If you want to use the histogram intersection kernel (HIK), use p = -8. You can change the p value by modifying the following lines in the main() function:

probtrain.Train(model_, 0.01, p); // for training followed by testing

probtrain.CrossValidation(5, 0.01, p); // for cross validation

HIK is slightly slower, but also slightly more accurate than χ² in PmSVM. Note that p < 0 is required.
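To see how the value of p changes the kernel, the power mean can be evaluated directly. The sketch below uses the standard power mean formula ((x^p + y^p) / 2)^(1/p) summed over dimensions; it is an illustration only, not the routine PmSVM uses internally, and the function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Power mean of two non-negative scalars for p < 0.
// If either argument is 0, the limit value 0 is returned.
double PowerMean(double x, double y, double p) {
    if (x <= 0.0 || y <= 0.0) return 0.0;
    return std::pow((std::pow(x, p) + std::pow(y, p)) / 2.0, 1.0 / p);
}

// Additive power mean kernel: the sum of the per-dimension power means.
// Only the dimensions present in both vectors are used.
double PowerMeanKernel(const std::vector<double>& x,
                       const std::vector<double>& y, double p) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) {
        sum += PowerMean(x[i], y[i], p);
    }
    return sum;
}
```

For x = 0.2 and y = 0.6, p = -1 gives 2xy / (x + y) = 0.3, while p = -8 gives about 0.218, close to min(x, y) = 0.2. In general, for p < 0 the power mean lies between min(x, y) and 2^(-1/p) · min(x, y), so with p = -8 it exceeds the minimum by at most about 9%.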

C, the SVM regularization parameter. Default value is 0.01. You can change the C value by modifying the following lines in the main() function:

probtrain.Train(model_, C, -1); // for training followed by testing

probtrain.CrossValidation(5, C, -1); // for cross validation

PmSVM is not sensitive to C. The default value 0.01 works well for most problems. Note that a larger C requires longer training time.

Cross validation fold.

Default value is 5. You can change the fold value by modifying the following line in the main() function:

probtrain.CrossValidation(fold, 0.01, -1); // for cross validation

Data type.

<value> is of type NUMBER, which is represented as a float by default. If you want to use double instead, uncomment this line at the beginning of PmSVM.cpp:

//#define __USE_DOUBLE // Remove the ‘//’ at the beginning to uncomment

Then, NUMBER is of the double type. In most cases, it is not necessary to use double (which takes 8 bytes, while float takes 4 bytes).

Bias.

PmSVM adds one extra dimension with constant value 1 to the beginning of every example by default. If you want to change the bias value, you can modify the following lines in the main() function:

probtrain.Load(argv[1], bias);

probtest.Load(argv[2], bias, model_.nr_feature);

Note that a negative bias value is treated as 0.

Using the Source Code Directly

You may want to use the functions of PmSVM directly. This section provides instructions for this purpose.

PmSVM stores an example using two arrays, indexes and values. For example, x = (0, 0.1, 0, 0, 0.3) is a feature vector in R⁵ with two non-zero values, and it is represented in PmSVM as (x₂=0.1, x₅=0.3):

indexes(int*) 2 5 -1
values(NUMBER*) 0.1 0.3 any value

The index value -1 denotes the end of this example.
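The following sketch builds this representation from a dense vector. DenseToSparse is a hypothetical helper, not part of PmSVM, and the typedef of NUMBER as float only mirrors the default described above. The sketch does not add the bias dimension.

```cpp
#include <cstddef>
#include <vector>

typedef float NUMBER;  // assumption: mirrors PmSVM's default NUMBER type

// Converts a dense vector to the two-array form: 1-based indexes of the
// non-zero entries followed by the terminator -1, and the matching values
// followed by one padding entry (its value is never read).
void DenseToSparse(const std::vector<NUMBER>& dense,
                   std::vector<int>& indexes,
                   std::vector<NUMBER>& values) {
    indexes.clear();
    values.clear();
    for (std::size_t i = 0; i < dense.size(); ++i) {
        if (dense[i] != 0) {
            indexes.push_back(static_cast<int>(i) + 1);
            values.push_back(dense[i]);
        }
    }
    indexes.push_back(-1);
    values.push_back(0);
}
```

For x = (0, 0.1, 0, 0, 0.3) this produces the indexes 2, 5, -1 and the values 0.1, 0.3, 0. The underlying arrays can then be obtained as int* and NUMBER* through indexes.data() and values.data().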

Class labels are integers, which need not be consecutive. For example, original class labels for a 3-class problem could be {1, -3, 7}. PmSVM internally converts them to consecutive integers starting from 0 ({0, 1, 2} in this example).

Load a training set using the following line:

probtrain.Load(filename, bias);

If you do not provide a bias value, it is set to 0 by default.

Train a model using the following line:

probtrain.Train(model_, C, p);

The trained SVM model (binary or multi-class) is stored in model_. 1-vs-rest is used for multi-class classification.

Load a testing set using the following line:

probtest.Load(filename, bias, maxdim);

Note that bias must be the same as the bias value used while loading the corresponding training set.
maxdim must be the maximum feature dimension in the corresponding training set to avoid errors. You can use model_.nr_feature for this parameter.

Testing on a new example. A test example is represented by a pair: int* indexes and NUMBER* values. You can use the following line to predict:

ret = model_.Predict(indexes, values);

The return value ret is the predicted label in the original label set.

If you load the entire test set as PmSVM.cpp does, the i-th example’s indexes and values can be retrieved by function calls probtest.GetFeatureIndexes(i) and probtest.GetFeatureValues(i).

Testing and getting decision values. PmSVM uses the 1-vs-rest strategy for multi-class problems. If you want the decision values of the m classifiers for an m-class problem, use:

ret = model_.PredictValues(indexes, values, dec_values);

Allocate a size m array for dec_values to store the m decision values.

The return value ret is the predicted label in the original label set.

Conversion between labels. The original label set may not be consecutive, which is inconvenient for computing complex performance metrics (e.g., area under the precision-recall curve). You can convert a label in the original label set to {0, 1, …, m - 1} by using:

Newlabel = model_.FindLabel(OriginalLabel);
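For readers who want to reproduce this kind of mapping outside PmSVM, here is a minimal sketch. It assumes that internal labels are assigned in order of first appearance, which matches the {1, -3, 7} to {0, 1, 2} example above but is not confirmed by the source; LabelMap and its methods are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Maps arbitrary integer labels to consecutive integers starting from 0.
class LabelMap {
public:
    // Returns the internal label of an original label, assigning the next
    // unused integer when the label is seen for the first time.
    int Add(int original) {
        std::map<int, int>::const_iterator it = to_internal_.find(original);
        if (it != to_internal_.end()) return it->second;
        const int internal = static_cast<int>(to_original_.size());
        to_internal_[original] = internal;
        to_original_.push_back(original);
        return internal;
    }

    // Returns the internal label, or -1 if the original label is unknown.
    int Find(int original) const {
        std::map<int, int>::const_iterator it = to_internal_.find(original);
        return it == to_internal_.end() ? -1 : it->second;
    }

    // Returns the original label for a valid internal label.
    int Original(int internal) const {
        return to_original_.at(static_cast<std::size_t>(internal));
    }

    // Number of distinct labels seen so far.
    std::size_t Size() const { return to_original_.size(); }

private:
    std::map<int, int> to_internal_;
    std::vector<int> to_original_;
};
```

Adding the labels 1, -3 and 7 in that order yields the internal labels 0, 1 and 2, and Original(1) returns -3.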

Compare with the feature mapping approach

We provide our implementation of the feature mapping approach by Vedaldi and Zisserman. This approach is compared against PmSVM in the CVPR 2012 paper.

Our implementation first uses the VLFeat C source code to generate a table with 1000 rows; each row maps a value in [0, 1] to a short 3-dimensional vector. This look-up table trick runs much faster than generating the mappings on the fly. Note that you need to make sure all feature values are within the [0, 1] range.

This mapping simulates either HIK or chi-square. Thus, a d-dimensional additive kernel problem is transformed into a 3d-dimensional linear classification problem. Our implementation then revises the LIBLINEAR source code to train and test a linear SVM.
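The look-up table idea can be sketched independently of the actual mapping. In the code below the mapping is passed in as a function, so nothing is assumed about the Vedaldi and Zisserman formula itself; the nearest-row quantization, the clamping to [0, 1] and all names are assumptions made for this illustration, not details of the provided implementation.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::array<float, 3> Mapped;  // one table row: a 3-dimensional vector

// Precomputes map_fn on an evenly spaced grid over [0, 1].
// rows must be at least 2; row r corresponds to the value r / (rows - 1).
std::vector<Mapped> BuildTable(std::size_t rows,
                               const std::function<Mapped(float)>& map_fn) {
    std::vector<Mapped> table(rows);
    for (std::size_t r = 0; r < rows; ++r) {
        table[r] = map_fn(static_cast<float>(r) / static_cast<float>(rows - 1));
    }
    return table;
}

// Returns the row nearest to v; v is clamped to [0, 1] first.
const Mapped& Lookup(const std::vector<Mapped>& table, float v) {
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    const long r = std::lround(v * static_cast<float>(table.size() - 1));
    return table[static_cast<std::size_t>(r)];
}

// Expands a d-dimensional vector into a 3d-dimensional one by replacing
// every feature value with its table row.
std::vector<float> MapVector(const std::vector<Mapped>& table,
                             const std::vector<float>& x) {
    std::vector<float> out;
    out.reserve(3 * x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const Mapped& m = Lookup(table, x[i]);
        out.insert(out.end(), m.begin(), m.end());
    }
    return out;
}
```

After the table is built once, mapping a feature value costs one multiplication, one rounding and one array access, which is why the table is faster than evaluating the mapping for every value. The expanded vectors can then be handed to any linear SVM trainer.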

This implementation is provided in PmSVM only as a reference. The BSD license applies to this implementation. Please also refer to LIBLINEAR and VLFeat for their respective licensing information.

It is only tested in Linux (Ubuntu 11.10, 64 bit) with gcc / g++.

To build / compile this implementation, first download the source code archive from this page. After unzipping this archive, run the following commands in the mapping_code directory:

make lib

make

Four files are generated: hom_train_HIK, hom_predict_HIK (these two for HIK); and hom_train_Chi2, hom_predict_Chi2 (these two for chi-square). Use the train commands to train SVM models, and the predict commands for prediction. Parameters can be set through the options of these commands. The supported options are a subset of those in the LIBLINEAR software; the list is displayed by running the respective command without specifying any option.

PmSVM is also compared with LIBLINEAR and libHIK. LIBLINEAR can be downloaded from here. libHIK can be downloaded from this site.

Generate sample datasets

This section provides instructions for generating the datasets used for evaluating PmSVM in the CVPR 2012 paper.

Caltech 101

Download the libHIK software (version 2.06) from here. Then, read the libHIK manual--libHIK-v2.pdf--inside libHIK;
Download the Caltech 101 dataset and unzip it to the libHIK-2.06/Data/ directory, following the instructions in the manual;
Edit libHIK-2.06/libHIK/Datasets.cpp, in the GenerateFilesForCaltech101() function as follows:

K = 2000; // Originally was 200

resizeWidth = 256;

kmeans_stepsize = 16;

splitlevel = 2;

scaleChanges = 4;

ratio = 2.0;

normalize = false;

stepSize = 2; // Originally was 8

useSobel = false;

useBoth = true; // Originally was false

oneclassSVM = true; // Originally was false

Also change static const int sizeCV = 5; at the beginning of this file to static const int sizeCV = 1;
Then, build / compile libHIK, by following instructions in the manual.

Finally, run the following command from the directory libHIK-2.06/libHIK/:

./generate-data caltech SIFT HIK
Two files will be generated: train1.txt (the training set, 674,856,198 bytes in Ubuntu), and test1.txt (the testing set, 890,787,408 bytes) in the current directory.

Note that it took more than 2 hours to finish, using all 6 cores of an Intel Xeon 5670 CPU. You need at least 4 GB of main memory.

Indoor 67

Download libHIK and make / compile it, following the instructions. (Refer to the above instructions for Caltech 101.)
Download the MIT indoor 67 dataset from http://web.mit.edu/torralba/www/indoor.html. The files needed are the following. First, download the database and unzip the archive to libHIK-2.06/Data/indoor/, such that you see a directory structure like libHIK-2.06/Data/indoor/airport_inside etc. Then, download the train/test split files and put both files in the libHIK-2.06/Data/ directory.
Finally, run the following command from the directory libHIK-2.06/libHIK/:
./generate-data indoor CENTRIST HIK
Two files will be generated: train1.txt (the training set, 1,078,466,011 bytes in Ubuntu), and test1.txt (the testing set, 267,255,905 bytes) in the current directory.

Note that it took more than 3 hours to finish, using all 6 cores of an Intel Xeon 5670 CPU. You need at least 8 GB of main memory.

ILSVRC 1000

This is the largest dataset used in the experiments of the CVPR 2012 paper. You may use the following steps to create the dataset.

Prerequisites

You need 32 GB of main memory to run on the complete dataset. Otherwise, you need to adjust the "maximum example per category" parameter during dataset creation to fit the data into your computer's memory. The CVPR 2012 paper used a maximum of 900 examples per category (24 GB of memory).
Make sure you have more than 30 GB of hard disk space.
I only tested it in Linux (Ubuntu 11.10, 64 bit).

Steps

The unprocessed data are available at http://www.image-net.org/challenges/LSVRC/2010/download-public; you need to first create a directory with the name ILSVRC/

Download the "Development kit" archive. Extract the file /devkit-1.0/data/meta.mat from this archive, and save it as ILSVRC/meta.mat
Download the "Visual words (sbow) for training" file (5.1GB) ILSVRC2010_feature_sbow_train.tar and extract all files in this archive to a directory named as ILSVRC/train_feature/
Run ls -1 > files.txt in the ILSVRC/train_feature directory; Open files.txt, find the line showing the string files.txt, and delete this line
Download the "Visual words (sbow) for test" file (613MB) ILSVRC2010_feature_sbow_test.tar and extract all files in this archive to a directory named as ILSVRC/test_feature/
Run ls -1 > files.txt in the ILSVRC/test_feature directory; Open files.txt, find the line showing the string files.txt, and delete this line
Download the "Ground truth for test data" file ILSVRC2010_test_ground_truth.txt and save it as ILSVRC/ILSVRC2010_test_ground_truth.txt
Download BOW.m from this page, and save it as ILSVRC/BOW.m
Run BOW.m in Matlab from the ILSVRC/ directory; two files will be generated as train.txt and test.txt in the ILSVRC/ directory
Download Convert.cpp from this page, and save it as ILSVRC/Convert.cpp
Run gcc -O3 Convert.cpp -o a.out from the ILSVRC/ directory (if gcc reports link errors, compile with g++ instead). Note that you may want to change line 13 const int TrainExamplePerClass = 900; to a different value that suits your computer
Run ./a.out from the ILSVRC/ directory
The final output consists of two files: train.svm (training set) and test.svm (test set) in the ILSVRC/ directory.