Today I read a paper titled “Linear and Order Statistics Combiners for Pattern Classification”.
The abstract is:
Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers.
This chapter provides an analytical framework to quantify the improvements in classification results due to combining.
The results apply to both linear combiners and order statistics combiners.
We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary.
Combining classifiers in output space reduces this variance, and hence reduces the “added” error.
If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated.
Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified.
For order-statistics-based non-linear combiners, we derive expressions that indicate how much the median, the maximum, and in general the i-th order statistic can improve classifier performance.
The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space.
Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
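A few notes and sketches to make the abstract's claims concrete; everything below is my own illustration, not the paper's code or notation.

The factor-of-N claim follows from the paper's first-order result: if the added error is proportional to the variance of the decision boundary around the Bayes optimum, and averaging N unbiased, uncorrelated boundary estimates divides that variance by N, then the added error drops by the same factor. Here is a minimal Monte Carlo check of the variance step, under the simplifying assumption that the combined classifier's boundary is the average of the individual boundaries; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

bayes_boundary = 0.0   # true (Bayes-optimal) decision boundary
sigma = 0.5            # std. dev. of each classifier's boundary error
N = 10                 # number of classifiers in the ensemble
trials = 100_000

# Each classifier's boundary = Bayes boundary + independent zero-mean noise
# (the "unbiased, uncorrelated" case from the abstract).
boundaries = bayes_boundary + sigma * rng.standard_normal((trials, N))

# Simple averaging combiner: the ensemble's boundary estimate.
combined = boundaries.mean(axis=1)

print(f"single-classifier boundary variance: {boundaries[:, 0].var():.4f}")
print(f"ensemble boundary variance:          {combined.var():.4f}")
print(f"reduction factor (should be ~N={N}):  "
      f"{boundaries[:, 0].var() / combined.var():.1f}")
```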
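When the boundary errors are correlated, the same averaging argument runs through the standard identity for the variance of a mean of correlated variables, which is what drives the paper's claim that output correlations erode the gain. This is my reconstruction of the underlying identity, with σ² the common error variance and ρ the pairwise correlation:

```latex
% Variance of the average of N boundary errors b_i with common variance
% \sigma^2 and pairwise correlation \rho (standard identity, my notation):
\mathrm{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} b_i\right)
  = \frac{\sigma^{2}}{N}\bigl(1 + (N-1)\rho\bigr)
```

For ρ = 0 this recovers the σ²/N factor-of-N case; as ρ → 1 it tends to σ², i.e., fully correlated ensemble members buy no variance reduction at all.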
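For the order-statistics combiners, here is a minimal sketch, under assumptions of my own, of what combining in output space with an order statistic looks like: each classifier emits an estimate of the class posterior, and the ensemble takes the median, the maximum, or any other order statistic of those outputs. The outlier noise model and all parameters are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

true_posterior = 0.7   # assumed true P(class | x) at some point x
N = 15                 # ensemble size
trials = 50_000

# Classifier outputs: true posterior + small noise, with occasional gross
# outliers to mimic a few badly trained ensemble members.
outputs = true_posterior + 0.05 * rng.standard_normal((trials, N))
outlier_mask = rng.random((trials, N)) < 0.10
outputs[outlier_mask] = rng.random(outlier_mask.sum())  # garbage in [0, 1]

def order_statistic_combiner(out, i):
    """Combine by taking the i-th smallest output (1-indexed)."""
    return np.sort(out, axis=1)[:, i - 1]

mean_comb = outputs.mean(axis=1)                            # linear combiner
med_comb = order_statistic_combiner(outputs, (N + 1) // 2)  # median
max_comb = order_statistic_combiner(outputs, N)             # maximum

for name, est in [("mean", mean_comb), ("median", med_comb), ("max", max_comb)]:
    print(f"{name:>6}: MSE vs true posterior = "
          f"{np.mean((est - true_posterior) ** 2):.5f}")
```

In this toy setup the median combiner tracks the true posterior more closely than the mean once a few ensemble members are corrupted, which matches the intuition for why non-linear order-statistics combiners can beat simple averaging, while the maximum is biased upward and serves mainly to show how different order statistics trade off.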