Iterative proportional fitting

From Wikipedia Quality
(Redirected from Matrix raking)
Jump to: navigation, search

The iterative proportional fitting procedure (IPFP, also known as biproportional fitting in statistics, RAS algorithm in economics and matrix ranking or matrix scaling in computer science) is an iterative algorithm for estimating cell values of a contingency table such that the marginal totals remain fixed and the estimated table decomposes into an outer product.

IPF is a method that has been "re-invented" many times, e.g. G.U. Yule in 1912 in relation to standardizing cross-tabulations and Kruithof in 1937 in relation to telephone traffic ("Kruithof’s double factor method") , and expanded upon by Deming and Stephan in 1940 (they proposed IPFP as an algorithm leading to a minimizer of the Pearson X-squared statistic, which it does not, and even failed to prove convergence), it has seen various extensions and related research. A rigorous proof of convergence by means of differential geometry is due to Fienberg (1970). He interpreted the family of contingency tables of constant crossproduct ratios as a particular (IJ − 1)-dimensional manifold of constant interaction and showed that the IPFP is a fixed-point iteration on that manifold. Nevertheless, he assumed strictly positive observations. Generalization to tables with zero entries is still considered a hard and only partly solved problem.

An exhaustive treatment of the algorithm and its mathematical foundations can be found in the book of Bishop et al. (1975). The first general proof of convergence, built on non-trivial measure theoretic theorems and entropy minimization, is due to Csiszár (1975). Relatively new results on convergence and error behavior have been published by Pukelsheim and Simeone (2009) . They proved simple necessary and sufficient conditions for the convergence of the IPFP for arbitrary two-way tables (i.e. tables with zero entries) by analysing an

 -error function.

Other general algorithms can be modified to yield the same limit as the IPFP, for instance the Newton–Raphson method and the EM algorithm. In most cases, IPFP is preferred due to its computational speed, numerical stability and algebraic simplicity.