Revised PageRank

In the various references, Page et al present two solutions to these difficulties. The first, I call Revised PageRank (1)

In this first approach, we pick a vector E that is considered to be a vector of starting places (Page calls this "a source of rank" [2, p. 5]).

The revised equation for PageRank now becomes R = c A R + c E.

Unfortunately, at first glance solving this is no longer an eigenvalue/eigenvector problem.

Two observations and an algebraic trick!

Observation 1: Consider the approximation algorithm for eigenvectors (and eigenvalues) by taking computing powers Ak V, where V is an initial guess at the principal eigenvector. If both A and initial guess V have all non-negative values, then each refinement Ak V will continue to have all non-negative values — even if it is scaled by a positive scale factor (e.g., to get a unit vector).

Observation 2: If R (= (ri) is a vector with all non-negative entries, then
||R|| = |r1| + ... + |ri| + ... + |rN|
     = r1 + ... + ri + ... + rN
     = 1 * R, where 1 be row vector of all 1's.

Algebraic Trick: Assuming we take R to be a unit vector of only non-negative entries, then
R = c A R + c E
     = C A R + c E [1] ([1] is a one-by-one matrix)
     = c A R + c E * 1 * R
     = c(A + E * 1) R

Putting these pieces together, the observations suggest we may be able to assume that PageRank has only non-negative values (that seems consistent with the notion of ranking Web pages), and we can specify that the PageRank vector R has length 1. With these constraints,

Revised PageRank (1) is an eigenvector for the principal eigenvalue of the equation R = c(A + E*1) R

created: 10 February 2007
last revised: 5 March 2007
previous  next Valid HTML 4.01!