Today I read a paper titled “Ranking Pages by Topology and Popularity within Web Sites”.
The abstract is:
We compare two link analysis ranking methods of web pages in a site.
The first, called Site Rank, is an adaptation of PageRank to the granularity of a web site and the second, called Popularity Rank, is based on the frequencies of user clicks on the outlinks in a page that are captured by navigation sessions of users through the web site.
We ran experiments on artificially created web sites of different sizes and on two real data sets, employing the relative entropy to compare the distributions of the two ranking methods.
For the real data sets we also employ a nonparametric measure, called Spearman’s footrule, which we use to compare the top-ten web pages ranked by the two methods.
Our main result is that the distributions of the Popularity Rank and Site Rank are surprisingly close to each other, implying that the topology of a web site is very instrumental in guiding users through the site.
Thus, in practice, the Site Rank provides a reasonable first order approximation of the aggregate behaviour of users within a web site given by the Popularity Rank.
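
As a rough illustration of the comparison the abstract describes, here is a minimal Python sketch. It is not the paper's implementation, and several details are assumptions the abstract does not confirm: Site Rank is taken to be a standard damped PageRank over the site's links with every outlink equally likely, Popularity Rank is taken to be the same computation with outlinks weighted by click counts, and the footrule is a simplified top-k variant. The function names, the damping value of 0.85, and the three-page toy site with its click counts are all hypothetical.

```python
import math


def stationary_rank(weights, damping=0.85, iterations=200):
    """Damped power iteration over a weighted link graph.

    weights[u][v] is the weight of the link u -> v. Equal weights give a
    PageRank-style score (assumed here to stand in for Site Rank); click
    counts give a usage-weighted score (assumed to stand in for
    Popularity Rank).
    """
    pages = list(weights)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Teleportation share, spread evenly over all pages.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for u in pages:
            total = sum(weights[u].values())
            if total == 0:
                # Page with no outlinks: spread its mass evenly.
                for v in pages:
                    new_rank[v] += damping * rank[u] / n
            else:
                for v, w in weights[u].items():
                    new_rank[v] += damping * rank[u] * w / total
        rank = new_rank
    return rank


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) of p from q, in bits."""
    return sum(p[k] * math.log2(p[k] / q[k]) for k in p if p[k] > 0)


def footrule_top_k(rank_a, rank_b, k=10):
    """Simplified Spearman's footrule: for the top-k pages under rank_a,
    sum the absolute differences between their positions in the two orderings."""
    order_a = sorted(rank_a, key=rank_a.get, reverse=True)
    order_b = sorted(rank_b, key=rank_b.get, reverse=True)
    pos_b = {page: i for i, page in enumerate(order_b)}
    return sum(abs(i - pos_b[page]) for i, page in enumerate(order_a[:k]))


# Hypothetical three-page site: link structure with equal weights ...
links = {
    "home": {"a": 1, "b": 1},
    "a": {"home": 1, "b": 1},
    "b": {"home": 1},
}
# ... and made-up click counts on the same links.
clicks = {
    "home": {"a": 30, "b": 10},
    "a": {"home": 5, "b": 15},
    "b": {"home": 20},
}

site_rank = stationary_rank(links)
popularity_rank = stationary_rank(clicks)

print("relative entropy (bits):", relative_entropy(popularity_rank, site_rank))
print("top-k footrule:", footrule_top_k(popularity_rank, site_rank))
```

A small relative entropy and a small footrule distance would correspond to the abstract's claim that the two rankings are close. The teleportation term keeps every score strictly positive, which is what makes the logarithm in the relative entropy well defined here.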