Request Google's Page-rank Programmatically

15 Aug 2007 | CPOL | 4 min read
This article shows how the hash behind Google's PageRank query works and how you can use it to check the PageRank of your sites.

Screenshot - pagerank-cp.jpg

Introduction

Google's PageRank (PR) is a "link analysis algorithm measuring the relative importance" (PR @wikipedia). PR matters a lot less today than it did one or two years ago. Nevertheless, it is the only ranking value that is visible to everyone, which makes it the only factor with some transparency. For those who don't know: a PR of 10 is the highest (apple.com has one, for example) and 0 is the lowest. Sites that don't even have a PR of 0 are either in a kind of sandbox (a special filter that punishes the site) or not indexed by Google.

Please forgive me for being lazy during English lessons @ school as I'm trying my best :)

Background

Google tries to measure the relevance of a domain/site by counting the links pointing to it. The weight of each link is in turn influenced by the number of links pointing to the linking site, so the calculation is an iterative process that needs a lot of computing power.
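
To make the "iterative process" a bit more tangible, here is a minimal sketch of the commonly published PageRank iteration. It is not Google's production code: the class name `PageRankSketch`, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration.

```csharp
using System;

static class PageRankSketch
{
    // links[i] lists the indexes of the pages that page i links to.
    // damping = 0.85 is the value usually quoted for PageRank (an assumption here).
    public static double[] Compute(int[][] links, int iterations = 50, double damping = 0.85)
    {
        int n = links.Length;
        double[] rank = new double[n];
        for (int i = 0; i < n; i++) rank[i] = 1.0 / n; // start with an even spread

        for (int it = 0; it < iterations; it++)
        {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by pages without outgoing links

            for (int i = 0; i < n; i++)
            {
                if (links[i].Length == 0) { dangling += rank[i]; continue; }
                // A page passes its rank on in equal shares to every page it links to.
                double share = rank[i] / links[i].Length;
                foreach (int target in links[i]) next[target] += share;
            }

            for (int i = 0; i < n; i++)
                next[i] = (1 - damping) / n + damping * (next[i] + dangling / n);

            rank = next;
        }
        return rank;
    }
}
```

For a tiny graph where pages 0 and 1 both link to page 2 and page 2 links back to page 0, `PageRankSketch.Compute(new[] { new[] { 2 }, new[] { 2 }, new[] { 0 } })` gives page 2 the highest value and page 1 (which nobody links to) the lowest.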

Many webmasters believe their ranking depends on the PR of their site. Today, this is not true. PR was never the only factor in Google's ranking, but it used to be the most important one. Right now, it isn't. Many people also believe that Google is trying to get rid of PageRank because link traders measure the value (in $) of a link by its PR, which is just stupid.

If you're interested in buying links, go with the following factors:

  • Link popularity (how often is the site you want to buy a link from linked to?)
  • Domain popularity (the same, but counting links from different domains)
  • IP popularity (the same, but counting links from different IPs)
  • Does the domain have an "authority status"?
  • Is the content of the domain relevant to your content?
  • Does the domain rank well for the keywords you want to rank well for?
  • How many outgoing links does the site have?

Because PR is the only factor we can actually look at, it's pretty nice to check it. And it's even nicer if we can do that on more than one Google data center at the same time.

Requesting the PR

Well, the easy part is the PR request itself: it's just a simple HTTP request, with one little problem in it. Here's the request for www.codeproject.com:

http://toolbarqueries.google.com/search?client=navclient-auto&hl=en&ch=6771535612&ie=UTF-8
    &oe=UTF-8&features=Rank&q=info:http%3A%2F%2Fwww.codeproject.com%2F

This seems easy, but there's this little parameter:

ch=6771535612

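which is a hash value identifying the URL we want to get the PR for. This hashing algorithm was not developed by Google; it is the hashing algorithm by Bob Jenkins that is often called a "perfect hash" (the mixing step in the C# code in this article matches his lookup2 hash).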
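
Judging from the C# port in this article, which returns `string.Format("6{0}", c)`, the `ch` value is simply the character `6` followed by the 32-bit hash written in decimal. The sketch below only illustrates that composition; the class and method names are hypothetical and not part of the original code.

```csharp
using System;

static class ChecksumSketch
{
    // Builds the "ch" value: the character '6' followed by the hash in decimal.
    public static string Compose(uint hash)
    {
        return "6" + hash;
    }

    // Splits a "ch" value back into the 32-bit hash it carries.
    public static uint ExtractHash(string ch)
    {
        if (string.IsNullOrEmpty(ch) || ch[0] != '6')
            throw new FormatException("Expected a value starting with '6'.");
        return uint.Parse(ch.Substring(1));
    }
}
```

Applied to the request above, `ChecksumSketch.ExtractHash("6771535612")` yields `771535612`, and `ChecksumSketch.Compose(771535612)` gives back `"6771535612"`.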

After some folks ported the code to PHP, I tried to port it to C#, and here we go.

But before that, I need to mention that, after I had finished my code, I found another port by Miroslav Stampar, which you can find here.

To be honest, his port is better, so I modified my version. Here is the solution I like best:

Ported to C#

C#
using System;
using System.Collections.Generic;
using System.Text;  
using System.Text.RegularExpressions;
using System.Net;
using System.IO;

namespace GooglePR
{
    class GetPR
    {
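        // Initial value of c, the seed of the hash.
        private const UInt32 myConst = 0xE6359A60;

        // Mixing step: scrambles the three 32-bit state values a, b and c.
        // UInt32 arithmetic wraps around on overflow (C# is unchecked by default).
        private static void _Hashing(ref UInt32 a, ref UInt32 b, ref UInt32 c)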
        {
            a -= b; a -= c; a ^= c >> 13;
            b -= c; b -= a; b ^= a << 8;
            c -= a; c -= b; c ^= b >> 13;
            a -= b; a -= c; a ^= c >> 12;
            b -= c; b -= a; b ^= a << 16;
            c -= a; c -= b; c ^= b >> 5;
            a -= b; a -= c; a ^= c >> 3;
            b -= c; b -= a; b ^= a << 10;
            c -= a; c -= b; c ^= b >> 15;
        }
        // Returns the "ch" checksum for a URL: '6' followed by the hash of "info:<URL>".
        public static string PerfectHash(string theURL)
        {
            string url = string.Format("info:{0}", theURL);

            int length = url.Length;
            
            UInt32 a, b;
            UInt32 c = myConst;

            int k = 0;
            int len = length;

            a = b = 0x9E3779B9;

            while (len >= 12)
            {
                a += (UInt32)(url[k + 0] + (url[k + 1] << 8) + 
                     (url[k + 2] << 16) + (url[k + 3] << 24));
                b += (UInt32)(url[k + 4] + (url[k + 5] << 8) + 
                     (url[k + 6] << 16) + (url[k + 7] << 24));
                c += (UInt32)(url[k + 8] + (url[k + 9] << 8) + 
                     (url[k + 10] << 16) + (url[k + 11] << 24));
                _Hashing(ref a, ref b, ref c);
                k += 12;
                len -= 12;
            }
            c += (UInt32)length;
            switch (len) 
            {
                case 11: 
                    c += (UInt32)(url[k + 10] << 24); 
                    goto case 10;
                case 10: 
                    c += (UInt32)(url[k + 9] << 16); 
                    goto case 9;
                case 9: 
                    c += (UInt32)(url[k + 8] << 8); 
                    goto case 8;
                case 8: 
                    b += (UInt32)(url[k + 7] << 24); 
                    goto case 7;
                case 7: 
                    b += (UInt32)(url[k + 6] << 16); 
                    goto case 6;
                case 6: 
                    b += (UInt32)(url[k + 5] << 8); 
                    goto case 5;
                case 5: 
                    b += (UInt32)(url[k + 4]); 
                    goto case 4;
                case 4: 
                    a += (UInt32)(url[k + 3] << 24); 
                    goto case 3;
                case 3: 
                    a += (UInt32)(url[k + 2] << 16); 
                    goto case 2;
                case 2: 
                    a += (UInt32)(url[k + 1] << 8); 
                    goto case 1;
                case 1: 
                    a += (UInt32)(url[k + 0]); 
                    break;
                default: 
                    break;
            }
            
            _Hashing(ref a, ref b, ref c);

            return string.Format("6{0}", c);
        }

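        // Returns the PR for myURL, 0 if the response is empty,
        // or -1 if the request or the parsing fails.
        public static int MyPR(string myURL)
        {
            string strDomainHash = PerfectHash(myURL);
            // The hash is computed over the raw URL, but inside the query string
            // the URL has to be escaped (otherwise characters like '&' break the request).
            string myRequestURL = string.Format("http://toolbarqueries.google.com/" + 
                   "search?client=navclient-auto&ch={0}&features=Rank&q=info:{1}", 
                   strDomainHash, Uri.EscapeDataString(myURL));

            try
            {
                HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(myRequestURL);
                string myResponse;
                // Dispose the response and the reader as soon as the body has been read.
                using (WebResponse response = myRequest.GetResponse())
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    myResponse = reader.ReadToEnd();
                }
                if (myResponse.Length == 0)
                    return 0;
                else
                    return int.Parse(Regex.Match(myResponse, 
                           "Rank_1:[0-9]:([0-9]+)").Groups[1].Value);
            }
            catch (Exception)
            {
                return -1;
            }
        }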

    }
}

So many thanks to Miroslav, who did the better job :)
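
The parsing step in `MyPR` can also be looked at on its own. The following sketch uses the same regular expression as the code above but avoids the exception when the body does not match. The sample body `Rank_1:1:6` is an assumption that was simply constructed to fit that pattern, and `ResponseSketch` is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

static class ResponseSketch
{
    // Returns the rank found in a response body, 0 for an empty body,
    // or -1 if the body does not match the expected pattern.
    public static int ParseRank(string body)
    {
        if (string.IsNullOrEmpty(body)) return 0;

        Match m = Regex.Match(body, "Rank_1:[0-9]:([0-9]+)");
        int rank;
        return m.Success && int.TryParse(m.Groups[1].Value, out rank) ? rank : -1;
    }
}
```

With these assumptions, `ResponseSketch.ParseRank("Rank_1:1:6")` returns `6`, an empty string returns `0`, and any other text returns `-1`, which mirrors the return values of `MyPR`.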

Example: An ASP.NET Version

Here you can find the ASP.NET version of a PR checker. It checks the PR of a domain/site against different IPs, which means different Google data centers. Because Google only updates the publicly shown PR (Toolbar PR) about every 3 months, this tool is a nice way to check whether an update is running: while the update runs, you'll get different PRs for the same page (if the PR rises or falls). Interesting, isn't it?

To check more than one data center, I just created a loop that dynamically replaces the:

toolbarqueries.google.com

part of the request with a Google IP. A list of IPs can be found via Google :)
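
The loop described here might look roughly like the sketch below. It only builds the request URLs; the host list is supplied by the caller, and the address `192.0.2.1` in the usage note is a documentation placeholder, not a real Google data center. `DataCenterSketch` and `BuildRequestUrls` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class DataCenterSketch
{
    // Builds one request URL per host; only the host part of the request changes.
    public static List<string> BuildRequestUrls(IEnumerable<string> hosts, string hash, string url)
    {
        var result = new List<string>();
        foreach (string host in hosts)
        {
            result.Add(string.Format(
                "http://{0}/search?client=navclient-auto&ch={1}&features=Rank&q=info:{2}",
                host, hash, Uri.EscapeDataString(url)));
        }
        return result;
    }
}
```

Calling `DataCenterSketch.BuildRequestUrls(new[] { "toolbarqueries.google.com", "192.0.2.1" }, "6771535612", "http://www.codeproject.com/")` returns two URLs that differ only in their host. Each of them could then be requested in the same way `MyPR` does.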

If the tool shows "-1", the PR couldn't be retrieved, for whatever reason.
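
Once the values from several data centers are collected, spotting a running update comes down to checking whether they disagree. A minimal sketch, assuming the results are held as a list of integers in which `-1` marks a failed request (the name `UpdateSketch` is hypothetical):

```csharp
using System.Collections.Generic;
using System.Linq;

static class UpdateSketch
{
    // True when at least two data centers returned different valid ranks.
    // Results of -1 (failed requests) are ignored.
    public static bool LooksLikeUpdate(IEnumerable<int> ranks)
    {
        return ranks.Where(r => r >= 0).Distinct().Count() > 1;
    }
}
```

For example, the results `5, 6, 5` would count as an update in progress, while `5, 5, -1` would not.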

History

  • 0.2 - Uploaded source code. One thing to mention first: if you're using the uploaded example, you are using the code by Miroslav Stampar. For some reason, my code got bloated with other things, and I'm still working on it. So don't worry that the code differs from the code in this article.
  • 0.1 - Correction of a variable name (myURL to url) - thx to CP-user ploufs :)

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Germany

Comments and Discussions

 
Question: No longer working
Adrenaline9631-Mar-18 2:25
Answer: Making this work in 2013
pravin4work3-Mar-13 7:09
General: Re: Making this work in 2013
Abhishek Pant4-Sep-14 4:15
Question: Code to get Page Rank based on certain keyword or search term
harish kkumar12-May-12 0:57
Bug: This is failing when there are special characters in the url like &
MonsterMMORPG17-Jan-12 10:40
Bug: Page Rank is not working
Mohsan Hassan26-Nov-11 8:13
General: Google limit
deepspacecoder16-May-11 6:17
General: Very nice (5)
Aron Weiler6-Mar-11 8:51
General: Thanks very much
jimbo809824-Sep-10 2:23
General: Google is disabling the IP's
empee18-Feb-10 20:06
General: Re: Google is disabling the IP's
jimbo809824-Sep-10 2:24
General: Java
Kris Reid27-Jan-10 7:41
General: Re: Java
Vamsi Krishna Bandi9-Oct-10 20:32
General: hi nice work
abu subh29-Mar-09 10:25
Question: What about keywords?
shaychen4-Mar-09 1:20
Answer: Re: What about keywords?
hartertobak4-Mar-09 1:49
Well, PageRank has nothing to do with keywords; it's just about URLs.
The service you've mentioned is a Google SERP scraper that displays the PageRank of the first results for a specific keyword (search term).
To achieve the same functionality, you need to parse Google's search engine result pages (SERPs) and get the PageRank for every URL in the result set.


General: Re: What about keywords?
shaychen4-Mar-09 2:32
General: Re: What about keywords?
hartertobak4-Mar-09 4:28
General: Re: What about keywords?
shaychen4-Mar-09 5:52
General: 403 on Server 2003
joehinder5-Jan-09 7:15
General: Great source
jeffwow2-Jul-08 13:13
General: Code don't work and error on it (int length = url.Length;)
ploufs16-Aug-07 13:58
Answer: Re: Code don't work and error on it (int length = url.Length;)
hartertobak16-Aug-07 19:19
General: Nice. But one problem
Irfan Faruki16-Aug-07 4:51
General: Re: Nice. But one problem
hartertobak16-Aug-07 9:27
