Request Google's PageRank programmatically

By hartertobak

This article shows how Google's hashing algorithm works and how you can use it to check the PageRank of your sites.
Javascript, XML, C# 2.0, C#, Windows, .NET, .NET 2.0, ASP.NET, WebForms, Ajax, VS2005, VS, Dev

Posted: 16 Aug 2007
Updated: 16 Aug 2007

Download google-pagerank.zip - 515.4 KB
Screenshot - pagerank-cp.jpg

Introduction

Google's PageRank (PR) is a "link analysis algorithm measuring the relative importance" (PR @ Wikipedia). PR matters a lot less today than it did one or two years ago. Nevertheless, PR is the only ranking value that is visible to everyone, which makes it the only factor with some transparency. For those who don't know: a PR of 10 is the highest there is (apple.com, for example) and 0 is the lowest. Sites that don't even have a PR of 0 are either in a kind of sandbox (a special filter that punishes the site) or not indexed by Google.

Please forgive me for being lazy in my English lessons at school; I'm trying my best :)

Background

Google tries to measure the relevance of a domain or site by counting the links pointing to it. The weight of each link is in turn influenced by the number of links pointing to the linking site, so the calculation is an iterative process that needs a lot of computing power.
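
To make the iterative part concrete, here is a minimal sketch of the textbook PageRank iteration on a tiny link graph. It is only an illustration and certainly not Google's production code: the class name PageRankSketch, the damping factor of 0.85 and the handling of pages without outgoing links are assumptions that do not come from this article.

```csharp
using System;

static class PageRankSketch
{
    // links[i] lists the pages that page i links to.
    // damping is assumed to be 0.85 in the usage note; the article gives no value.
    public static double[] Compute(int[][] links, double damping, int iterations)
    {
        int n = links.Length;
        double[] rank = new double[n];
        for (int i = 0; i < n; i++) rank[i] = 1.0 / n;   // start with equal ranks

        for (int it = 0; it < iterations; it++)
        {
            double[] next = new double[n];
            for (int i = 0; i < n; i++) next[i] = (1.0 - damping) / n;

            for (int i = 0; i < n; i++)
            {
                if (links[i].Length == 0)
                {
                    // Assumption: a page without outgoing links spreads its rank evenly.
                    for (int j = 0; j < n; j++) next[j] += damping * rank[i] / n;
                }
                else
                {
                    // Each outgoing link passes on an equal share of the page's rank.
                    foreach (int target in links[i])
                        next[target] += damping * rank[i] / links[i].Length;
                }
            }
            rank = next;   // the new ranks feed the next round
        }
        return rank;
    }
}
```

For a graph where pages 0 and 1 both link to page 2 and page 2 links back to page 0, calling PageRankSketch.Compute(links, 0.85, 100) leaves page 2 with the highest rank and page 1, which nobody links to, with the lowest. Every round needs the ranks of the previous round, which is why the real calculation over the whole web is so expensive.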

Many webmasters believe that their ranking depends on the PR of their site. Today, this is not true. PR never was the only factor in Google's ranking, but it used to be the most important one. Right now, it isn't. And many people believe that Google is trying to get rid of PageRank, because link traders measure the value (in $) of a link by its PR, which is just stupid.

If you're interested in buying links, go by the following factors:

  • Link popularity (how often is the site you want to buy a link from linked to?)
  • Domain popularity (as above, but counting links from different domains)
  • IP popularity (as above, but counting links from different IPs)
  • Does the domain have an "authority status"?
  • Is the content of the domain relevant to your content?
  • Does the domain rank well for the keywords you want to rank well for?
  • How many outgoing links does the site have?

Because PR is the only factor we can actually look at, it's nice to be able to check it. And it's even nicer if we can do that on more than one Google datacenter at the same time.

Requesting the PR

The easy part is how the PR gets requested: it's just a simple HTTP request, with one little problem in it. Here's the request for www.codeproject.com:

    http://toolbarqueries.google.com/search?client=navclient-auto&hl=en&ch=6771535612&ie=UTF-8&oe=UTF-8&features=Rank&q=info:http%3A%2F%2Fwww.codeproject.com%2F

This looks easy, but there's this little

ch=6771535612

which is a hash value identifying the URL we want the PR for. This hashing algorithm was NOT developed by Google; it's the "perfect hashing" algorithm by Bob Jenkins.
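
As a rough sketch of how such a request URL can be put together once the ch value is known: the class and method names below (RequestUrlSketch, Build) are made up for illustration, and the parameter list is simply copied from the sample request above. The URL in the q parameter is percent-encoded, as in that sample.

```csharp
using System;

static class RequestUrlSketch
{
    // ch:  the checksum for the URL (how to compute it is the topic of this article)
    // url: the page whose PR is requested, e.g. "http://www.codeproject.com/"
    public static string Build(string ch, string url)
    {
        return "http://toolbarqueries.google.com/search?client=navclient-auto&hl=en"
            + "&ch=" + ch
            + "&ie=UTF-8&oe=UTF-8&features=Rank"
            // Only the URL after "info:" is escaped, matching the sample request.
            + "&q=info:" + Uri.EscapeDataString(url);
    }
}
```

RequestUrlSketch.Build("6771535612", "http://www.codeproject.com/") reproduces the request shown above.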

After some folks ported the code to PHP, I tried to port it to C#, and here we go.

First I need to mention that, after I had finished my own code, I found another port by Miroslav Stampar, which you can find here

To be honest, his port was better, so I modified my version. Here is the solution I like best:

Ported to C#

using System;
using System.Collections.Generic;
using System.Text;  
using System.Text.RegularExpressions;
using System.Net;
using System.IO;

namespace GooglePR
{
    class GetPR
    {
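        // Initial value of c for the checksum.
        private const UInt32 myConst = 0xE6359A60;

        // Mix step of the hash: scrambles the three 32-bit state words in place.
        private static void _Hashing(ref UInt32 a, ref UInt32 b, ref UInt32 c)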
        {
            a -= b; a -= c; a ^= c >> 13;
            b -= c; b -= a; b ^= a << 8;
            c -= a; c -= b; c ^= b >> 13;
            a -= b; a -= c; a ^= c >> 12;
            b -= c; b -= a; b ^= a << 16;
            c -= a; c -= b; c ^= b >> 5;
            a -= b; a -= c; a ^= c >> 3;
            b -= c; b -= a; b ^= a << 10;
            c -= a; c -= b; c ^= b >> 15;
        }
        // Computes the "ch" value for a URL: the hash of "info:<url>", prefixed with "6".
        public static string PerfectHash(string theURL)
        {
            string url = string.Format("info:{0}", theURL);

            int length = url.Length;
            
            UInt32 a, b;
            UInt32 c = myConst;

            int k = 0;
            int len = length;

            a = b = 0x9E3779B9;

            while (len >= 12)
            {
                a += (UInt32)(url[k + 0] + (url[k + 1] << 8) + (url[k + 2] << 16) + (url[k + 3] << 24));
                b += (UInt32)(url[k + 4] + (url[k + 5] << 8) + (url[k + 6] << 16) + (url[k + 7] << 24));
                c += (UInt32)(url[k + 8] + (url[k + 9] << 8) + (url[k + 10] << 16) + (url[k + 11] << 24));
                _Hashing(ref a, ref b, ref c);
                k += 12;
                len -= 12;
            }
            c += (UInt32)length;
            switch (len) 
            {
                case 11: 
                    c += (UInt32)(url[k + 10] << 24); 
                    goto case 10;
                case 10: 
                    c += (UInt32)(url[k + 9] << 16); 
                    goto case 9;
                case 9: 
                    c += (UInt32)(url[k + 8] << 8); 
                    goto case 8;
                case 8: 
                    b += (UInt32)(url[k + 7] << 24); 
                    goto case 7;
                case 7: 
                    b += (UInt32)(url[k + 6] << 16); 
                    goto case 6;
                case 6: 
                    b += (UInt32)(url[k + 5] << 8); 
                    goto case 5;
                case 5: 
                    b += (UInt32)(url[k + 4]); 
                    goto case 4;
                case 4: 
                    a += (UInt32)(url[k + 3] << 24); 
                    goto case 3;
                case 3: 
                    a += (UInt32)(url[k + 2] << 16); 
                    goto case 2;
                case 2: 
                    a += (UInt32)(url[k + 1] << 8); 
                    goto case 1;
                case 1: 
                    a += (UInt32)(url[k + 0]); 
                    break;
                default: 
                    break;
            }
            
            _Hashing(ref a, ref b, ref c);

            return string.Format("6{0}", c);
        }

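        // Requests the toolbar PR for myURL. Returns 0 for an empty response and -1 on any error.
        public static int MyPR(string myURL)
        {
            string strDomainHash = PerfectHash(myURL);
            // The hash is computed over the raw URL; the q parameter is sent URL-encoded.
            string myRequestURL = string.Format(
                "http://toolbarqueries.google.com/search?client=navclient-auto&ch={0}&features=Rank&q=info:{1}",
                strDomainHash, Uri.EscapeDataString(myURL));

            try
            {
                HttpWebRequest myRequest = (HttpWebRequest)WebRequest.Create(myRequestURL);
                string myResponse;
                using (WebResponse response = myRequest.GetResponse())
                using (StreamReader reader = new StreamReader(response.GetResponseStream()))
                {
                    myResponse = reader.ReadToEnd();
                }
                if (myResponse.Length == 0)
                    return 0;
                else
                    return int.Parse(Regex.Match(myResponse, "Rank_1:[0-9]:([0-9]+)").Groups[1].Value);
            }
            catch (Exception)
            {
                // Network errors and unparsable responses both end up here.
                return -1;
            }
        }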

    }
}
 
So many thanks to Miroslav, who did the better job :)
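
The least obvious lines in PerfectHash are the ones that add four characters at a time to a, b and c. The following stand-alone sketch shows only that packing step; the names PackSketch and Pack are hypothetical and not part of the download. One caveat to keep in mind: the code works on characters rather than bytes, so the packed words are only guaranteed to match a byte-based implementation for plain ASCII URLs.

```csharp
using System;

static class PackSketch
{
    // Combines four consecutive characters of s, starting at index k, into one
    // 32-bit word. The first character ends up in the lowest 8 bits, the
    // fourth in the highest 8 bits.
    public static uint Pack(string s, int k)
    {
        return (uint)(s[k] + (s[k + 1] << 8) + (s[k + 2] << 16) + (s[k + 3] << 24));
    }
}
```

For the first four characters of every hashed string, "info", this gives 0x6F666E69: 'i' is 0x69, 'n' is 0x6E, 'f' is 0x66 and 'o' is 0x6F, read from the lowest byte upwards.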

Example: an ASP.NET Version

Here you can find the ASP.NET version of a PR checker. It checks the PR of the given domain or site against different IPs, which means different Google datacenters. Because Google only updates the displayed PR (the toolbar PR) about every 3 months, this tool is a nice way to check whether an update is running: while the update runs, you'll get different PRs for the same page (provided its PR is rising or falling). Interesting, isn't it?

To check more than one datacenter, I just created a loop that dynamically replaces the

toolbarqueries.google.com

part of the request with a Google IP. A list of IPs can be found via Google :)
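
A minimal sketch of such a loop, assuming the request URL has already been built: the names DatacenterSketch and ForEachHost are invented for this example, and the addresses in the usage note are placeholders from the documentation range 192.0.2.0/24, not real Google datacenters.

```csharp
using System;
using System.Collections.Generic;

static class DatacenterSketch
{
    const string DefaultHost = "toolbarqueries.google.com";

    // Returns one request URL per host, with the default host name swapped out.
    public static List<string> ForEachHost(string requestUrl, IEnumerable<string> hosts)
    {
        List<string> result = new List<string>();
        foreach (string host in hosts)
        {
            // Replace only the host part between "//" and the first "/".
            result.Add(requestUrl.Replace("//" + DefaultHost + "/", "//" + host + "/"));
        }
        return result;
    }
}
```

Called with the hosts "192.0.2.10" and "192.0.2.20", this returns two URLs that differ from the original request only in the host part. Each of them can then be requested in turn and the results compared.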

If the tool shows "-1", the PR couldn't be retrieved for some reason.
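
How a response text turns into one of these values can be sketched as follows. The regular expression is the one from the code in this article; the sample response "Rank_1:1:7" in the usage note is an assumption derived from that expression, not a captured reply, and the names RankParseSketch and Parse are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class RankParseSketch
{
    // Returns the PR found in the response, 0 for an empty response,
    // and -1 if the response does not contain a recognisable rank.
    public static int Parse(string response)
    {
        if (string.IsNullOrEmpty(response)) return 0;

        Match m = Regex.Match(response, "Rank_1:[0-9]:([0-9]+)");
        int rank;
        if (m.Success && int.TryParse(m.Groups[1].Value, out rank)) return rank;

        return -1;
    }
}
```

With this sketch, "Rank_1:1:7" yields 7, an empty string yields 0, and anything else, such as an HTML error page, yields -1.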

History

0.2 Uploaded source code

I have to mention something first: if you're using the uploaded example, you're using the code by Miroslav Stampar. For various reasons my own code is bloated with other things I'm still working on. So don't worry about why the download differs from the code in this article.

0.1 Corrected a variable name (myURL to url), thanks to CP user ploufs :)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

hartertobak



Location: Germany

Copyright 2007 by hartertobak