Page Ranking

July 12, 2024

Introduction

PageRank is an algorithm used by Google Search to rank web pages in its search results. It estimates how important a page is from the number and quality of the links pointing to it.

The idea is simple: A page is important if other important pages link to it.
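One common way to write this idea down is the formula below. Treat it as a sketch rather than the exact form this post builds toward: the symbols are assumptions introduced here, where N is the total number of pages, d is a damping factor (often set to 0.85), M(p_i) is the set of pages that link to p_i, and L(p_j) is the number of outgoing links on p_j.

```latex
PR(p_i) = \frac{1 - d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}
```

Each page splits its own score evenly across its outgoing links, so a link from a high-scoring page with few outgoing links is worth more than a link from a low-scoring page with many.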

In this post, we’ll break down the logic and code of the PageRank algorithm with a simple Python implementation.

Graph Structure

We’ll represent the web using a directed graph, where:

  • Nodes represent pages.
  • Edges represent links from one page to another.

Example:

Let’s say we have four pages (A, B, C, D) with the following links:

  • A → B, C
  • B → C
  • C → A
  • D → C

This can be written as a Python dictionary that maps each page to the list of pages it links to:

graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"]
}
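
The dictionary stores outgoing links, but ranking a page depends on who links to it. As a minimal sketch, the snippet below inverts the example graph to list each page's incoming links and counts each page's outgoing links. The names incoming_links, inbound, and out_degree are hypothetical helpers introduced for illustration, not part of any library.

```python
# Example graph from above: each page maps to the pages it links to.
graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def incoming_links(graph):
    """Map each page to the list of pages that link to it (hypothetical helper)."""
    # Start every known page with an empty list so pages with no inbound links still appear.
    incoming = {page: [] for page in graph}
    for source, targets in graph.items():
        for target in targets:
            # setdefault also handles link targets that never appear as keys.
            incoming.setdefault(target, []).append(source)
    return incoming


inbound = incoming_links(graph)

# Number of outgoing links per page.
out_degree = {page: len(targets) for page, targets in graph.items()}

print(inbound)     # {'A': ['C'], 'B': ['A'], 'C': ['A', 'B', 'D'], 'D': []}
print(out_degree)  # {'A': 2, 'B': 1, 'C': 1, 'D': 1}
```

In this example C has three incoming links while D has none, so we would expect C to end up with the highest rank and D with the lowest.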