How a Browser Works: A Beginner-Friendly Guide to Browser Internals

Learn How Browsers Turn an Enter Press into a Visible Webpage


You type a website address into your browser, press Enter, and within seconds a complete webpage appears on your screen.

But what actually happens during those few seconds?

The browser is doing far more than simply “opening a website.” It is coordinating many components, each with a specific role, working together to turn raw code into something you can see and interact with.

What Happens After I Type a URL and Press Enter?

When you press Enter, the browser starts a chain of actions:

  1. It figures out where the website lives on the internet

  2. It requests the website’s files from a server

  3. It reads and understands the code it receives

  4. It turns that code into a visible page

All of this happens very quickly, but each step is important.
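Before the browser can work out where a website lives, it has to split the address into its parts. The sketch below uses Python's standard `urllib.parse` module to illustrate that idea; the helper name `describe_url` and the example address are assumptions made for this illustration, not how any particular browser does it.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the pieces needed before anything can be fetched."""
    parts = urlsplit(url)
    # When the URL names no port, fall back to the usual defaults:
    # 443 for https and 80 for http.
    default_port = 443 if parts.scheme == "https" else 80
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or default_port,
        "path": parts.path or "/",
    }


print(describe_url("https://example.com/docs/index.html"))
# {'scheme': 'https', 'host': 'example.com', 'port': 443, 'path': '/docs/index.html'}
```

The host is what gets looked up on the internet, and the path is what gets requested from the server once a connection exists.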

Main Parts of a Browser (High-Level Overview)

A browser is not one single block. It is a collection of components, each responsible for a specific task.

At a high level, the main parts include:

  • User Interface

  • Browser Engine

  • Rendering Engine

  • Networking

  • JavaScript Engine

  • Data Storage

User Interface:

The User Interface is the visible part of the browser. It is the part that the user interacts with directly.

This includes:

  • Address bar (URL bar)

  • Tabs

  • Back and forward buttons

  • Refresh button

This layer simply takes your actions (like typing a URL) and passes instructions to the browser’s internal components.

Browser Engine vs Rendering Engine

  • Browser Engine
    Acts as a coordinator. It connects the user interface with the rest of the browser, including the rendering engine.

  • Rendering Engine
    Does the actual work of converting HTML and CSS into a visible webpage.

You can think of it like this:

  • Browser Engine → Manager

  • Rendering Engine → Builder

Popular rendering engines include Blink (Chrome, Edge, Opera), WebKit (Safari and, because of Apple's platform rules, nearly all browsers on iOS), and Gecko (Firefox).

Networking: How the Browser Fetches HTML, CSS, and JS

The picture above summarizes the steps explained below.

Once the browser knows where the website is, it needs to fetch files from the server.

This is handled by the networking layer, which:

  • Sends requests over the internet

  • Receives responses from servers

  • Downloads HTML, CSS, JavaScript, images, and fonts

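These files arrive as raw bytes (plain text in the case of HTML, CSS, and JavaScript), not as a finished webpage. To make sense of that text, the browser has to parse it.

But what is parsing?

Parsing means breaking raw text into pieces and arranging them into a structure that carries meaning.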

Imagine you have a mathematical expression: (2 + 3) * 5. To solve it, you first break it down:

  • First, understand 2, +, 3.

  • Then calculate (2 + 3) = 5.

  • Then understand *, 5.

  • Finally, calculate 5 * 5 = 25.
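The breakdown above can be sketched in code. This is a toy parser that handles only whole numbers, `+`, `*`, and parentheses; the function names are made up for this illustration. It first splits the text into tokens, then builds a small tree, then evaluates the tree.

```python
import re


def tokenize(text: str) -> list:
    """Break raw text into tokens: numbers, operators, and parentheses."""
    return re.findall(r"\d+|[()+*]", text)


def parse_expression(tokens: list):
    """expression := term ('+' term)*"""
    node = parse_term(tokens)
    while tokens and tokens[0] == "+":
        tokens.pop(0)
        node = ("+", node, parse_term(tokens))
    return node


def parse_term(tokens: list):
    """term := factor ('*' factor)*"""
    node = parse_factor(tokens)
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        node = ("*", node, parse_factor(tokens))
    return node


def parse_factor(tokens: list):
    """factor := NUMBER | '(' expression ')'"""
    token = tokens.pop(0)
    if token == "(":
        node = parse_expression(tokens)
        tokens.pop(0)  # discard the closing ")"
        return node
    return int(token)


def evaluate(node):
    """Walk the tree from the leaves up to compute the result."""
    if isinstance(node, int):
        return node
    operator, left, right = node
    left_value, right_value = evaluate(left), evaluate(right)
    return left_value + right_value if operator == "+" else left_value * right_value


tree = parse_expression(tokenize("(2 + 3) * 5"))
print(tree)            # ('*', ('+', 2, 3), 5)
print(evaluate(tree))  # 25
```

The nested tuple is the "structured meaning": the addition sits inside the multiplication, so it is calculated first.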

The browser does something similar with web code:

Initial Fetching and Parsing

  1. DNS Lookup: When a user enters a URL, the browser first determines the server's IP address using the Domain Name System (DNS), an internet "phonebook".

  2. TCP/TLS Handshake: The browser then establishes a reliable connection to the server using the TCP three-way handshake, followed by a TLS handshake for secure (HTTPS) connections to encrypt data exchange.

  3. HTTP Request and HTML Response: Once connected, the browser sends an HTTP request for the main resource, usually an HTML file. The server responds with the HTML content.

  4. HTML Parsing and DOM Construction: The browser starts parsing the incoming HTML byte stream to build the Document Object Model (DOM), a tree-like representation of the page's structure. A preload scanner runs concurrently, looking ahead in the HTML for links to other resources (like external CSS and JS files) and initiating their downloads in the background to improve performance.
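Steps 1 to 3 can be sketched with Python's standard `socket` and `ssl` modules. This is a simplified illustration, not how a real browser is implemented: it assumes HTTPS on port 443, uses IPv4 only, speaks plain HTTP/1.1, and ignores redirects, caching, and compression. The function names `build_get_request` and `fetch_html` are hypothetical.

```python
import socket
import ssl


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close when it is done
        "",
        "",  # the blank line marks the end of the request headers
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_html(host: str, path: str = "/") -> bytes:
    """Walk through steps 1 to 3 for an HTTPS URL. Needs network access."""
    # Step 1: DNS lookup turns the host name into an IP address (IPv4 here).
    results = socket.getaddrinfo(host, 443, family=socket.AF_INET,
                                 type=socket.SOCK_STREAM)
    ip_address, port = results[0][4]
    # Step 2a: the TCP handshake happens inside create_connection.
    with socket.create_connection((ip_address, port), timeout=10) as tcp:
        # Step 2b: the TLS handshake happens inside wrap_socket.
        context = ssl.create_default_context()
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            # Step 3: send the request, then read until the server closes.
            tls.sendall(build_get_request(host, path))
            chunks = []
            while chunk := tls.recv(4096):
                chunks.append(chunk)
    return b"".join(chunks)


print(build_get_request("example.com", "/index.html").decode("ascii"))
```

Calling `fetch_html("example.com")` on a machine with internet access should return the raw response bytes: a status line, the headers, a blank line, and then the HTML that step 4 starts parsing.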

HTML Parsing and DOM Creation

As the rendering engine reads the HTML, it breaks the markup into understandable pieces (like words and sentences) and builds a tree-like structure called the DOM (Document Object Model).

Think of the DOM as a family tree of elements: each element sits under a parent and can have children of its own.
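To make the family-tree idea concrete, here is a minimal sketch built on Python's standard `html.parser` module. It is a toy, not a real browser parser: it assumes well-formed HTML in which every opening tag has a matching closing tag, and the `Node`, `DomBuilder`, and `outline` names are invented for this example.

```python
from html.parser import HTMLParser


class Node:
    """One element in the toy tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class DomBuilder(HTMLParser):
    """Toy DOM builder; assumes every opening tag has a matching close tag."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # A new element becomes a child of whatever element is open now.
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Closing a tag moves back up to the parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented lines, one per element."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = DomBuilder()
builder.feed("<html><body><h1>Hello</h1><p>Welcome</p></body></html>")
print("\n".join(outline(builder.root)))
```

The printed outline shows `html` under the document, `body` under `html`, and `h1` and `p` as siblings under `body`.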

CSS Parsing and CSSOM Creation

CSS is parsed too, separately from the HTML.

The browser figures out all the styling rules (colors, fonts, sizes) and builds another tree-like structure called the CSSOM (CSS Object Model).

  • CSSOM describes how elements should look

  • It includes colors, fonts, spacing, and layout rules

  • Like the DOM, it is structured and organized

At this stage:

  • DOM → structure

  • CSSOM → style
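A heavily simplified sketch of CSS parsing is shown below. It assumes flat rules with no comments, nesting, or at-rules, and it stores the result in a plain dictionary, whereas a real CSSOM is a richer tree of stylesheets, rules, and declarations. The `parse_css` name is hypothetical.

```python
import re


def parse_css(css: str) -> dict:
    """Toy parser: map each selector to its property/value declarations."""
    rules = {}
    # Each match is "selector { declarations }" with no nested braces.
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = rules.setdefault(selector.strip(), {})
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
    return rules


styles = parse_css("h1 { color: navy; font-size: 32px; } p { color: gray; }")
print(styles)
# {'h1': {'color': 'navy', 'font-size': '32px'}, 'p': {'color': 'gray'}}
```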

How DOM and CSSOM Come Together

The browser cannot render a page using HTML or CSS alone.

It combines:

  • DOM (what elements exist)

  • CSSOM (how they should look)

Together, they form the Render Tree.

The Render Tree includes:

  • Only visible elements

  • Final styles applied

  • Ready-to-draw components.
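The combination step can be sketched as follows. This toy version assumes the DOM is a nested dictionary and that styles have already been matched by tag name, which is far simpler than real selector matching; `build_render_tree` and the sample data are made up for the illustration.

```python
from typing import Optional


def build_render_tree(node: dict, styles: dict) -> Optional[dict]:
    """Return a styled copy of the node, or None if it is not rendered."""
    style = styles.get(node["tag"], {})
    if style.get("display") == "none":
        return None  # hidden elements and their descendants are left out
    children = [build_render_tree(child, styles)
                for child in node.get("children", [])]
    return {
        "tag": node["tag"],
        "style": style,
        "children": [child for child in children if child is not None],
    }


# Hypothetical DOM and matched styles.
dom = {"tag": "body", "children": [
    {"tag": "h1"},
    {"tag": "script"},
    {"tag": "p"},
]}
styles = {
    "h1": {"color": "navy"},
    "script": {"display": "none"},
    "p": {"color": "gray"},
}

tree = build_render_tree(dom, styles)
print([child["tag"] for child in tree["children"]])  # ['h1', 'p']
```

The `script` element exists in the DOM but does not appear in the result, because it is not visible.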

Layout (Reflow): Deciding Where Things Go …

Next comes layout, also called reflow. The browser calculates the exact size and position of every single element on the page relative to each other and the screen.

During layout, the browser calculates:

  • Element sizes

  • Positions on the screen

  • How elements relate to each other

This step answers questions like:

  • Where does this div appear?

  • How wide is this image?

  • How much space does this text take?
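As a rough illustration of what layout computes, the sketch below stacks block-level boxes from top to bottom. It assumes each box's height is already known and applies one uniform margin; real engines derive heights from content and handle many more layout modes. The `Box` class and `layout_blocks` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    height: int  # assumed known here; real engines derive it from content
    x: int = 0
    y: int = 0
    width: int = 0


def layout_blocks(boxes: list, viewport_width: int, margin: int = 0) -> None:
    """Stack boxes top to bottom, each spanning the available width."""
    cursor_y = margin
    for box in boxes:
        box.x = margin
        box.y = cursor_y
        box.width = viewport_width - 2 * margin
        cursor_y += box.height + margin  # the next box starts below this one


boxes = [Box("h1", height=40), Box("p", height=60), Box("img", height=200)]
layout_blocks(boxes, viewport_width=800, margin=10)
for box in boxes:
    print(box.name, box.x, box.y, box.width, box.height)
# h1 10 10 780 40
# p 10 60 780 60
# img 10 130 780 200
```

Changing the height of the first box moves every box after it, which is why a small change on a page can trigger layout work for many elements.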

Painting and Display

After layout is complete:

  • The browser paints pixels

  • Colors, borders, text, and images are drawn

  • The final result appears on your screen

It uses the Render Tree and the layout information to actually draw the pixels on your screen. It draws the backgrounds, text, images, borders and everything you see.
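Painting can be pictured as filling in a grid of pixels box by box. The sketch below uses a grid of text characters instead of real pixels, with made-up box positions standing in for layout results; the `paint` function is an illustration only.

```python
def paint(boxes, width, height):
    """Fill a character grid; later boxes are drawn on top of earlier ones."""
    canvas = [[" "] * width for _ in range(height)]
    for x, y, box_width, box_height, fill in boxes:
        # Clip each box to the canvas so nothing is drawn outside it.
        for row in range(y, min(y + box_height, height)):
            for col in range(x, min(x + box_width, width)):
                canvas[row][col] = fill
    return ["".join(row) for row in canvas]


# Hypothetical layout results: (x, y, width, height, fill character).
boxes = [(0, 0, 12, 1, "#"), (0, 2, 8, 2, "-")]
for line in paint(boxes, width=12, height=4):
    print(line)
```

Here the first box is a one-row "heading" across the full width, and the second is a two-row "paragraph" that covers only part of it.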

This is when the webpage becomes visible and interactive.