Module 01 — Web Crawler & Discovery

Icon: 🔍 Magnifying Glass   |   Colour: Blue

Overview

The crawler is the first module to run and forms the foundation for all subsequent tests. It discovers pages, endpoints, forms, API routes, and JavaScript files across the target application.

How It Works

  1. Well-known path probing — checks approximately 40 common paths (e.g. /robots.txt, /sitemap.xml, /.env, /wp-admin, /api/docs).
  2. Breadth-first crawl — follows links from the start URL up to the configured depth and page limits.
  3. Dual-mode fetching — crawls both anonymously and with the provided Bearer token (if supplied) to discover authenticated-only content.
  4. Form & link extraction — identifies all HTML forms, anchor links, and script tags.
  5. OPTIONS requests — probes discovered endpoints to check for available HTTP methods.
  6. JavaScript analysis — extracts API endpoint patterns from JavaScript files (e.g. fetch('/api/...')).
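The steps above can be sketched as a single breadth-first loop. This is a hypothetical illustration, not the module's actual implementation: `fetch` stands in for the real HTTP client and reads from a static page map so the example is self-contained, and the regexes are simplified versions of the link and JS-endpoint extraction.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

# Static page map standing in for the live target (assumption for the sketch).
PAGES = {
    "https://example.test/": '<a href="/about">About</a><script src="/app.js"></script>',
    "https://example.test/about": '<form action="/search"></form>',
    "https://example.test/app.js": "fetch('/api/users')",
}

LINK_RE = re.compile(r'(?:href|src|action)="([^"]+)"')   # step 4: forms, links, scripts
JS_API_RE = re.compile(r"fetch\('(/api/[^']+)'\)")       # step 6: endpoints inside JS

def fetch(url, token=None):
    # Dual-mode fetching (step 3): a real client would repeat the request
    # with an Authorization: Bearer <token> header when one is supplied.
    return PAGES.get(url, "")

def crawl(start, max_depth=2, max_pages=50, token=None):
    seen, endpoints = set(), set()
    queue = deque([(start, 0)])                 # breadth-first frontier (step 2)
    while queue and len(seen) < max_pages:      # configured page limit
        url, depth = queue.popleft()
        if url in seen or depth > max_depth:    # configured depth limit
            continue
        seen.add(url)
        body = fetch(url, token)
        endpoints.update(JS_API_RE.findall(body))
        for ref in LINK_RE.findall(body):
            nxt = urljoin(url, ref)
            # Stay on the target host; discovered endpoints would also
            # receive an OPTIONS probe (step 5) in the real module.
            if urlparse(nxt).netloc == urlparse(start).netloc:
                queue.append((nxt, depth + 1))
    return seen, endpoints

pages, apis = crawl("https://example.test/")
```

In this toy run the crawler visits the start page, follows `/about` and `/app.js`, queues the form target `/search`, and pulls `/api/users` out of the JavaScript file.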

Expected Findings

| Finding                   | Severity |
| ------------------------- | -------- |
| Crawl Summary             | Info     |
| Sensitive file discovered | Medium   |
| Sensitive HTML comment    | Low      |

Tips

Tip

Providing a Bearer token when creating the scan significantly improves crawl coverage on applications with authenticated areas.
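A minimal sketch of what "with the provided Bearer token" means in practice: each page is requested anonymously and, when a token was supplied at scan creation, again with an `Authorization` header, so links visible only to logged-in users surface. The helper name and User-Agent string are illustrative, not the module's real ones.

```python
def build_headers(token=None):
    # Hypothetical header builder for the crawler's two fetch modes.
    headers = {"User-Agent": "scanner-crawler/1.0"}  # placeholder UA
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers

anon = build_headers()                    # anonymous pass
authed = build_headers("example-token")   # authenticated pass
```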

Tip

Use Skip Paths to prevent the crawler from visiting URLs that might cause side effects (e.g. /logout, /delete-account).
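A Skip Paths check might look like the following prefix match, applied before a URL is queued. The function name and configuration format are assumptions for the sketch, not the scanner's actual API.

```python
from urllib.parse import urlparse

# Example Skip Paths configuration: never visit side-effecting routes.
SKIP_PATHS = ["/logout", "/delete-account"]

def should_skip(url, skip_paths=SKIP_PATHS):
    # Match on the URL path only, so query strings don't bypass the filter.
    path = urlparse(url).path
    return any(path.startswith(prefix) for prefix in skip_paths)
```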