Algorithms Overview

HyperSearchX implements 17 novel algorithms, each purpose-built for one stage of the intelligent search pipeline.

HyperFusion

Ranking
8-signal neural ranking

Combines BM25 keyword relevance, semantic similarity, temporal freshness, domain authority, evidence strength, diversity penalty, content depth, and source consensus into a single 0–1 score.
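A minimal sketch of how eight signals could be folded into one 0–1 score. It swaps the neural ranker for a fixed weighted average, which is an assumption made only for illustration. The `Signals` class, the `WEIGHTS` values, and the `fuse` function are hypothetical names, and every signal is assumed to be normalized to the 0–1 range beforehand.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Signals:
    """The eight signals named above, each assumed pre-normalized to [0, 1]."""
    bm25: float
    semantic_similarity: float
    freshness: float
    domain_authority: float
    evidence_strength: float
    diversity_penalty: float  # higher means more redundant with other results
    content_depth: float
    source_consensus: float


# Hypothetical weights chosen for illustration only.
WEIGHTS = {
    "bm25": 0.25,
    "semantic_similarity": 0.25,
    "freshness": 0.10,
    "domain_authority": 0.10,
    "evidence_strength": 0.10,
    "diversity_penalty": 0.05,
    "content_depth": 0.05,
    "source_consensus": 0.10,
}


def fuse(signals: Signals) -> float:
    """Weighted average of the signals; the penalty is inverted so less is better."""
    total = 0.0
    for field in fields(signals):
        value = min(1.0, max(0.0, getattr(signals, field.name)))  # clamp to [0, 1]
        if field.name == "diversity_penalty":
            value = 1.0 - value
        total += WEIGHTS[field.name] * value
    return total / sum(WEIGHTS.values())
```

Dividing by the weight sum keeps the result in the 0–1 range even if the weights are later changed so that they no longer sum to one.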

CEP

Extraction
Content Extraction Protocol — 5-layer cascade

CSS selectors → Readability → Headless JS → PDF extraction → Screenshot OCR. Layers are tried in order; when one fails to produce usable content, the next one takes over.
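The fallback behaviour can be sketched as an ordered list of extractor functions. This is an illustration under stated assumptions, not the actual CEP code: `css_layer` and `readability_layer` are simplified stand-ins for the first two layers, and the headless JS, PDF, and OCR layers are omitted because they need external tools.

```python
import re
from typing import Callable, List, Optional, Tuple

Extractor = Callable[[str], Optional[str]]


def css_layer(html: str) -> Optional[str]:
    # Stand-in for the CSS-selector layer: only handles an explicit <article> element.
    match = re.search(r"<article>(.*?)</article>", html, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def readability_layer(html: str) -> Optional[str]:
    # Stand-in for a readability pass: strip tags and keep whatever text remains.
    text = re.sub(r"<[^>]+>", " ", html)
    text = " ".join(text.split())
    return text or None


# Hypothetical layer order; the real cascade has five layers.
LAYERS: List[Tuple[str, Extractor]] = [
    ("css", css_layer),
    ("readability", readability_layer),
]


def run_cascade(raw: str, layers: List[Tuple[str, Extractor]]) -> Optional[Tuple[str, str]]:
    """Return (layer name, text) from the first layer that yields non-empty text."""
    for name, layer in layers:
        try:
            text = layer(raw)
        except Exception:
            continue  # a failing layer falls through to the next one
        if text and text.strip():
            return name, text
    return None
```

Returning the layer name alongside the text makes it easy to log which layer succeeded for a given page.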

QATBE

Tokens
Query-Aware Token-Budgeted Extraction

Ranks content segments by BM25 score, then uses a greedy knapsack algorithm to pack as much relevance as possible into a token budget. Powers all detail tiers.
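A minimal sketch of the idea: score each segment with Okapi BM25, then greedily take segments in order of score per token until the budget is spent. The function names are hypothetical, and token cost is approximated by a whitespace word count, which is an assumption; a real tokenizer would give different counts.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, segments: List[str], k1: float = 1.2, b: float = 0.75) -> List[float]:
    """Okapi BM25 score of each segment against the query (whitespace tokens)."""
    docs = [s.lower().split() for s in segments]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    doc_freq = Counter(term for d in docs for term in set(d))
    query_terms = set(query.lower().split())
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def pack(query: str, segments: List[str], budget: int) -> List[str]:
    """Greedy knapsack: best score-per-token first, skipping segments that do not fit."""
    scores = bm25_scores(query, segments)
    costs = [len(s.split()) for s in segments]  # assumed token cost
    candidates = [i for i in range(len(segments)) if costs[i] > 0 and scores[i] > 0]
    candidates.sort(key=lambda i: scores[i] / costs[i], reverse=True)
    chosen, used = [], 0
    for i in candidates:
        if used + costs[i] <= budget:
            chosen.append(i)
            used += costs[i]
    return [segments[i] for i in sorted(chosen)]  # restore document order
```

The greedy pass is fast but not guaranteed to find the optimal packing; it trades exactness for a single sort and one linear scan.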

SCS

Extraction
Semantic Content Segmentation

Classifies content into 8 typed segments: heading, paragraph, code, list, table, quote, metadata, other. Each type has a different token efficiency weight.
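A rough sketch of typed segmentation using the eight segment types listed above. The heuristics are assumptions that work on Markdown-like text blocks; they are not the classifier HyperSearchX uses, and the per-type token efficiency weights are left out because their values are not documented here.

```python
import re
from enum import Enum


class SegmentType(Enum):
    HEADING = "heading"
    PARAGRAPH = "paragraph"
    CODE = "code"
    LIST = "list"
    TABLE = "table"
    QUOTE = "quote"
    METADATA = "metadata"
    OTHER = "other"


def classify(block: str) -> SegmentType:
    """Assign one of the eight types to a text block using simple prefix rules."""
    text = block.strip()
    if not text:
        return SegmentType.OTHER
    if text.startswith("#"):
        return SegmentType.HEADING
    if text.startswith("```") or block.startswith("    "):
        return SegmentType.CODE
    if re.match(r"([-*+]|\d+\.)\s+", text):
        return SegmentType.LIST
    if text.startswith("|"):
        return SegmentType.TABLE
    if text.startswith(">"):
        return SegmentType.QUOTE
    # A short single-line "key: value" pair is treated as metadata.
    if len(text) < 80 and re.fullmatch(r"[A-Za-z][\w-]*:\s+\S.*", text):
        return SegmentType.METADATA
    if re.search(r"[A-Za-z]", text):
        return SegmentType.PARAGRAPH
    return SegmentType.OTHER
```

Rule order matters in this sketch: earlier checks win, so an indented block is classified as code before any of the later rules are tried.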

ABS

Search
Adaptive Backend Selector

Analyzes query intent, complexity, and freshness requirements to automatically select the optimal combination of search backends for each request.
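One way to picture the selection step is a small rule table over the three inputs named above. Everything in this sketch is hypothetical: the intent labels, the 0.7 complexity threshold, and the backend names (`general_web`, `news_index`, `scholarly_index`) are placeholders, not the backends HyperSearchX ships with.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueryProfile:
    intent: str        # hypothetical labels, e.g. "lookup", "news", "research"
    complexity: float  # 0.0 (simple) to 1.0 (complex)
    needs_fresh: bool  # True when recent results matter


def select_backends(profile: QueryProfile) -> List[str]:
    """Pick a combination of backends from intent, complexity, and freshness."""
    backends = ["general_web"]  # assumed default backend
    if profile.needs_fresh or profile.intent == "news":
        backends.append("news_index")
    if profile.intent == "research" or profile.complexity >= 0.7:
        backends.append("scholarly_index")
    return backends
```

A simple lookup query would go to the default backend only, while a complex, time-sensitive research query would fan out to all three.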

AMRS

Research
Adaptive Multi-Agent Research Swarm

Spawns parallel Tokio agents (planner, searcher, extractor, synthesizer) to process complex research queries with inter-agent coordination via async channels.
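The agent topology can be sketched with Python's `asyncio`, used here as a stand-in for Tokio tasks and channels. The four coroutines mirror the roles named above, but their bodies are placeholders: the planner just splits the question on "and", and the searcher fabricates `example.invalid` URLs instead of calling a backend. Note that `asyncio` runs the stages concurrently on one thread rather than in parallel.

```python
import asyncio
from typing import List

STOP = None  # sentinel that tells the next stage no more items are coming


async def planner(question: str, out: asyncio.Queue) -> None:
    # Placeholder planning: split the question into sub-questions on " and ".
    for part in question.split(" and "):
        await out.put(part.strip())
    await out.put(STOP)


async def searcher(inp: asyncio.Queue, out: asyncio.Queue) -> None:
    while (sub := await inp.get()) is not STOP:
        url = f"https://example.invalid/{sub.replace(' ', '-')}"  # fake result
        await out.put((sub, url))
    await out.put(STOP)


async def extractor(inp: asyncio.Queue, out: asyncio.Queue) -> None:
    while (item := await inp.get()) is not STOP:
        sub, url = item
        await out.put(f"{sub}: notes from {url}")
    await out.put(STOP)


async def synthesizer(inp: asyncio.Queue) -> List[str]:
    notes = []
    while (note := await inp.get()) is not STOP:
        notes.append(note)
    return notes


async def research(question: str) -> List[str]:
    """Run all four agents concurrently, linked by three queues."""
    plans, hits, extracts = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    results = await asyncio.gather(
        planner(question, plans),
        searcher(plans, hits),
        extractor(hits, extracts),
        synthesizer(extracts),
    )
    return results[-1]  # the synthesizer's collected notes
```

Calling `asyncio.run(research("rust async and sqlite wal"))` pushes two sub-questions through the chain; each stage starts working as soon as the previous one emits its first item.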

PIE

Intelligence
Persistent Intelligence Engine

Cross-session learning via SQLite. Tracks source trust scores and failure patterns and predicts likely queries to improve results over time.
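A minimal sketch of the trust-score part, using Python's built-in `sqlite3` module. The `source_trust` table, its columns, the 0.5 starting score, and the moving-average update are all assumptions for illustration; the actual PIE schema is not described here.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; pass a file path to persist across sessions."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS source_trust ("
        "domain TEXT PRIMARY KEY, trust REAL NOT NULL, "
        "fetches INTEGER NOT NULL, failures INTEGER NOT NULL)"
    )
    return conn


def record_fetch(conn: sqlite3.Connection, domain: str, ok: bool, alpha: float = 0.2) -> float:
    """Update a domain's trust with an exponential moving average and return it."""
    row = conn.execute(
        "SELECT trust, fetches, failures FROM source_trust WHERE domain = ?",
        (domain,),
    ).fetchone()
    trust, fetches, failures = row if row else (0.5, 0, 0)  # assumed neutral start
    trust = (1 - alpha) * trust + alpha * (1.0 if ok else 0.0)
    conn.execute(
        "INSERT OR REPLACE INTO source_trust (domain, trust, fetches, failures) "
        "VALUES (?, ?, ?, ?)",
        (domain, trust, fetches + 1, failures + (0 if ok else 1)),
    )
    conn.commit()
    return trust
```

With a file path instead of `":memory:"`, the scores survive process restarts, which is what makes the learning cross-session.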

QFD

Cache
Query Fingerprinting & Deduplication

SimHash-based fingerprinting identifies semantically equivalent queries and routes them to cached results, reducing redundant backend calls.

Algorithm pipeline

Algorithms execute in a defined pipeline order for each request type:

Search pipeline
1. QFD → Cache lookup (on a miss, continue with the steps below)
2. QCE → Query complexity analysis
3. QXE → Query expansion
4. CLQB → Cross-lingual query building
5. ABS → Backend selection
6. LP → Latency prediction
7. SPRE → Speculative pre-ranking
8. Multi-backend parallel fetch
9. CEP → Content extraction per source
10. SCS → Semantic segmentation
11. QATBE → Token-budgeted extraction
12. HyperFusion → 8-signal neural ranking
13. RCE → Result clustering + dedup
14. RDO → MMR diversity optimization
15. SSE → Smart snippet generation
16. EGB → Evidence graph building
17. PIE → Intelligence update
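The diversity step names MMR (maximal marginal relevance), which can be sketched in a few lines. The `mmr` function, its `lam` trade-off parameter, and the precomputed similarity matrix are assumptions for illustration; how RDO computes similarity between results is not described here.

```python
from typing import List


def mmr(relevance: List[float], similarity: List[List[float]], k: int, lam: float = 0.7) -> List[int]:
    """Select up to k result indices, balancing relevance against redundancy.

    relevance[i] is the ranking score of result i; similarity[i][j] is the
    pairwise similarity of results i and j, both assumed to lie in [0, 1].
    """
    selected: List[int] = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:

        def score(i: int) -> float:
            # Redundancy is the similarity to the closest already-selected result.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam=1.0` the selection reduces to plain relevance ordering; lower values push near-duplicate results further down in favour of distinct ones.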

Next steps