
HyperFusion Ranking

HyperFusion is HyperSearchX's core ranking algorithm: an 8-signal neural fusion system that combines lexical, semantic, temporal, and structural signals into a single relevance score between 0 and 1. It removes the need to tune ranking parameters by hand and adapts to query intent automatically.

The 8 signals

BM25: Lexical relevance

Classic Okapi BM25, a TF-IDF-style ranking function, tuned for web content. Scores keyword overlap between query and document with document-length normalization. Fast and highly reliable for exact-match queries.
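To make the length normalization concrete, here is a minimal sketch of textbook Okapi BM25 over pre-tokenized documents. The function name `bm25_scores`, the defaults `k1=1.2` and `b=0.75`, and the Lucene-style IDF are assumptions chosen for illustration; the web-content tuning HyperFusion uses, and how the raw score is normalized before fusion, are not described here.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Textbook Okapi BM25 with common default parameters (assumed values,
    not HyperFusion's tuned ones). Returns one raw score per document.
    """
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Lucene-style IDF, which is never negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents need
            # more occurrences of a term to reach the same score.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Raw BM25 scores are unbounded above, so a ranking system that fuses them with other signals has to rescale them first; that step is left out of the sketch.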

Semantic: Embedding similarity

Cosine similarity between query and document embeddings. Captures meaning beyond keywords, so it finds relevant content even when the query and the document use different vocabulary.
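As a small illustration, cosine similarity between two embedding vectors can be computed as below. The function name and the zero-vector convention are assumptions; the embedding model and any rescaling of the result into the 0–1 range are not specified here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed convention: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

Because the measure depends only on direction, two texts of very different length can still score close to 1 if their embeddings point the same way.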

Temporal: Freshness decay

Exponential decay function that rewards recent content. Decay rate adapts to query intent: current-events queries decay fast (hours); reference queries decay slowly (years).
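One way to picture intent-dependent decay is an exponential with a per-intent half-life. The sketch below is illustrative only: the intent keys and the half-life values (6 hours and 2 years) are hypothetical placeholders for the "hours" and "years" scales mentioned above, and the way HyperFusion actually adapts the rate is not documented here.

```python
import math

# Hypothetical half-lives, in hours. Only the rough scales
# ("hours" vs "years") come from the description above.
HALF_LIFE_HOURS = {
    "current_events": 6.0,
    "reference": 2 * 365 * 24.0,
}


def freshness(age_hours, intent):
    """Exponential decay: 1.0 for brand-new content, 0.5 after one half-life."""
    half_life = HALF_LIFE_HOURS[intent]
    return math.exp(-math.log(2) * age_hours / half_life)
```

With these placeholder values, a two-day-old page keeps almost all of its freshness for a reference query but almost none for a current-events query.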

Authority: Domain trust

Source trust score maintained by the Persistent Intelligence Engine (PIE). Combines domain reputation, historical accuracy, and community citation frequency.

Evidence: Cross-source corroboration

Evidence Graph Builder (EGB) signal. Scores how many independent sources corroborate the same facts. Higher corroboration → higher confidence. Penalizes isolated claims.

Diversity: MMR penalty

Maximal Marginal Relevance penalty applied to results too similar to already-selected results. Ensures the final result set covers multiple perspectives and sources.
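The standard greedy form of Maximal Marginal Relevance can be sketched as follows. The function name, the `lam` trade-off parameter and its default of 0.7 are assumptions; how HyperFusion turns this penalty into a per-document Diversity signal is not specified here.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedy MMR selection.

    relevance[i]     -- relevance score of candidate i
    similarity[i][j] -- pairwise similarity between candidates i and j
    lam              -- 1.0 means pure relevance; lower values penalize
                        redundancy more strongly (assumed default)
    Returns the indices of the selected candidates, in pick order.
    """
    selected = []
    remaining = list(range(len(relevance)))
    while remaining and len(selected) < k:

        def mmr(i):
            # Penalty: similarity to the closest already-selected result.
            max_sim = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * max_sim

        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Given two near-duplicate candidates with high relevance and a third, distinct one with lower relevance, the second pick goes to the distinct candidate once the redundancy penalty outweighs the relevance gap.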

Depth: Content richness

Rewards comprehensive content. Signals include content length, heading structure, code block presence, table richness, citation density, and reading level.

Consensus: Community agreement

Social signal from Reddit upvotes, HackerNews points, GitHub stars, and StackOverflow accepted answer status. Measures whether practitioners endorse the content.

Fusion formula

The 8 signals are combined using a learned weighted sum, where weights are adjusted per query-intent class:

score = w₁·BM25 + w₂·Semantic + w₃·Temporal + w₄·Authority
      + w₅·Evidence + w₆·Diversity + w₇·Depth + w₈·Consensus

where Σwᵢ = 1.0 and weights vary by QueryIntent
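A minimal sketch of the weighted sum, assuming every signal has already been normalized to the range 0–1 and that weights are non-negative (both are needed for the fused score to stay in 0–1, but neither is stated explicitly above). The names `SIGNALS` and `fuse` and the dictionary layout are hypothetical.

```python
SIGNALS = (
    "bm25", "semantic", "temporal", "authority",
    "evidence", "diversity", "depth", "consensus",
)


def fuse(signals, weights):
    """Weighted sum of the eight signals.

    signals -- signal name -> value, assumed normalized to [0, 1]
    weights -- signal name -> non-negative weight; must sum to 1.0
    """
    total = sum(weights[name] for name in SIGNALS)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weights[name] * signals[name] for name in SIGNALS)
```

Under those assumptions the result is a convex combination of the signals, so a document scoring 0.5 on every signal fuses to 0.5 regardless of which weight profile is active.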

Intent-based weight adaptation

Query intent    Dominant signals                Example
------------    ----------------                -------
Factual         BM25, Evidence, Authority       "What is Rust?"
CurrentEvents   Temporal, BM25, Consensus       "latest Rust release"
Code / HowTo    BM25, Depth, Authority          "how to use tokio::select"
Academic        Authority, Evidence, Depth      "transformer architecture paper"
Opinion         Consensus, Diversity, Semantic  "is Rust worth learning"
Comparison      Diversity, Depth, Evidence      "Rust vs Go performance"
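To illustrate what a per-intent weight profile might look like, the sketch below builds one by boosting the three dominant signals of an intent and normalizing so the weights sum to 1.0. This is a hand-built stand-in: the real weights are learned, and the `boost` factor, the `weights_for` helper, and the lowercase signal keys are all hypothetical.

```python
SIGNALS = (
    "bm25", "semantic", "temporal", "authority",
    "evidence", "diversity", "depth", "consensus",
)

# Dominant signals per query intent, taken from the table above.
DOMINANT = {
    "Factual": ("bm25", "evidence", "authority"),
    "CurrentEvents": ("temporal", "bm25", "consensus"),
    "Code / HowTo": ("bm25", "depth", "authority"),
    "Academic": ("authority", "evidence", "depth"),
    "Opinion": ("consensus", "diversity", "semantic"),
    "Comparison": ("diversity", "depth", "evidence"),
}


def weights_for(intent, boost=3.0):
    """Illustrative weight profile: dominant signals get `boost` times
    the base weight, then everything is normalized to sum to 1.0."""
    dominant = DOMINANT[intent]
    raw = {name: (boost if name in dominant else 1.0) for name in SIGNALS}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}
```

With `boost=3.0`, each dominant signal receives 3/14 of the total weight and each remaining signal 1/14.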

Performance characteristics

  • Ranking latency: under 2 ms for 50 results (pure CPU, no GPU required)
  • Signal computation is fully parallel via Rayon
  • The Semantic signal is computed lazily (only when BM25 is insufficient)
  • SPRE pre-ranking narrows the candidate set to the top 100 before full HyperFusion scoring
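The pre-ranking step in the last bullet follows a common two-stage pattern: a cheap score shortlists candidates, and the expensive scorer runs only on the shortlist. The sketch below shows that pattern in isolation; `cheap_score` and `full_score` are hypothetical callables standing in for SPRE and HyperFusion, and neither the parallelism nor the lazy Semantic evaluation is modeled.

```python
import heapq


def rank(candidates, cheap_score, full_score, prefilter=100, top_k=50):
    """Two-stage ranking.

    Stage 1 keeps the `prefilter` best candidates by the cheap score.
    Stage 2 applies the expensive scorer to that shortlist only and
    returns the `top_k` best, highest score first.
    """
    shortlist = heapq.nlargest(prefilter, candidates, key=cheap_score)
    return heapq.nlargest(top_k, shortlist, key=full_score)
```

The benefit is that the cost of the expensive scorer is bounded by the shortlist size rather than by the size of the full candidate pool.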

Next steps