Intent Routing

LLM-powered agentic routing of queries to the right connectors.

Intent Routing is the query distribution layer. It uses LLM-powered agents to work out what a query is asking for and to route it to the connectors most likely to hold the answer.

Overview

When a user asks "What did the team decide about authentication?", the system needs to determine:

  1. Which connectors to query (Slack, Notion, GitHub?)
  2. What document types to prioritize (messages, pages, PRs?)
  3. How to interpret the intent (information seeking, action planning?)

graph LR
    Q[Query] --> IA[Intent Agent]
    IA --> RA[Routing Agent]
    
    RA --> C1[Slack]
    RA --> C2[Notion]
    RA --> C3[GitHub]
    
    C1 --> AGG[Aggregator]
    C2 --> AGG
    C3 --> AGG
    
    AGG --> RESULTS[Ranked Results]
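
The fan-out and aggregation steps in the diagram can be sketched with `asyncio`. This is a minimal illustration, not Metalogue's implementation: the `Connector` signature, the `fan_out` helper, and the stub connectors are all hypothetical names introduced here.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical connector signature: takes a query, returns (doc_id, score) pairs.
Connector = Callable[[str], Awaitable[list[tuple[str, float]]]]


async def fan_out(query: str, connectors: dict[str, Connector]) -> list[tuple[str, str, float]]:
    """Query every selected connector concurrently and merge the results."""
    names = list(connectors)
    batches = await asyncio.gather(*(connectors[name](query) for name in names))
    merged = [
        (name, doc_id, score)
        for name, batch in zip(names, batches)
        for doc_id, score in batch
    ]
    # Aggregator step: highest score first.
    return sorted(merged, key=lambda row: row[2], reverse=True)


# Stub connectors standing in for real Slack and Notion integrations.
async def fake_slack(query: str) -> list[tuple[str, float]]:
    return [("slack-msg-1", 0.91), ("slack-msg-2", 0.40)]


async def fake_notion(query: str) -> list[tuple[str, float]]:
    return [("notion-page-1", 0.75)]


results = asyncio.run(
    fan_out("auth decision", {"slack": fake_slack, "notion": fake_notion})
)
```

Running the connectors concurrently keeps total latency close to that of the slowest connector rather than the sum of all of them.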

The 9-Agent Suite

Metalogue uses a suite of specialized agents for intelligent routing:

Agent             Purpose
RoutingAgent      Selects which connectors to query
IntentAgent       Classifies user intent
ActionAgent       Plans multi-step actions
SearchAgent       Executes semantic search
RankingAgent      Applies intent ranking
HypothesisAgent   Manages intent lifecycle
EntityAgent       Extracts named entities
TranslationAgent  Handles Vec2Vec translation
ComplianceAgent   Checks GDPR/policy compliance

Intent Classification

Intent Types

from enum import Enum


class IntentType(str, Enum):
    INFORMATION_SEEKING = "information_seeking"   # "What is X?"
    DECISION_SUPPORT = "decision_support"         # "What should we do about X?"
    ACTION_PLANNING = "action_planning"           # "I need to do X"
    STATUS_CHECK = "status_check"                 # "What's the status of X?"
    HISTORICAL_ANALYSIS = "historical_analysis"   # "What happened with X?"
    COMPARISON = "comparison"                     # "Compare X and Y"
    SYNTHESIS = "synthesis"                       # "Summarize X"

Example Classification

{
  "query": "What was the decision on authentication architecture last week?",
  "intent": {
    "type": "historical_analysis",
    "entities": ["authentication architecture", "last week"],
    "temporal_hint": "recent",
    "confidence": 0.94
  }
}
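
A consumer of this payload would likely want to validate it before acting on it. The sketch below assumes the JSON shape shown above; the `Classification` dataclass and `parse_classification` helper are hypothetical names, and the 0 to 1 confidence range is an assumption inferred from the example value.

```python
import json
from dataclasses import dataclass
from enum import Enum


class IntentType(str, Enum):
    # Values copied from the Intent Types listing.
    INFORMATION_SEEKING = "information_seeking"
    DECISION_SUPPORT = "decision_support"
    ACTION_PLANNING = "action_planning"
    STATUS_CHECK = "status_check"
    HISTORICAL_ANALYSIS = "historical_analysis"
    COMPARISON = "comparison"
    SYNTHESIS = "synthesis"


@dataclass
class Classification:
    query: str
    intent_type: IntentType
    entities: list[str]
    confidence: float


def parse_classification(raw: str) -> Classification:
    """Parse a classification payload; raises ValueError on bad input."""
    data = json.loads(raw)
    intent = data["intent"]
    confidence = float(intent["confidence"])
    if not 0.0 <= confidence <= 1.0:  # assumed valid range
        raise ValueError(f"confidence out of range: {confidence}")
    return Classification(
        query=data["query"],
        intent_type=IntentType(intent["type"]),  # ValueError on unknown type
        entities=list(intent.get("entities", [])),
        confidence=confidence,
    )
```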

Routing Logic

Capability Discovery

The router leverages auto-discovered connector capabilities:

from services.connectors.capabilities import CapabilityDiscovery

# Find connectors for an intent (await requires an async function or running event loop)
matches = await CapabilityDiscovery.find_for_intent(
    "What was the decision on authentication?"
)
# Returns: ["slack", "notion", "confluence"]

Routing Decision

{
  "query": "What was the decision on authentication?",
  "routing": {
    "primary_connectors": ["slack", "notion"],
    "secondary_connectors": ["github", "confluence"],
    "confidence_scores": {
      "slack": 0.92,
      "notion": 0.88,
      "github": 0.65,
      "confluence": 0.60
    },
    "reasoning": "Authentication decisions are typically discussed in Slack and documented in Notion. GitHub/Confluence may have implementation details."
  }
}
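
One way to derive the primary and secondary lists from the confidence scores is a pair of cutoffs. The source does not state how the split is made, so the `tier_connectors` helper and its 0.8 and 0.5 cutoffs are assumptions chosen only to reproduce the example above.

```python
def tier_connectors(
    scores: dict[str, float],
    primary_min: float = 0.8,    # assumed cutoff, not from the source
    secondary_min: float = 0.5,  # assumed cutoff, not from the source
) -> dict[str, list[str]]:
    """Split connectors into primary and secondary tiers by confidence."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {
        "primary_connectors": [n for n, s in ranked if s >= primary_min],
        "secondary_connectors": [n for n, s in ranked if secondary_min <= s < primary_min],
    }


tiers = tier_connectors(
    {"slack": 0.92, "notion": 0.88, "github": 0.65, "confluence": 0.60}
)
```

Connectors scoring below the lower cutoff are dropped from both lists, so they are never queried.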

Intent Ranking

Results are ranked using five-dimensional (5D) intent vectors.

The 5 Dimensions

from dataclasses import dataclass


@dataclass
class PsychographicVector:
    urgency: float              # 0-1: Time sensitivity
    emotional_weight: float     # 0-1: Emotional vs rational
    action_likelihood: float    # 0-1: Browsing vs acting
    social_dimension: float     # 0-1: Solo vs collaborative
    context_sensitivity: float  # 0-1: Context dependence

Ranking Formula

$$score_{final} = \alpha \cdot sim + \beta \cdot \phi + \gamma \cdot \tau$$

Where:

  • $sim$ = semantic similarity
  • $\phi$ = intent alignment
  • $\tau$ = temporal relevance
  • $\alpha, \beta, \gamma$ = learned weights
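
The formula is a weighted sum, which the sketch below makes concrete. The weights are learned in the real system; the values in `RankingWeights` are placeholders picked for illustration, and both `RankingWeights` and `final_score` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankingWeights:
    # Placeholder values; the source says these weights are learned.
    alpha: float = 0.5  # weight on semantic similarity
    beta: float = 0.3   # weight on intent alignment
    gamma: float = 0.2  # weight on temporal relevance


def final_score(sim: float, phi: float, tau: float,
                weights: RankingWeights = RankingWeights()) -> float:
    """score_final = alpha * sim + beta * phi + gamma * tau"""
    return weights.alpha * sim + weights.beta * phi + weights.gamma * tau


# Each candidate: (doc_id, sim, phi, tau).
candidates = [("doc-a", 0.9, 0.2, 0.1), ("doc-b", 0.7, 0.9, 0.8)]
ranked = sorted(candidates, key=lambda c: final_score(*c[1:]), reverse=True)
```

With these placeholder weights, `doc-b` outranks `doc-a` even though its semantic similarity is lower, because its intent alignment and temporal relevance are much higher.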

Hypothesis State Machine

The system tracks intent evolution over time:

stateDiagram-v2
    [*] --> Transient
    Transient --> Probabilistic: θ₁ (0.4)
    Probabilistic --> Validated: θ₂ (0.7)
    Validated --> Actionable: θ₃ (0.85)
    Actionable --> Collapsed: Completed
    Transient --> Collapsed: Expired

States

State          Description                   Threshold
Transient      Initial signal, may be noise  Entry
Probabilistic  Reinforced by patterns        θ₁ = 0.4
Validated      Confirmed true intent         θ₂ = 0.7
Actionable     Ready to surface              θ₃ = 0.85
Collapsed      Completed or expired          Exit

Use Case

User: "quarterly report" → Transient
User: Opens Q3 folder → Probabilistic (0.5)
User: Searches "Q3 revenue" → Validated (0.75)
System: "Would you like me to compile the Q3 report?" → Actionable (0.9)
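
The transitions in the diagram can be sketched as a small function. This is an illustration under two assumptions not stated in the source: a hypothesis is promoted at most one state per update, and it is never demoted. The `HypothesisState` enum and `advance` function are hypothetical names.

```python
from enum import Enum


class HypothesisState(str, Enum):
    TRANSIENT = "transient"
    PROBABILISTIC = "probabilistic"
    VALIDATED = "validated"
    ACTIONABLE = "actionable"
    COLLAPSED = "collapsed"


# state -> (next state, threshold needed to reach it), from the state table.
PROMOTIONS = {
    HypothesisState.TRANSIENT: (HypothesisState.PROBABILISTIC, 0.4),    # theta_1
    HypothesisState.PROBABILISTIC: (HypothesisState.VALIDATED, 0.7),    # theta_2
    HypothesisState.VALIDATED: (HypothesisState.ACTIONABLE, 0.85),      # theta_3
}


def advance(state: HypothesisState, confidence: float, *,
            completed: bool = False, expired: bool = False) -> HypothesisState:
    """Apply one update and return the resulting state."""
    if state is HypothesisState.ACTIONABLE and completed:
        return HypothesisState.COLLAPSED
    if state is HypothesisState.TRANSIENT and expired:
        return HypothesisState.COLLAPSED
    if state in PROMOTIONS:
        next_state, theta = PROMOTIONS[state]
        if confidence >= theta:
            return next_state  # assumption: one step per update
    return state


# Replay the use case: confidence 0.5, then 0.75, then 0.9.
state = HypothesisState.TRANSIENT
for confidence in (0.5, 0.75, 0.9):
    state = advance(state, confidence)
```

After the three updates, `state` is `HypothesisState.ACTIONABLE`, matching the use case.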

Entropy Monitoring

The system tracks cognitive load to prevent information overload:

Shannon Entropy

$$H = -\sum_{i} p_i \log_2 p_i$$

Where $p_i$ is the probability of query pattern $i$.
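
As a worked example, the entropy of a window of observed query patterns can be computed from their relative frequencies. How Metalogue defines a "query pattern" is not specified here, so the string labels below are purely illustrative and `shannon_entropy` is a hypothetical helper.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_entropy(patterns: Iterable[Hashable]) -> float:
    """Return H = -sum(p_i * log2(p_i)) over observed pattern frequencies."""
    counts = Counter(patterns)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no observations: treat entropy as zero
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


focused = shannon_entropy(["auth", "auth", "auth", "auth"])         # 0.0 bits
split = shannon_entropy(["auth", "auth", "billing", "billing"])     # 1.0 bit
scattered = shannon_entropy(["auth", "billing", "deploy", "hr"])    # 2.0 bits
```

A user who repeats one pattern has zero entropy, while a user spread evenly over four patterns reaches 2 bits.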

Entropy Levels

Level      Score    Behavior
Very Low   < 0.5    Surface everything
Low        0.5-1.0  Be permissive
Normal     1.0-2.0  Be selective
High       2.0-3.0  Filter aggressively
Very High  3.0-4.0  Critical items only
Critical   > 4.0    Damping active
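
The table maps directly to a lookup function. The table does not say which band owns a boundary value, so this sketch assumes each lower bound is inclusive and that exactly 4.0 still counts as Very High; `entropy_level` is a hypothetical helper.

```python
# (exclusive upper bound, level, behavior) for the lower bands of the table.
_BANDS = [
    (0.5, "Very Low", "Surface everything"),
    (1.0, "Low", "Be permissive"),
    (2.0, "Normal", "Be selective"),
    (3.0, "High", "Filter aggressively"),
]


def entropy_level(score: float) -> tuple[str, str]:
    """Return (level, behavior) for an entropy score."""
    for upper, level, behavior in _BANDS:
        if score < upper:
            return level, behavior
    if score <= 4.0:  # assumption: 4.0 itself is still Very High
        return "Very High", "Critical items only"
    return "Critical", "Damping active"
```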

Homeostasis Damping

When entropy exceeds the damping threshold (default: 2.0), the system:

  1. Reduces result surfacing globally
  2. Coordinates damping across nodes
  3. Prioritizes critical items only
  4. Releases damping once entropy drops below 1.6
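
Engaging above 2.0 and releasing below 1.6 is a hysteresis band, which keeps damping from flapping when entropy hovers near the threshold. A minimal sketch of that switch follows; the `DampingController` class is hypothetical and covers only the on/off decision, not the cross-node coordination.

```python
from dataclasses import dataclass


@dataclass
class DampingController:
    engage_above: float = 2.0   # matches ENTROPY_DAMPING_THRESHOLD default
    release_below: float = 1.6  # release point stated in the list above
    active: bool = False

    def update(self, entropy: float) -> bool:
        """Feed one entropy reading; return whether damping is active."""
        if not self.active and entropy > self.engage_above:
            self.active = True
        elif self.active and entropy < self.release_below:
            self.active = False
        return self.active


controller = DampingController()
readings = [1.5, 2.1, 1.8, 1.7, 1.5, 1.9]
history = [controller.update(h) for h in readings]
```

Damping turns on at the 2.1 reading and stays on through 1.8 and 1.7, because those values are still above the release point. It turns off only at 1.5, and the final 1.9 reading does not re-engage it.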

API Endpoints

Route Query

POST /v1/agentic/route
Content-Type: application/json

{
  "query": "What was the decision on authentication?",
  "user_id": "user-uuid"
}

Response:

{
  "connectors": ["slack", "notion"],
  "confidence_scores": {
    "slack": 0.92,
    "notion": 0.88
  },
  "intent": "historical_analysis",
  "reasoning": "Authentication decisions typically in Slack discussions..."
}
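
A client call to this endpoint might look like the sketch below, which uses only the Python standard library. The base URL is a placeholder, the helper names are hypothetical, and any authentication headers your deployment requires are not shown because the source does not describe them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # placeholder; replace with your deployment


def build_route_request(query: str, user_id: str,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /v1/agentic/route request without sending it."""
    body = json.dumps({"query": query, "user_id": user_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/agentic/route",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def route_query(query: str, user_id: str, base_url: str = BASE_URL,
                timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_route_request(query, user_id, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes the payload easy to inspect or test without a running server.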

Discover Intent

POST /v1/agentic/discover
Content-Type: application/json

{
  "query": "What was the decision on authentication?",
  "context": {
    "recent_queries": ["auth service", "login flow"],
    "open_documents": ["notion:auth-design"]
  }
}

Get Agent Metrics

GET /v1/agentic/metrics

Response:

{
  "agents": {
    "intent_agent": {
      "invocations": 1234,
      "avg_latency_ms": 45,
      "success_rate": 0.98
    },
    "routing_agent": {
      "invocations": 1234,
      "avg_latency_ms": 32,
      "success_rate": 0.99
    }
  },
  "total_queries_routed": 45678,
  "avg_connectors_per_query": 2.3
}
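
A monitoring job could scan this response for agents that fall outside acceptable bounds. The sketch assumes the response shape shown above; `unhealthy_agents` is a hypothetical helper and its default limits are illustrative, not recommended values.

```python
def unhealthy_agents(metrics: dict, min_success_rate: float = 0.95,
                     max_latency_ms: float = 100.0) -> list[str]:
    """Return the names of agents that breach either limit, sorted by name."""
    flagged = []
    for name, stats in metrics.get("agents", {}).items():
        too_many_failures = stats["success_rate"] < min_success_rate
        too_slow = stats["avg_latency_ms"] > max_latency_ms
        if too_many_failures or too_slow:
            flagged.append(name)
    return sorted(flagged)


sample = {
    "agents": {
        "intent_agent": {"invocations": 1234, "avg_latency_ms": 45, "success_rate": 0.98},
        "routing_agent": {"invocations": 1234, "avg_latency_ms": 32, "success_rate": 0.99},
    }
}
flagged_default = unhealthy_agents(sample)
flagged_strict = unhealthy_agents(sample, min_success_rate=0.985)
```

With the default limits nothing in the sample is flagged; tightening the success-rate floor to 0.985 flags `intent_agent`.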

Configuration

# Hypothesis State Machine
HYPOTHESIS_THETA_1=0.4
HYPOTHESIS_THETA_2=0.7
HYPOTHESIS_THETA_3=0.85
DECAY_LAMBDA_DEFAULT=0.05

# Entropy
ENTROPY_DAMPING_THRESHOLD=2.0
ENTROPY_WINDOW_HOURS=2.0

# Agents
AGENT_DEFAULT_MODEL=gpt-4o-mini
AGENT_RATE_LIMIT=30
AGENT_CACHE_TTL=300
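
These settings could be read from the environment into a typed object with the defaults shown above. This is a sketch of one possible loader, not how Metalogue reads its configuration: `RoutingConfig` and `load_config` are hypothetical names, and the check that the thresholds are strictly increasing is an assumption based on the state machine ordering.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RoutingConfig:
    theta_1: float
    theta_2: float
    theta_3: float
    decay_lambda: float
    damping_threshold: float
    window_hours: float
    model: str
    rate_limit: int
    cache_ttl: int


def load_config(env: Optional[Mapping[str, str]] = None) -> RoutingConfig:
    """Read settings from env (defaults to os.environ), using documented defaults."""
    env = os.environ if env is None else env
    config = RoutingConfig(
        theta_1=float(env.get("HYPOTHESIS_THETA_1", "0.4")),
        theta_2=float(env.get("HYPOTHESIS_THETA_2", "0.7")),
        theta_3=float(env.get("HYPOTHESIS_THETA_3", "0.85")),
        decay_lambda=float(env.get("DECAY_LAMBDA_DEFAULT", "0.05")),
        damping_threshold=float(env.get("ENTROPY_DAMPING_THRESHOLD", "2.0")),
        window_hours=float(env.get("ENTROPY_WINDOW_HOURS", "2.0")),
        model=env.get("AGENT_DEFAULT_MODEL", "gpt-4o-mini"),
        rate_limit=int(env.get("AGENT_RATE_LIMIT", "30")),
        cache_ttl=int(env.get("AGENT_CACHE_TTL", "300")),
    )
    # Assumed sanity check: thresholds must increase along the state machine.
    if not 0.0 <= config.theta_1 < config.theta_2 < config.theta_3 <= 1.0:
        raise ValueError("hypothesis thresholds must satisfy 0 <= θ1 < θ2 < θ3 <= 1")
    return config
```

Passing a plain dictionary instead of relying on `os.environ` makes the loader easy to exercise in tests.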

Best Practices

  1. Provide context - Recent queries help intent classification
  2. Monitor entropy - High entropy indicates information overload
  3. Trust the routing - The system learns from usage patterns
  4. Check agent metrics - Monitor success rates and latency
