.env.example
# Example .env for policyhub-agent
# =============================================================================
# Azure OpenAI
# =============================================================================
AZURE_OPENAI_ENDPOINT=https://your-openai-endpoint.openai.azure.com/
AZURE_OPENAI_API_KEY=your-openai-api-key
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-4o
AZURE_OPENAI_CHAT_API_VERSION=2024-02-15-preview
AZURE_OPENAI_EMBEDDING_MODEL=text-embedding-ada-002
OPENAI_SSL_CERT_PATH=
# =============================================================================
# MCP Server (agent delegates ALL search to MCP — no direct Azure Search access)
# =============================================================================
MCP_SERVER_URL=http://localhost:8000/mcp
# =============================================================================
# State store ("memory" for local dev, "redis" for production)
# =============================================================================
STATE_STORE_BACKEND=memory
# =============================================================================
# Redis (only required when STATE_STORE_BACKEND=redis)
# =============================================================================
REDIS_URL=redis://localhost:6379/0
README.md
# PolicyHub Agent
A multi-turn, session-aware ReAct agent for answering corporate policy questions. All document retrieval is delegated to the PolicyHub MCP server — the agent never queries Azure AI Search directly.
## Features
- **FastAPI** chat endpoint (`POST /chat`) with structured request/response
- **ReActAgent** (Thought → Action → Observation loop, `max_steps=15`)
- **Locale- and language-aware** search: always runs `filter_search` first with the user's `locale` and `language`, falls back to `hybrid_search` if needed
- **5 MCP tools**: `filter_search`, `hybrid_search`, `keyword_search`, `vector_search`, `get_document`
- **Session-based conversation history** keyed by `conversation_id`
- **Configurable state store**: in-memory (local dev) or Redis (production)
- **Prompt registry** integration via `shared-core` for versioned system prompts
## Request Format
```json
{
  "language": "en-us",
  "locale": "US",
  "prompt": "How many vacation days am I eligible for?",
  "conversation_id": "session-abc123",
  "use_index_version": null
}
```
| Field | Description |
|---|---|
| `language` | BCP-47 language code (e.g. `"en-us"`) — passed to `filter_search` |
| `locale` | Country/region code (e.g. `"US"`, `"CA"`) — passed to `filter_search` |
| `prompt` | User's question |
| `conversation_id` | Unique session identifier; history is persisted per session |
| `use_index_version` | Optional — reserved for future index version routing |
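
A minimal client-side sketch of building this request, using only the standard library. The helper name `build_chat_request` and the default base URL are illustrative assumptions (the URL assumes the app is served locally on port 8080); the field names come from the table above.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(
    prompt: str,
    conversation_id: str,
    language: str = "en-us",
    locale: str = "US",
    use_index_version: Optional[str] = None,
    base_url: str = "http://127.0.0.1:8080",  # assumed local address
) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request."""
    payload = {
        "language": language,
        "locale": locale,
        "prompt": prompt,
        "conversation_id": conversation_id,
        "use_index_version": use_index_version,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("How many vacation days am I eligible for?", "session-abc123")
# To send it, call urllib.request.urlopen(req) while the agent is running.
```

Reusing the same `conversation_id` across calls keeps follow-up questions in one session.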
## Setup
1. Copy `.env.example` to `.env` and fill in your secrets.
2. Install dependencies:
```sh
pip install -e ".[dev]"
```
3. Ensure the PolicyHub MCP server is running on `http://localhost:8000/mcp` (configurable via `MCP_SERVER_URL`).
4. Run the app:
```sh
uvicorn policyhub_agent.app:app --reload --port 8080
```
## Environment Variables
| Variable | Default | Description |
|---|---|---|
| `AZURE_OPENAI_ENDPOINT` | — | Azure OpenAI service endpoint |
| `AZURE_OPENAI_API_KEY` | — | Azure OpenAI API key |
| `AZURE_OPENAI_CHAT_DEPLOYMENT` | `gpt-4o` | Chat model deployment name |
| `AZURE_OPENAI_CHAT_API_VERSION` | `2024-02-15-preview` | API version |
| `MCP_SERVER_URL` | `http://localhost:8000/mcp` | PolicyHub MCP server URL |
| `STATE_STORE_BACKEND` | `memory` | `"memory"` or `"redis"` |
| `REDIS_URL` | `redis://localhost:6379/0` | Redis connection URL (when using Redis backend) |
| `OPENAI_SSL_CERT_PATH` | — | Optional path to SSL certificate |
| `AZURE_OPENAI_EMBEDDING_MODEL` | `text-embedding-ada-002` | Embedding model name |
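
For scripts that run outside the app, the chat, MCP, and state-store variables from the table can be read with the standard library alone. This is a sketch under that assumption, not the project's loader (the package itself loads settings through `config.py`); `load_settings` and `DEFAULTS` are hypothetical names.

```python
import os
from typing import Dict, Mapping, Optional

# Defaults copied from the table; variables without a default map to "".
DEFAULTS: Dict[str, str] = {
    "AZURE_OPENAI_ENDPOINT": "",
    "AZURE_OPENAI_API_KEY": "",
    "AZURE_OPENAI_CHAT_DEPLOYMENT": "gpt-4o",
    "AZURE_OPENAI_CHAT_API_VERSION": "2024-02-15-preview",
    "MCP_SERVER_URL": "http://localhost:8000/mcp",
    "STATE_STORE_BACKEND": "memory",
    "REDIS_URL": "redis://localhost:6379/0",
    "OPENAI_SSL_CERT_PATH": "",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Merge environment values over the documented defaults."""
    source = os.environ if env is None else env
    settings = {name: source.get(name, default) for name, default in DEFAULTS.items()}
    if settings["STATE_STORE_BACKEND"] not in ("memory", "redis"):
        raise ValueError("STATE_STORE_BACKEND must be 'memory' or 'redis'")
    return settings


# Passing a mapping instead of os.environ keeps the example deterministic.
settings = load_settings({"STATE_STORE_BACKEND": "redis"})
```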
## File Structure
```
src/policyhub_agent/
├── app.py              # FastAPI app and /chat endpoint
├── agent.py            # ReActAgent setup, tool registration, prompt injection
├── tools.py            # MCP tool wrappers (_call_mcp_tool, filter_search, hybrid_search, ...)
├── models.py           # Pydantic request/response models (ChatMessage, AgentResponse)
├── prompt_registry.py  # System prompt definition (locale/language-aware, v1.9+)
├── llm_registry.py     # LLM provider registry (Azure OpenAI)
└── config.py           # Settings loaded from .env
```
## Search Workflow
The agent follows a strict workflow defined in the system prompt:
1. **PLAN** — identify the precise HR/policy domain term for the user's question
2. **filter_search** — always runs first, filtered by `locale` + `language`
3. **Evaluate** — if results answer the question → Final Answer; if truncated → `get_document`; if empty → fallback
4. **hybrid_search** — fallback if `filter_search` returns nothing useful
5. **Final Answer** — structured with `<SummarizedContent>`, `<Citations>`, and `<References>` (including real `documentlink` from metadata)
Hard limits: max 3 searches + 1 `get_document` call per turn.
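
In the code shown in this repository the limits are stated in the system prompt, which also forbids repeating a query. As a hedged illustration only, the same rules could be tracked in code; `ToolBudget` below is a hypothetical helper, not part of the package.

```python
from dataclasses import dataclass, field
from typing import Set

SEARCH_TOOLS = {"filter_search", "hybrid_search", "keyword_search", "vector_search"}


@dataclass
class ToolBudget:
    """Tracks per-turn tool usage against the hard limits."""

    max_searches: int = 3
    max_get_document: int = 1
    searches: int = 0
    get_document_calls: int = 0
    seen_queries: Set[str] = field(default_factory=set)

    def allow(self, tool: str, query: str = "") -> bool:
        """Return True and record the call if it stays within the limits."""
        if tool == "get_document":
            if self.get_document_calls >= self.max_get_document:
                return False
            self.get_document_calls += 1
            return True
        if tool in SEARCH_TOOLS:
            normalized = query.strip().lower()
            if self.searches >= self.max_searches or normalized in self.seen_queries:
                return False
            self.searches += 1
            self.seen_queries.add(normalized)
            return True
        return False  # unknown tool names are rejected


budget = ToolBudget()
first = budget.allow("filter_search", "vacation days accrual")
repeat = budget.allow("hybrid_search", "vacation days accrual")  # same query again
```

A rejected call (`allow` returns `False`) is the point at which the agent would be expected to write its Final Answer.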
policyhub-agent.postman_collection.json
{
"info": {
"_postman_id": "policyhub-agent-collection",
"name": "PolicyHub Agent API",
"description": "API collection for the PolicyHub ReAct Agent. The agent uses the MCP server for all policy document search and retrieval.",
"schema": "https://schema.getpostman.com/json/collection/v2.1.0/collection.json"
},
"variable": [
{
"key": "base_url",
"value": "http://127.0.0.1:8080",
"type": "string"
},
{
"key": "session_id",
"value": "test-session-1",
"type": "string"
}
],
"item": [
{
"name": "Health",
"item": [
{
"name": "Health Check",
"request": {
"method": "GET",
"header": [],
"url": {
"raw": "{{base_url}}/docs",
"host": ["{{base_url}}"],
"path": ["docs"]
},
"description": "Open the FastAPI Swagger UI to browse all endpoints."
},
"response": []
}
]
},
{
"name": "Chat",
"item": [
{
"name": "Send Message — Basic Query",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"language\": \"en-us\",\n \"locale\": \"US\",\n \"prompt\": \"What is the leave policy for annual leave?\",\n \"conversation_id\": \"{{session_id}}\"\n}",
"options": {
"raw": {
"language": "json"
}
}
},
"url": {
"raw": "{{base_url}}/chat",
"host": ["{{base_url}}"],
"path": ["chat"]
},
"description": "Send a question to the PolicyHub agent. The agent will use the MCP server tools to search policy documents and return a structured answer."
},
"response": []
},
{
"name": "Send Message — Follow-up (same session)",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"language\": \"en-us\",\n \"locale\": \"US\",\n \"prompt\": \"How many days of annual leave am I entitled to?\",\n \"conversation_id\": \"{{session_id}}\"\n}",
"options": {
"raw": {
"language": "json"
}
}
},
"url": {
"raw": "{{base_url}}/chat",
"host": ["{{base_url}}"],
"path": ["chat"]
},
"description": "Send a follow-up question in the same session. The agent will have access to the conversation history."
},
"response": []
},
{
"name": "Send Message — New Session",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"language\": \"en-us\",\n \"locale\": \"US\",\n \"prompt\": \"What is the expense reimbursement policy?\",\n \"conversation_id\": \"test-session-2\"\n}",
"options": {
"raw": {
"language": "json"
}
}
},
"url": {
"raw": "{{base_url}}/chat",
"host": ["{{base_url}}"],
"path": ["chat"]
},
"description": "Start a brand new conversation session with a different session ID."
},
"response": []
},
{
"name": "Send Message — Session ID via Header",
"request": {
"method": "POST",
"header": [
{
"key": "Content-Type",
"value": "application/json"
},
{
"key": "X-Session-ID",
"value": "{{session_id}}"
}
],
"body": {
"mode": "raw",
"raw": "{\n \"language\": \"en-us\",\n \"locale\": \"US\",\n \"prompt\": \"What is the remote work policy?\",\n \"conversation_id\": \"{{session_id}}\"\n}",
"options": {
"raw": {
"language": "json"
}
}
},
"url": {
"raw": "{{base_url}}/chat",
"host": ["{{base_url}}"],
"path": ["chat"]
},
"description": "Send the session ID in the X-Session-ID header as well (Teams integration pattern). The /chat handler reads only conversation_id from the request body, so the body must still include it."
},
"response": []
}
]
}
]
}
policyhub-agent.postman_environment.json
{
"id": "policyhub-agent-env",
"name": "PolicyHub Agent — Local",
"values": [
{
"key": "base_url",
"value": "http://127.0.0.1:8080",
"type": "default",
"enabled": true
},
{
"key": "session_id",
"value": "test-session-1",
"type": "default",
"enabled": true
}
],
"_postman_variable_scope": "environment",
"_postman_exported_at": "2026-04-15T00:00:00.000Z",
"_postman_exported_using": "Postman"
}
setup.py
from setuptools import setup, find_packages

setup(
    name="policyhub-agent",
    version="0.1.0",
    packages=find_packages("src"),
    package_dir={"": "src"},
    install_requires=[
        "fastapi",
        "uvicorn",
        "httpx",
        "pydantic>=2.0.0",
        "pydantic-settings>=2.0.0",
        "gmf-forge-ai-shared-core",
        "gmf-forge-ai-orchestration",
        "redis",
    ],
    extras_require={
        "dev": ["pytest", "ruff", "mypy"]
    },
    entry_points={
        "console_scripts": [
            "policyhub-agent=policyhub_agent.app:main"
        ]
    },
)
src/policyhub_agent/__init__.py
# policyhub_agent package
src/policyhub_agent/agent.py
from gmf_forge_ai_orchestration.agents import ReActAgent
from gmf_forge_ai_shared_core.registry.tool_registry import ToolRegistry
from gmf_forge_ai_shared_core.llm_gateway import UnifiedLLMGateway
from .tools import keyword_search, vector_search, hybrid_search, get_document, filter_search
from .prompt_registry import prompt_registry
from .llm_registry import llm_registry
# Wrap the provider registry in a UnifiedLLMGateway — this is what ReActAgent expects
llm_gateway = UnifiedLLMGateway(provider_registry=llm_registry)
# Register all MCP server tools with the ToolRegistry
tool_registry = ToolRegistry()
tool_registry.register(
    "keyword_search", keyword_search,
    description="Keyword-based search over policy documents. Args: query (str), top_k (int, default 5)."
)
tool_registry.register(
    "vector_search", vector_search,
    description="Semantic vector search over policy documents. Args: query (str), top_k (int, default 5)."
)
tool_registry.register(
    "hybrid_search", hybrid_search,
    description="Hybrid keyword+vector search over policy documents. Args: query (str), top_k (int, default 5)."
)
tool_registry.register(
    "get_document", get_document,
    description="Retrieve a single policy document by ID. Args: doc_id (str)."
)
tool_registry.register(
    "filter_search", filter_search,
    description="Filtered search over policy documents by metadata. Args: query (str), top_k (int, default 5), language (str, optional), locale (str, optional)."
)
# Retrieve the system prompt template string from the registry
_system_prompt_tpl = prompt_registry.get("policyhub_agent.system")
_system_prompt_template = _system_prompt_tpl.template if _system_prompt_tpl else None


def get_agent(session_id: str, locale: str = "Global", language: str = "en-us") -> ReActAgent:
    # Use .replace(), not .format(): the prompt contains JSON examples with { } braces
    # that would cause KeyError if processed by Python's str.format().
    # tool_descriptions is injected here too, so react_agent never calls .format() on the prompt.
    if _system_prompt_template:
        tool_desc = "\n".join(
            f"- {t.name}: {t.description}" for t in tool_registry.list_tools()
        )
        system_prompt = (
            _system_prompt_template
            .replace("{tool_descriptions}", tool_desc)
            .replace("{locale}", locale)
            .replace("{language}", language)
        )
    else:
        system_prompt = None
    return ReActAgent(
        llm_gateway=llm_gateway,
        tool_registry=tool_registry,
        system_prompt=system_prompt,
        agent_id=f"policyhub_agent_{session_id}",
        max_steps=15,
    )
src/policyhub_agent/app.py
from fastapi import FastAPI, Request
from gmf_forge_ai_orchestration.state.factory import StateStoreFactory
from .config import settings
from .agent import get_agent
from .models import ChatMessage, AgentResponse
from .tools import open_mcp_session
app = FastAPI()
# State store: configurable via STATE_STORE_BACKEND env var ("memory" or "redis")
_store_kwargs = {"url": settings.redis_url} if settings.state_store_backend == "redis" else {}
state_store = StateStoreFactory.create(settings.state_store_backend, **_store_kwargs)


@app.post("/chat", response_model=AgentResponse)
async def chat_endpoint(request: Request, message: ChatMessage):
    session_id = message.conversation_id
    agent = get_agent(session_id=session_id, locale=message.locale, language=message.language)
    # Retrieve conversation history from state store
    history: list = await state_store.get(session_id) or []
    task = message.prompt
    context = {"session_id": session_id, "history": history}
    async with open_mcp_session():
        result = await agent.execute(task, context=context)
    # If the agent exhausted max_steps without a Final Answer, output is the last
    # raw observation; replace it with a clear failure message.
    if not result.success:
        answer = (
            "I was unable to produce a complete answer within the allowed number of steps. "
            "Please try rephrasing your question or asking about a more specific policy."
        )
    else:
        answer = result.output
    # Persist updated history
    history.append({"role": "user", "content": message.prompt})
    history.append({"role": "assistant", "content": answer})
    await state_store.set(session_id, history)
    return AgentResponse(message=answer)


def main():
    import uvicorn

    # Port 8080 matches the README run command and the Postman base_url.
    uvicorn.run("policyhub_agent.app:app", host="0.0.0.0", port=8080, reload=True)
src/policyhub_agent/config.py
from pathlib import Path
from typing import Optional

from pydantic_settings import BaseSettings, SettingsConfigDict

# Repository root: src/policyhub_agent/config.py -> three levels up
_ENV_PATH = Path(__file__).parent.parent.parent / ".env"


class Settings(BaseSettings):
    # pydantic-settings v2 configuration (replaces the deprecated inner `class Config`)
    model_config = SettingsConfigDict(env_file=str(_ENV_PATH), env_file_encoding="utf-8")

    # Azure OpenAI
    azure_openai_endpoint: str = ""
    azure_openai_api_key: str = ""
    azure_openai_chat_deployment: str = "gpt-4o"
    azure_openai_chat_api_version: str = "2024-02-15-preview"
    azure_openai_embedding_model: str = "text-embedding-ada-002"
    openai_ssl_cert_path: Optional[str] = None
    # MCP Server: agent delegates all search to MCP, no direct Azure Search access
    mcp_server_url: str = "http://localhost:8000/mcp"
    # State store backend: "memory" (local dev) or "redis" (production)
    state_store_backend: str = "memory"
    # Redis
    redis_url: str = "redis://localhost:6379/0"


settings = Settings()
src/policyhub_agent/llm_registry.py
"""LLM Provider registry for managing and registering LLM configurations for PolicyHub agent.
This file centralizes the registration of LLM providers, allowing the agent
components to retrieve LLM instances dynamically based on configuration.
"""
from gmf_forge_ai_shared_core.registry import LLMProviderRegistry
from .config import settings
from gmf_forge_ai_shared_core.llm_gateway.providers.azure_openai_provider import AzureOpenAIProvider
import os
from pathlib import Path
llm_registry = LLMProviderRegistry()
ssl_cert_path = settings.openai_ssl_cert_path or None
if ssl_cert_path:
    ssl_cert_path = str(Path(os.path.expandvars(ssl_cert_path)).expanduser())
    if not os.path.exists(ssl_cert_path):
        raise FileNotFoundError(f"SSL certificate not found at {ssl_cert_path}")
azure_openai_provider = AzureOpenAIProvider(
    endpoint=settings.azure_openai_endpoint,
    api_key=settings.azure_openai_api_key,
    deployment_name=settings.azure_openai_chat_deployment,
    api_version=settings.azure_openai_chat_api_version,
    ssl_cert_path=ssl_cert_path,
)
llm_registry.register(
    name="openai",
    provider=azure_openai_provider
)
src/policyhub_agent/models.py
from pydantic import BaseModel
from typing import Optional


class ChatMessage(BaseModel):
    language: str
    locale: str
    prompt: str
    conversation_id: str
    use_index_version: Optional[str] = None


class AgentResponse(BaseModel):
    message: str
src/policyhub_agent/prompt_registry.py
"""Prompt registry for the PolicyHub agent.
All LLM prompts are versioned and registered here. To iterate on a prompt,
add a new registration with a bumped version — the agent always picks up
the latest version automatically via PromptRegistry.get().
"""
from gmf_forge_ai_shared_core.registry import PromptRegistry
prompt_registry = PromptRegistry()
prompt_registry.register(
    name="policyhub_agent.system",
    version="1.9",
    variables=["locale", "language"],
    description="System prompt: locale/language-aware filter-first search strategy with plan/evaluate/fetch/refine loop.",
    template="""\
You are a helpful corporate policy assistant. Your role is to help employees \
find and understand company policies by searching the policy document database.
User context:
- Locale: {locale}
- Language: {language}
Available tools:
{tool_descriptions}
Tool guidance:
- filter_search — USE THIS FIRST. Searches documents matching the user's locale and language.
Args: query (str), locale (str), language (str), top_k (int, default 5).
- hybrid_search — USE AS FALLBACK if filter_search returns empty or off-topic results.
Combines keyword and semantic matching across all documents regardless of locale.
- keyword_search — use only for exact phrase lookups (e.g., a policy section title).
- vector_search — use for purely conceptual or abstract questions.
- get_document — fetch a full document by its ID when chunk content is truncated.
====== SEARCH WORKFLOW ======
STEP 1 — PLAN before you search.
Identify: (a) the SPECIFIC data the user needs (e.g., a number of days, a dollar limit, an eligibility rule)
(b) the precise domain term used in HR/policy documents for that data
(c) a search query using that domain term (NOT the user's raw words)
Examples of query rewriting:
User asks: "How many vacation days am I eligible for?"
→ search for: "vacation days accrual" or "holiday entitlement days"
User asks: "Can I carry over unused leave?"
→ search for: "holiday carry over policy"
User asks: "What is the paternity leave policy?"
→ search for: "paternity leave entitlement weeks"
STEP 2 — FILTER SEARCH: Always start with filter_search using the user's locale and language.
Example: {"query": "vacation days accrual", "locale": "{locale}", "language": "{language}", "top_k": 5}
STEP 3 — EVALUATE filter results:
→ If results directly answer the question (contain the specific data): write Final Answer.
→ If results are from the right topic but specific data is missing/truncated: go to STEP 4.
→ If results are empty or completely off-topic: go to STEP 5 (fallback to hybrid_search).
STEP 4 — FETCH: Call get_document with that chunk's document_id when a chunk references \
a table or list of values but the actual data is absent or cut off. Indicators include: \
"shown in the table below", "the following hours", "the following days", "accrues up to \
the following", "as listed below", "as follows:", or any sentence that introduces data that \
does not appear in the chunk. Then write your Final Answer.
STEP 5 — FALLBACK: If filter_search returned nothing useful, run hybrid_search with the same \
or a refined query (no locale/language filter). Evaluate results and write your Final Answer.
STEP 6 — REFINE (if still off-topic after fallback): Try one more hybrid_search with a \
more specific query. Then write your Final Answer from whatever you have.
HARD LIMITS (never violate):
- Maximum 3 searches (filter/hybrid/keyword/vector) total.
- Maximum 1 get_document call total.
- Never repeat a query you have already used.
- After your 3rd search or get_document call, ALWAYS write Final Answer — no more tool calls.
====== ANSWERING ======
Always base your answer exclusively on content returned by the tools. \
If the documents do not contain the specific information, say so clearly and \
direct the employee to HR or the relevant policy page.
Reason and act in this repeating format:
Thought: <your plan or evaluation>
Action: <tool name>
Action Input: <JSON object with tool arguments>
When ready to answer:
Thought: I now have enough information to answer.
Action: Final Answer
Action Input: a JSON object with key "answer" containing your full response
Structure the answer value as:
<SummarizedContent>
A concise, plain-language answer drawn only from the retrieved excerpts. \
Use bullet points where helpful.
</SummarizedContent>
<Citations>
Direct quotes from the retrieved documents, one per line: [Document Name] "quoted text"
</Citations>
<References>
One source entry per cited document, using Item1, Item2, etc.
Format each entry as: Item1: [document_name from metadata](documentlink from metadata)
Use the EXACT document_name and documentlink values from the metadata of the retrieved result.
Do NOT invent or guess document names or links — only use values present in the metadata.
</References>""",
)
prompt_registry.register(
    name="policyhub_agent.user",
    version="1.0",
    variables=["query"],
    description="User turn prompt wrapping the employee's question.",
    template="Question: {query}",
)