Catalyst - OpenELIS Lab Data Assistant
Functional Requirements Document
Version: 1.0
Date: November 2025
Status: Draft
Jira Epic: OGC-70
1. Executive Summary
Catalyst is an LLM-powered data assistant for OpenELIS Global that lets users generate custom data extracts, reports, and dashboard widgets from natural language prompts. Its architecture is privacy-first: the LLM sees only the database schema, never patient data, and all data processing occurs locally. Catalyst is intended to replace the existing Jasper Reports and PSQL extract infrastructure with a more flexible, user-friendly reporting framework.
1.1 Target Users
Lab Managers - Operational reporting, workload analysis, performance metrics
Case Managers - Program-specific follow-up reports (HIV, TB, etc.)
QA Officers - Quality indicators, turnaround times, rejection rates
1.2 Core Capabilities
Natural language report/extract generation via LLM
Custom dashboard widgets with drag-and-drop layout
Scheduled recurring reports with configurable delivery
Shareable reports and snapshots across users and lab units
Fallback query builder UI for non-LLM operation
Migration path for existing preset reports
2. User Interface Components
2.1 Catalyst Sidebar Launcher
Location: Microscope icon in the top navigation menu (persistent across all screens)
Behavior:
Clicking the icon opens a fixed-width sidebar on the right side of the screen
Sidebar appears alongside current content without obscuring it
Collapsible via icon click or collapse button
System remembers open/closed state across navigation
Automatically pulls context from the current screen when launched (e.g., patient record, sample, program)
2.2 Chat Interface (Default Mode)
Layout:
Conversation-style interface with message history
User prompt input at bottom
Results display area with preview functionality
"Show Reasoning" expandable section for verbose LLM output
Prompt Management:
Previous prompts saved automatically
Users can star, rename, delete, or add prompts to collections
Searchable prompt history
2.3 Wizard Mode (Structured Alternative)
Purpose: Guide users through report creation with structured questions
Wizard Steps:
Output Type Selection - Report, widget, graph, or data extract
Output Format - CSV, PDF, Excel, JSON, HL7/FHIR, custom delimited
Data Elements - Field selection with friendly names
Filters & Parameters - Date ranges, program filters, custom criteria
Scheduling - One-time or recurring (daily/weekly/monthly/cron)
Delivery Options - Archive location, notifications, export endpoint
Sharing Configuration - Personal, lab unit, or all users
Toggle: Users can switch between chat and wizard mode at any time
2.4 My Reports Section
Location: First item within the "Reports" menu
Views:
List view (default)
Search by name and description
Sort options: name, last run, created date, next scheduled run
Filter by: owned, shared with me, presets, scheduled, tags
Organization:
Personal reports
Shared with me (with option to hide individual items)
Preset reports (system-provided)
Archive (completed scheduled reports)
2.5 My Dashboard Section
Location: New dynamic menu item "My Dashboard"
Capabilities:
Select which widgets/dashboard views to display
Drag-and-drop widget arrangement
Widget count limited to protect performance (see 6.3)
Manual refresh button per widget
Notification option when refresh completes
3. LLM Integration
3.1 Architecture
Privacy Model:
LLM receives database schema and metadata only
No patient data, sample data, or PHI transmitted to LLM
All query execution occurs locally against the database
Deployment Options (Admin Configurable):
Local - LLM runs on the OpenELIS server (scaled-back model)
Central - LLM hosted on a more powerful server (e.g., Ministry of Health infrastructure)
Cloud API - Optional hooks to external LLM providers (OpenAI, Anthropic, etc.)
3.2 Schema Exposure
Curated "safe" subset of tables/views exposed by default
Friendly field name mappings (e.g., pat_dob → "Patient Date of Birth")
Pre-built relationships and joins that the LLM understands
Admin can restrict specific tables/fields from being queryable
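A minimal sketch of how the schema-only context might be assembled before it is sent to the LLM, assuming a curated field list and an admin-maintained restriction set. The `SchemaField` type, the `build_llm_schema_context` function, and every table and column name other than `pat_dob` are hypothetical and used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaField:
    table: str
    column: str
    friendly_name: str


def build_llm_schema_context(fields, restricted):
    """Return schema lines for the LLM prompt, skipping admin-restricted fields.

    `restricted` holds "table" or "table.column" entries blocked by an admin.
    Only names and labels are emitted; no row data is read here.
    """
    lines = []
    for f in fields:
        if f.table in restricted or f"{f.table}.{f.column}" in restricted:
            continue
        lines.append(f'{f.table}.{f.column} -> "{f.friendly_name}"')
    return lines


# Hypothetical curated subset; real names would come from the OpenELIS schema.
FIELDS = [
    SchemaField("patient", "pat_dob", "Patient Date of Birth"),
    SchemaField("patient", "national_id", "National ID"),
    SchemaField("sample", "received_date", "Sample Received Date"),
]

context = build_llm_schema_context(FIELDS, restricted={"patient.national_id"})
```

Filtering at the point where the prompt context is built means a restricted field never reaches the LLM at all, rather than being blocked later at query time.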
3.3 Interaction Flow
User types natural language prompt
LLM generates SQL query based on schema knowledge
System runs EXPLAIN to estimate complexity/row count
User previews results (paginated if large)
User refines via follow-up prompts
User saves report/extract/widget
User reruns from My Reports menu
User can return to Catalyst to refine/modify
3.4 Guardrails
| Guardrail | Implementation |
|---|---|
| Query complexity limits | Warn if joins exceed threshold or no date/filter constraints on large tables |
| Row count estimation | Run EXPLAIN or COUNT preview before full execution |
| Prompt injection protection | Sanitize inputs, restrict to SELECT statements only |
| Timeout thresholds | Kill long-running queries with user notification |
| Required filters | Mandate date ranges or program filters for certain data types |
| Broad query confirmation | "This will query all patients since 2015. Add a date filter?" |
| Audit logging | Track all queries, who ran them, when, parameters used |
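Two of the guardrails above, the SELECT-only restriction and row count estimation, can be sketched as follows. This is an illustrative approach, not a specified design: the keyword list, the function names, and the 100,000-row limit are assumptions. The row estimate is read from the JSON that PostgreSQL returns for `EXPLAIN (FORMAT JSON)`.

```python
import json
import re

# Keywords that should not appear in a generated read-only query (illustrative list).
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke|copy)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Accept one statement that starts with SELECT or WITH and has no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    return FORBIDDEN.search(statement) is None


def estimated_rows(explain_json: str) -> int:
    """Read the planner's row estimate from EXPLAIN (FORMAT JSON) output."""
    plan = json.loads(explain_json)
    return int(plan[0]["Plan"]["Plan Rows"])


def needs_confirmation(explain_json: str, row_limit: int = 100_000) -> bool:
    """True when the estimate exceeds the (assumed) limit and the user should confirm."""
    return estimated_rows(explain_json) > row_limit


# Example planner output, trimmed to the fields used here.
SAMPLE_PLAN = '[{"Plan": {"Node Type": "Seq Scan", "Plan Rows": 250000, "Total Cost": 4200.5}}]'
```

A keyword filter like this is deliberately conservative: it also rejects harmless queries whose string literals contain a blocked word or a semicolon. It would most likely be paired with a read-only database role, so that the filter is not the only line of defense.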
3.5 Error Handling
Plain language explanation of what went wrong
Suggestions for how to fix the issue
"Ask the assistant for help" option with error context
Manual query editing as escape hatch
"Report this as a bad suggestion" feedback mechanism
Retry logic for transient failures
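The retry behavior for transient failures could look like the sketch below, which uses exponential backoff. The `TransientError` class, the attempt count, and the delays are assumptions; this document does not specify a retry policy.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. an LLM timeout)."""


def run_with_retry(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `operation`, retrying on TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the user
            sleep(base_delay * 2 ** (attempt - 1))


# Usage: an operation that fails twice, then succeeds.
calls = []
delays = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TransientError("timeout")
    return "ok"


result = run_with_retry(flaky, sleep=delays.append)
```

Only errors explicitly marked as transient are retried. Anything else propagates immediately, so the plain-language explanation and fix suggestions can be shown without delay.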
3.6 Fallback Mode (Query Builder UI)
Available when: LLM is unavailable or disabled by admin
Components:
Table/field selector with search functionality
Filter builder (field + operator + value)
Grouping and aggregation options
Abstracted join configuration (user-friendly, not raw SQL)
Sort order selection
Templates for common patterns:
Tests by date range
Turnaround time summary
Rejection rate by reason
Sample volume by test type
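As an illustration of how the query builder might turn structured selections into SQL without an LLM, the sketch below builds a parameterized SELECT for a "Tests by date range" style template. The `QuerySpec` type, the `SCHEMA` allow-list, and the table and column names are hypothetical. The `%s` placeholders follow the psycopg parameter style, so values are never interpolated into the SQL text.

```python
from dataclasses import dataclass, field

ALLOWED_OPERATORS = {"=", "!=", "<", "<=", ">", ">=", "LIKE"}

# Hypothetical allow-list of exposed tables and columns.
SCHEMA = {"sample": {"accession_number", "received_date", "status"}}


@dataclass
class QuerySpec:
    table: str
    columns: list
    filters: list = field(default_factory=list)   # (column, operator, value)
    order_by: list = field(default_factory=list)


def build_query(spec, schema):
    """Build a parameterized SELECT; identifiers must be in the allow-list."""
    if spec.table not in schema:
        raise ValueError(f"table not exposed: {spec.table}")
    used = set(spec.columns) | {col for col, _, _ in spec.filters} | set(spec.order_by)
    unknown = used - schema[spec.table]
    if unknown:
        raise ValueError(f"columns not exposed: {sorted(unknown)}")

    sql = f"SELECT {', '.join(spec.columns)} FROM {spec.table}"
    params, clauses = [], []
    for col, op, value in spec.filters:
        if op not in ALLOWED_OPERATORS:
            raise ValueError(f"operator not allowed: {op}")
        clauses.append(f"{col} {op} %s")
        params.append(value)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if spec.order_by:
        sql += " ORDER BY " + ", ".join(spec.order_by)
    return sql, params


spec = QuerySpec(
    table="sample",
    columns=["accession_number", "received_date"],
    filters=[("received_date", ">=", "2025-01-01"), ("received_date", "<", "2025-02-01")],
    order_by=["received_date"],
)
sql, params = build_query(spec, SCHEMA)
```

Grouping, aggregation, and joins are left out to keep the sketch short. They could be added with the same allow-list approach.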
4. Report & Extract Management
4.1 Supported Output Types
| Type | Description | Default Format |
|---|---|---|
| Data Extract | Raw data export for analysis or external systems | CSV |
| Report | Formatted document with headers, summaries | PDF |
| Widget | Dashboard visualization component | In-app display |
| Graph/Chart | Standalone visualization | PNG/SVG or in-app |
4.2 Export Formats
CSV - Default for data extracts
Excel (.xlsx) - Formatted spreadsheets with multiple sheets
PDF - Formatted reports with customizable headers
JSON - Structured data for system integration
HL7/FHIR Bundle - Healthcare interoperability standard
Custom Delimited - User-specified delimiter
Format Selection: User-selectable per report with intelligent defaults based on output type
4.3 PDF Customization
Header/footer configuration using existing admin-defined logos and facility info
Option for default header or custom combination of elements
Page orientation selection (portrait/landscape)
Charts and visualizations embedded in PDF
Print optimization options:
High contrast mode
Black & white optimized
Ink/paper efficient layouts
4.4 Data Extract Configuration (for External Systems)
Field mapping (rename columns, reorder fields)
Data transformation (date formats, code mappings)
Header row inclusion toggle
Schema change alerts ("contract" monitoring for downstream systems)
Incremental/delta export support (only records changed since last run)
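The field mapping, date transformation, header toggle, and delta export options above can be combined in a single export step. The sketch below is one possible shape, assuming each row carries a `last_updated` timestamp. That column name, the function name, and the sample data are hypothetical.

```python
import csv
import io
from datetime import datetime


def export_delta(rows, field_map, last_run, include_header=True,
                 delimiter=",", date_format="%Y-%m-%d"):
    """Write rows changed since `last_run` as delimited text.

    `field_map` maps source column -> output column, in output order,
    which covers both renaming and reordering.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    if include_header:
        writer.writerow(list(field_map.values()))
    for row in rows:
        if row["last_updated"] <= last_run:
            continue  # unchanged since the previous export
        values = [row[src] for src in field_map]
        writer.writerow(
            [v.strftime(date_format) if isinstance(v, datetime) else v for v in values]
        )
    return out.getvalue()


ROWS = [
    {"accession_number": "A1", "received_date": datetime(2025, 1, 5),
     "last_updated": datetime(2025, 1, 6)},
    {"accession_number": "A2", "received_date": datetime(2025, 2, 1),
     "last_updated": datetime(2025, 2, 2)},
]
FIELD_MAP = {"received_date": "ReceivedOn", "accession_number": "LabNo"}

text = export_delta(ROWS, FIELD_MAP, last_run=datetime(2025, 1, 31))
```

Because the output columns are driven entirely by `field_map`, a stored copy of that mapping can also serve as the "contract" that schema change alerts compare against.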
4.5 Metadata Stored Per Report
Name
Description
Created date
Last run timestamp
Schedule configuration
Owner (user ID)
Shared with (users, lab units)
Tags/categories
Version history
Associated parameters
4.6 Version History
Automatic Tracking:
What changed (query, configuration, visualization)
Who made the change
When the change occurred
Retain 10 most recent versions automatically
Manual Snapshots:
Users can create named snapshots as rollback points
Snapshots retained indefinitely until deleted
Rollback:
One-click rollback to any saved version
Side-by-side comparison: Future Enhancement
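The retention rules above can be sketched as a pruning step. This assumes that named snapshots do not count toward the 10-version limit, which the requirements do not state explicitly. The `ReportVersion` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

MAX_AUTO_VERSIONS = 10


@dataclass
class ReportVersion:
    number: int
    changed_by: str
    snapshot_name: Optional[str] = None  # set only for manual snapshots


def prune_versions(versions):
    """Keep every named snapshot plus the 10 most recent automatic versions."""
    automatic = sorted(
        (v for v in versions if v.snapshot_name is None),
        key=lambda v: v.number,
        reverse=True,
    )
    keep = {v.number for v in automatic[:MAX_AUTO_VERSIONS]}
    return [v for v in versions if v.snapshot_name is not None or v.number in keep]


# Thirteen versions; version 2 was saved as a named snapshot.
versions = [ReportVersion(n, "alice") for n in range(1, 14)]
versions[1].snapshot_name = "before-audit"
kept_numbers = [v.number for v in prune_versions(versions)]
```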
5. Scheduling & Delivery
5.1 Frequency Options
Daily
Weekly (specify day)
Monthly (specify date)
Custom cron expression
5.2 Quiet Hours
Scheduled reports pause during configured hours
Default: 7:00 AM - 6:00 PM local time
Admin configurable per implementation
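A minimal sketch of the quiet hours check, using the default window above. It assumes a paused report is deferred to the end of the quiet window; the requirements say only that reports "pause", so the deferral rule and the function names are assumptions.

```python
from datetime import datetime, time, timedelta

QUIET_START = time(7, 0)   # default from this section
QUIET_END = time(18, 0)


def in_quiet_hours(moment, start=QUIET_START, end=QUIET_END):
    """True if `moment` is in [start, end); handles windows that wrap past midnight."""
    t = moment.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def next_allowed_run(scheduled, start=QUIET_START, end=QUIET_END):
    """Defer a run that falls in quiet hours to the end of the quiet window."""
    if not in_quiet_hours(scheduled, start, end):
        return scheduled
    resume = scheduled.replace(hour=end.hour, minute=end.minute, second=0, microsecond=0)
    if resume <= scheduled:  # window wraps past midnight and ends tomorrow
        resume += timedelta(days=1)
    return resume
```

For example, a report scheduled for 9:30 AM under the default window would run at 6:00 PM the same day, while one scheduled for 8:00 PM would run on time.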
5.3 Delivery Options
| Option | Description |
|---|---|
| In-app notification | Alert when report completes (if enabled) |
| User Archive | Save to My Reports > Archive |
| Export endpoint | Admin-configured location for system integration |
5.4 Archive Management
Reports saved to user's personal Archive
Global quota: Admin-configurable, with an initial cap of 10 GB
Per-user quota: Admin-configurable
Archive accessible from My Reports section
Archived reports can be viewed, copied, shared, or deleted
5.5 Resource Management
Heavy operations can be interrupted with user confirmation
Option to schedule heavy reports for off-peak execution
Throttling to prevent system performance impact
Staggered execution for multiple scheduled reports
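Staggered execution can be as simple as spacing start times within the execution window. The sketch below assumes a fixed gap between reports; the five-minute gap, the function name, and the report identifiers are illustrative.

```python
from datetime import datetime, timedelta


def stagger_start_times(report_ids, window_start, gap=timedelta(minutes=5)):
    """Assign each scheduled report its own start time, spaced `gap` apart."""
    return {
        report_id: window_start + index * gap
        for index, report_id in enumerate(report_ids)
    }


starts = stagger_start_times(
    ["tat-summary", "rejections", "volume"],
    datetime(2025, 11, 3, 18, 0),
)
```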
6. Dashboard Widgets
6.1 Supported Widget Types
Tables (paginated data grids)
Bar charts
Line charts (trends over time)
Pie charts
KPI cards (single metric display)
Not in Scope: Maps (future consideration)
6.2 Visualization Selection
LLM suggests appropriate visualization based on data shape
User can override or change visualization type
Preview before saving
6.3 Layout & Performance
Drag-and-drop widget arrangement
Maximum widget count per dashboard (admin configurable)
Staggered loading to prevent simultaneous heavy queries
6.4 Refresh Behavior
Default interval: 24 hours
Configurable: Per-widget refresh interval
Manual refresh: Button per widget
Notification: Optional alert when refresh completes
6.5 Interactivity (Future Enhancement if Resource-Efficient)
Click chart segment to drill down
Click KPI card to see underlying data
Cross-filtering between widgets
6.6 Conditional Formatting & Alerts
Configure thresholds per widget
Visual indicators when thresholds breached:
Color changes (red/yellow/green)
Warning icons
Integration with built-in OpenELIS alerts system
Alert conditions:
Threshold breach (value above/below X)
Trend detection (X% change from previous period)
Missing data (expected results not returned)
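The three alert conditions above can be evaluated per widget value as in this sketch. The `AlertRule` type, the condition labels, and the treatment of a `None` value as missing data are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    upper: Optional[float] = None      # breach if value > upper
    lower: Optional[float] = None      # breach if value < lower
    trend_pct: Optional[float] = None  # alert if |change| >= trend_pct percent


def evaluate(rule, current, previous=None):
    """Return the alert conditions triggered for one widget value."""
    if current is None:
        return ["missing_data"]
    alerts = []
    above = rule.upper is not None and current > rule.upper
    below = rule.lower is not None and current < rule.lower
    if above or below:
        alerts.append("threshold_breach")
    # Skip trend detection when there is no usable previous value (None or 0).
    if rule.trend_pct is not None and previous:
        change = abs(current - previous) / abs(previous) * 100
        if change >= rule.trend_pct:
            alerts.append("trend")
    return alerts


# Example: turnaround time in hours, threshold 48 h, trend alert at 20% change.
tat_rule = AlertRule(upper=48.0, trend_pct=20.0)
```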
7. Sharing & Permissions
7.1 Role-Based Access Control
Catalyst respects existing OpenELIS role-based permissions:
Users can only query data they have permission to view
Case managers restricted to their program's patients
Lab managers may have broader access
Queries for unauthorized data return a permission error that suggests contacting an administrator
7.2 Report Sharing Levels
| Level | Who Can Share | Audience |
|---|---|---|
| Personal | Any user | Self only (default) |
| Lab Unit | Users with reports privileges for that unit | All users in that lab unit |
| All Users | Administrators only | Entire organization |
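The sharing levels in the table above can be expressed as a small permission check. The user representation (a dict with `report_units` and `is_admin` keys) is hypothetical; a real check would read these from the existing OpenELIS role model.

```python
from enum import Enum


class ShareLevel(Enum):
    PERSONAL = "personal"
    LAB_UNIT = "lab_unit"
    ALL_USERS = "all_users"


def can_share(user, level, lab_unit=None):
    """Apply the sharing table: who may share a report at each level."""
    if level is ShareLevel.PERSONAL:
        return True  # any user; the audience is the user alone
    if level is ShareLevel.LAB_UNIT:
        return lab_unit in user.get("report_units", ())
    return bool(user.get("is_admin", False))  # ALL_USERS: administrators only


# Hypothetical users.
tech = {"report_units": {"hematology"}, "is_admin": False}
admin = {"report_units": set(), "is_admin": True}
```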
7.3 Shared Reports Behavior
Shared reports appear in recipient's "Shared with Me" section
Recipients can hide shared reports they don't need
Shared reports show live data from current database (not snapshots)
Recipients can duplicate and modify their own copy
7.4 Snapshot Sharing
Specific data snapshots can be shared (point-in-time data)
Snapshots include clear date/time stamp
Use case: Share specific findings without ongoing access
Snapshot sharing logged in audit trail
7.5 Preset Reports
System-provided reports (~10 migrated from Jasper/PSQL)
Users can view and run presets
Users can duplicate and modify copies
Only administrators can edit original presets
Tagged with titles and descriptions
Configurable per implementation (not all presets required)
8. Administration
8.1 Configuration Settings
| Setting | Description | Default |
|---|---|---|
| LLM Connection | Local, Central, or Cloud API | Local |
| LLM Model Selection | Choose from available models | Implementation-specific |
| Token Limits | Max tokens per query | Admin-defined |
| Rate Limiting | Queries per user per time period | Admin-defined |
| Global Archive Quota | Total storage for all archives | 10 GB |
| Per-User Archive Quota | Storage limit per user |