# 🧠 Cortex Hub: AI Model Hub Service

Cortex Hub is a modular and scalable API service designed to act as a central gateway to various Large Language Models (LLMs). It features a stateful, session-based chat system with conversational memory, powered by a Retrieval-Augmented Generation (RAG) pipeline for grounding responses in your own data.

## ✨ Features

- **Conversational Memory:** Engages in stateful conversations through a session-based API. The AI remembers previous messages to provide contextual follow-up answers.
- **Retrieval-Augmented Generation (RAG):** Ingest documents into a high-speed FAISS vector store to ground the AI's responses in factual, user-provided data.
- **Multi-Provider Support:** Integrates with multiple LLM providers (currently DeepSeek and Gemini).
- **Full Document Lifecycle:** A complete API for adding, listing, and deleting documents in the knowledge base (see the sketch after this list).
- **Modern Tech Stack:** Built with FastAPI, Pydantic, SQLAlchemy, and DSPy for a robust, type-safe, high-performance backend.
- **Containerized:** Ships with Docker and Docker Compose for easy setup and deployment.
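
As a quick taste of the document lifecycle, the sketch below shows what ingesting, listing, and deleting a document might look like over HTTP. The `/documents` routes and payload fields are illustrative assumptions, not the documented API; consult the Swagger UI (see API Documentation below) for the actual contract.

```python
# documents_example.py -- hypothetical document-lifecycle calls.
# The /documents routes and payload fields are assumptions; check /docs for the real API.
import requests

BASE = "http://127.0.0.1:8000"

# Add a document to the knowledge base (indexed into the FAISS vector store).
doc = requests.post(
    f"{BASE}/documents",
    json={"content": "Cortex Hub is a gateway to multiple LLM providers."},
).json()

# List everything currently in the knowledge base.
print(requests.get(f"{BASE}/documents").json())

# Remove the document by its id.
requests.delete(f"{BASE}/documents/{doc['id']}")
```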

## 🚀 Getting Started

You can run the entire application stack (API server and database) using Docker Compose.

### Prerequisites

- Docker and Docker Compose
- Python 3.11+ (for local development)
- An API key for at least one supported LLM provider (DeepSeek or Gemini)

### 1. Configuration

The application is configured using a `.env` file for secrets and a `config.yaml` file for non-sensitive settings.

First, copy the example environment file:

```bash
cp .env.example .env
```

Now, open the `.env` file and add your secret API keys:

```bash
# .env
DEEPSEEK_API_KEY="your_deepseek_api_key_here"
GEMINI_API_KEY="your_gemini_api_key_here"
```

(You only need to provide a key for the provider you intend to use.)
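
For orientation, here is a minimal sketch of how these secrets might be loaded with `pydantic-settings`; the actual settings module in `app/` may look different, and the class and field names below are assumptions:

```python
# settings_sketch.py -- illustrative only; the real config module may differ.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Secrets come from .env; non-sensitive defaults live in config.yaml."""
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    deepseek_api_key: str | None = None  # maps to DEEPSEEK_API_KEY
    gemini_api_key: str | None = None    # maps to GEMINI_API_KEY

settings = Settings()
```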

### 2. Running with Docker Compose (Recommended)

This is the simplest way to get the service running.

```bash
docker-compose up --build
```

The API server will be available at http://127.0.0.1:8000.
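
To confirm the service is responding, you can fetch the auto-generated OpenAPI schema, which FastAPI serves at `/openapi.json` by default:

```python
# smoke_test.py -- quick check that the API server is up.
import requests

resp = requests.get("http://127.0.0.1:8000/openapi.json", timeout=5)
resp.raise_for_status()
print(f"Server is up; API title: {resp.json()['info']['title']}")
```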

### 3. Running Locally (Alternative)

If you prefer to run without Docker:

```bash
# Install dependencies
pip install -r requirements.txt

# Run the server
uvicorn app.main:app --host 127.0.0.1 --port 8000 --reload
```

## 💬 Usage

### Interactive Chat Script

The easiest way to interact with the service is by using the provided chat script. It handles starting the server, creating a session, and managing the conversation.

In your terminal, simply run:

```bash
bash run_chat.sh
```

You will be prompted to enter your questions in a continuous loop. Type `exit` to end the session and shut down the server.
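
If you prefer to call the API directly, the sketch below shows what a session-based exchange might look like. The endpoint paths and payload fields are illustrative assumptions, not the documented routes; check the Swagger UI described below for the actual contract.

```python
# chat_example.py -- hypothetical direct use of the session-based chat API.
# Endpoint paths and field names are assumptions; consult /docs for the real ones.
import requests

BASE = "http://127.0.0.1:8000"

# Create a session so the server can keep conversational memory.
session_id = requests.post(f"{BASE}/sessions").json()["session_id"]

# A follow-up question in the same session can rely on earlier context.
for question in ["What is Cortex Hub?", "Which providers does it support?"]:
    reply = requests.post(
        f"{BASE}/sessions/{session_id}/chat",
        json={"message": question},
    ).json()
    print(f"Q: {question}\nA: {reply.get('response')}\n")
```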

### API Documentation

Once the server is running, interactive API documentation (powered by Swagger UI) is automatically available at:

http://127.0.0.1:8000/docs

From this page, you can explore and execute all API endpoints directly from your browser.

## 🧪 Running Tests

The project includes a comprehensive test suite using `pytest`.

### Unit Tests

These tests cover individual components in isolation and use mocks for external services and the database.

```bash
pytest tests/
```
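
For illustration, a unit test in this style might mock the LLM provider and exercise a service in isolation. The `ChatService` below is a hypothetical stand-in, not one of the project's actual classes:

```python
# tests/test_example.py -- illustrative test shape; ChatService is hypothetical.
from unittest.mock import MagicMock

class ChatService:
    """Hypothetical service that delegates generation to an LLM provider."""
    def __init__(self, provider):
        self.provider = provider

    def ask(self, question: str) -> str:
        return self.provider.generate(question)

def test_chat_service_delegates_to_provider():
    # Mock the external provider so no network call is made.
    provider = MagicMock()
    provider.generate.return_value = "mocked answer"

    service = ChatService(provider)

    assert service.ask("What is RAG?") == "mocked answer"
    provider.generate.assert_called_once_with("What is RAG?")
```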

### Integration Tests

These tests run against a live instance of the server to verify the end-to-end functionality of the API. The script handles starting and stopping the server for you.

```bash
bash run_integration_tests.sh
```
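
An end-to-end test in this suite might look like the sketch below, which assumes the server is already running (the script handles that) and relies only on FastAPI's default `/openapi.json` route:

```python
# integration_tests/test_example.py -- illustrative end-to-end check.
import requests

BASE = "http://127.0.0.1:8000"

def test_server_exposes_openapi_schema():
    # The run script starts the server before tests execute.
    resp = requests.get(f"{BASE}/openapi.json", timeout=5)
    assert resp.status_code == 200
    assert "paths" in resp.json()
```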

๐Ÿ›๏ธ Project Structure

The project follows a standard, scalable structure for modern Python applications.

```
.
├── app/                  # Main application package
│   ├── api/              # API layer: routes, schemas, dependencies
│   ├── core/             # Core business logic: services, pipelines, providers
│   ├── db/               # Database layer: models, session management
│   ├── app.py            # FastAPI application factory
│   └── main.py           # Application entry point
├── config.yaml           # Default configuration
├── data/                 # Persistent data (SQLite DB, FAISS index)
├── integration_tests/    # End-to-end tests
├── tests/                # Unit tests
├── Dockerfile
└── docker-compose.yml
```