# Code Review Report: Feature 7 & 8 — RAG Infrastructure & Voice Services

This report performs a deep-dive audit of the Vector Search and Audio Synthesis layers, focusing on `faiss_store.py` and `gemini.py` (TTS) through the lens of **12-Factor App Methodology**, **Pythonic Code Style**, and **Concurrency Safety**.

---

## 🏗️ 12-Factor App Compliance Audit

| Factor | Status | Observation |
| :--- | :--- | :--- |
| **VI. Processes** | 🔴 **Major Issue** | **Concurrent Write Hazard (FAISS)**: `FaissVectorStore` (Lines 69, 106) calls `self.save_index()` synchronously on every document ingestion. Because `faiss.write_index` rewrites the entire file, two RAG sessions that add documents at the same time can race: one writer may overwrite the other's vectors, or a reader may load a partially written file, leaving the **on-disk FAISS index corrupted**. Writes should be serialized through a single writer (a lock or a singleton manager) or a write-ahead-logging (WAL) pattern. |
| **XI. Logs** | 🔴 **Security Warning** | **Credential Leak Potential**: The `GeminiTTSProvider` (Line 74) logs its API endpoint URL. For AI Studio keys, the `api_key` is **part of the URL**. The logged URL is currently truncated for debugging, but any change in truncation length, log level, or endpoint format risks exposing production API keys in plain-text logs. |
| **IX. Disposability** | 🟡 **Warning** | **In-Memory Audio Bloat**: `generate_speech` accumulates the entire audio result in memory (`b"".join(audio_fragments)`) before returning. For long-form text synthesis, this can cause significant Hub memory pressure and a high time-to-first-byte (TTFB) for the UI. |

---

## 🔍 File-by-File Diagnostic

### 1. `app/core/vector_store/faiss_store.py`
The local-first vector search engine using FAISS and SQLAlchemy.

**Identified Problems**:
*   **Stale ID Map**: `initialize_index` (Line 25) syncs with the DB on startup, but there's no mechanism to handle out-of-sync states if the DB is rolled back but the FAISS file is already written.
*   **Search Inefficiency**: `search_similar_documents` performs a three-stage query (FAISS search → Filter → ID Lookup). This introduces unnecessary overhead for small result sets.
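
The write hazard noted for `save_index` can be reduced by serializing writers and replacing the index file atomically. The sketch below is a minimal illustration, not the project's implementation: `SerializedIndexWriter` and its `write_fn` callback are hypothetical names, and the `threading.Lock` only protects writers inside a single process (multiple worker processes would need a file lock or a dedicated writer task instead).

```python
import os
import tempfile
import threading
from typing import Callable


class SerializedIndexWriter:
    """Hypothetical helper: one writer at a time, atomic file swap."""

    def __init__(self, index_path: str) -> None:
        self.index_path = index_path
        # Guards writers within ONE process only (assumption: single worker).
        self._lock = threading.Lock()

    def save(self, write_fn: Callable[[str], None]) -> None:
        """Run write_fn against a temp file, then atomically replace the target."""
        directory = os.path.dirname(os.path.abspath(self.index_path))
        with self._lock:
            # Temp file lives in the same directory so os.replace stays atomic.
            fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
            os.close(fd)
            try:
                write_fn(tmp_path)
                os.replace(tmp_path, self.index_path)
            except BaseException:
                # A failed write never touches the previous index file.
                if os.path.exists(tmp_path):
                    os.remove(tmp_path)
                raise
```

With FAISS, the callback would be something like `writer.save(lambda path: faiss.write_index(index, path))`. Readers then see either the old file or the new one, never a half-written file. In an async application the blocking `save` call would likely need to run in a worker thread.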

---

### 2. `app/core/providers/tts/gemini.py`
The Google Gemini/Vertex AI audio synthesis provider.

> [!CAUTION]
> **Lack of Stream Consumption Support**
> The provider is structured as an "All-or-Nothing" buffer (Line 151). This prevents streaming playback on the frontend, which is the standard for modern "agentic" voice interactions.
> **Fix**: Update the `generate_speech` method to be an `async generator` that yields audio chunks as they arrive from the Google stream.
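
The streaming shape proposed in the fix can be sketched as follows. This is an illustration under assumptions, not the provider's code: `stream_speech` and `fake_upstream` are hypothetical names, and `fake_upstream` stands in for whatever async stream the Google client actually exposes.

```python
import asyncio
from typing import AsyncIterator, List


async def stream_speech(upstream: AsyncIterator[bytes]) -> AsyncIterator[bytes]:
    """Yield audio chunks as they arrive instead of joining them in memory."""
    async for fragment in upstream:
        if fragment:  # skip empty fragments
            yield fragment


async def fake_upstream() -> AsyncIterator[bytes]:
    """Stand-in for the provider's response stream (assumption)."""
    for part in (b"RIFF", b"", b"data"):
        await asyncio.sleep(0)  # simulate waiting on the network
        yield part


async def collect() -> List[bytes]:
    """Consume the stream the way a caller or HTTP response would."""
    return [chunk async for chunk in stream_speech(fake_upstream())]


if __name__ == "__main__":
    print(asyncio.run(collect()))
```

Because each chunk is handed to the consumer as soon as it is received, peak memory is bounded by the chunk size rather than the full audio length, and the first bytes can reach the UI before synthesis finishes.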

**Identified Problems**:
*   **Vertex Region Lock**: The Vertex endpoint is hardcoded to `us-central1` (Line 58). This violates the requirement for data residency and configurable regions.
*   **Hardcoded Model Name**: Line 42 hardcodes a preview model name (`gemini-2.5-flash-preview-tts`). If Google deprecates this model, the Voice feature will break for all users until a code change is deployed.
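
Both problems above can be addressed by reading the values from the environment, in line with 12-Factor config. A minimal sketch, assuming hypothetical variable names `TTS_VERTEX_REGION` and `TTS_MODEL_NAME`, with the values currently in the code kept as defaults:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TTSSettings:
    vertex_region: str
    model_name: str


def load_tts_settings(env: Optional[Mapping[str, str]] = None) -> TTSSettings:
    """Read TTS settings from the environment (variable names are assumptions)."""
    env = os.environ if env is None else env
    return TTSSettings(
        # Defaults mirror the values that are hardcoded today.
        vertex_region=env.get("TTS_VERTEX_REGION", "us-central1"),
        model_name=env.get("TTS_MODEL_NAME", "gemini-2.5-flash-preview-tts"),
    )
```

With this in place, switching region for data residency or moving off a deprecated model becomes a deployment setting rather than a code change.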

---

## 🛠️ Summary Recommendations

1.  **Harden FAISS Synchronization**: Implement a lock or specialized "Vector Writer" task to serialize `save_index` calls and prevent index corruption during concurrent ingestion.
2.  **Sanitize Logging**: Remove API URLs from standard `debug` logs in the TTS/STT providers. Use masked/redacted strings for sensitive metadata.
3.  **Implement Streaming TTS**: Refactor the TTS interface to support chunked delivery, reducing TTFB and Hub memory usage.
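
Recommendation 2 could be sketched as a small helper that masks credential-bearing query parameters before a URL reaches the logger. The helper name `redact_url` and the parameter names in `SENSITIVE_PARAMS` are assumptions for illustration; the actual key parameter used by the provider should be confirmed against its code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of query parameters that may carry credentials.
SENSITIVE_PARAMS = {"key", "api_key", "access_token"}


def redact_url(url: str) -> str:
    """Return the URL with sensitive query parameter values masked."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A call site would then log `redact_url(endpoint)` instead of the raw endpoint, so the output stays safe regardless of log level or truncation length.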

---

**This concludes the review of Features 7 & 8. The reports are persisted to `/app/docs/reviews/`.**