Architektura Systemu

Ten rozdzial pokazuje aktualny przeplyw danych w Jarvis po ostatnich zmianach: server-side sessions w Redis, kolejke pamieci, worker pamieci, lokalny LLM dla ekstrakcji faktow oraz opcjonalny tor glosowy przez Speaches.

Widok Komponentow

flowchart LR
    browser[Browser UI]
    web[Flask web app]
    pg[(Postgres + pgvector)]
    redis[(Redis)]
    worker[memory-worker]
    ollama[Ollama / local LLM]
    openai[OpenAI API]
    speaches[Speaches]
    chatterbox[Chatterbox TTS]
    azure[Azure Boards]
    discord[Discord]
    captcha[reCAPTCHA / MockServer]

    browser -->|HTTP, JSON, Jinja2| web
    web -->|ORM| pg
    web -->|sessions, rate limits, TTS cache, memory jobs| redis
    web -->|chat completions| openai
    web -->|speech synthesis| speaches
    web -->|alternative speech synthesis| chatterbox
    web -->|bug reports| azure
    web -->|webhook, OAuth callback| discord
    web -->|registration check| captcha

    redis -->|BRPOP jarvis:memory:jobs| worker
    worker -->|read history, write summaries and memories| pg
    worker -->|OpenAI-compatible /v1/chat/completions| ollama

Przeplyw Chatu

sequenceDiagram
    participant U as User Browser
    participant W as Flask /chat/ask
    participant DB as Postgres
    participant R as Redis
    participant O as OpenAI
    participant MW as memory-worker
    participant L as Local LLM

    U->>W: POST /chat/ask
    W->>DB: load conversation, config, recent history
    W->>DB: load ConversationSummary and MemoryItem
    W->>O: generate answer with memory context
    O-->>W: assistant response
    W->>DB: save ConversationHistory
    W->>R: LPUSH jarvis:memory:jobs
    W-->>U: response JSON

    MW->>R: BRPOP jarvis:memory:jobs
    R-->>MW: memory job
    MW->>DB: load new history and current summary
    MW->>L: extract summary and memory candidates
    L-->>MW: validated JSON payload
    MW->>DB: upsert ConversationSummary and MemoryItem

Najwazniejsza zasada: odpowiedz dla uzytkownika nie czeka na ekstrakcje pamieci. Chat zapisuje historie i wrzuca job do Redis, a worker aktualizuje pamiec w tle.

Przeplyw Glosu

flowchart TD
    ui[chat.js / speech.js]
    config[GET /speech/config]
    synth[POST /speech/synthesize]
    svc[speech_service.py]
    cache[(Redis TTS cache)]
    sp[Speaches /v1/audio/speech]
    cb[Chatterbox /v1/audio/speech]
    voices[backend/data/speaches-voices.json]
    browser[Browser speech fallback]

    ui --> config
    config --> voices
    ui -->|server TTS enabled| synth
    synth --> svc
    svc --> cache
    svc --> sp
    svc --> cb
    sp --> svc
    cb --> svc
    svc --> cache
    svc --> ui
    ui -->|server TTS disabled or unavailable| browser

Speaches jest traktowany jako lokalny OpenAI-compatible serwis glosowy. Backend waliduje model i glos przeciwko katalogowi w repozytorium, a UI automatycznie dobiera model TTS do wybranego glosu. Chatterbox jest alternatywnym providerem przez OpenAI-compatible endpoint /v1/audio/speech; mozna go uruchomic z opcjonalnym profilem Docker Compose.

Model Danych

erDiagram
    USERS ||--o{ CONVERSATION : owns
    USERS ||--o{ CONVERSATION_HISTORY : writes
    USERS ||--o{ USER_CONFIG : configures
    USERS ||--o{ DISCORD_ACCOUNTS : links
    USERS ||--o{ MEMORY_ITEMS : remembers
    USERS ||--o{ CONVERSATION_SUMMARY : summarizes
    CONVERSATION ||--o{ CONVERSATION_HISTORY : contains
    CONVERSATION ||--o{ CONVERSATION_SUMMARY : has
    CONVERSATION ||--o{ MEMORY_ITEMS : source
    CONVERSATION ||--o{ DISCORD_CONVERSATIONS : maps

    USERS {
        string id PK
        string username
        string email
        datetime created_at
    }
    CONVERSATION {
        string conversation_id PK
        string user_id FK
        string platform
        datetime created_at
    }
    CONVERSATION_HISTORY {
        int id PK
        string conversation_id FK
        string prompt
        string response
        int used_tokens
        datetime timestamp
    }
    DISCORD_CONVERSATIONS {
        int id PK
        string scope_key
        string scope_type
        string guild_id
        string channel_id
        string discord_user_id
        string conversation_id FK
    }
    CONVERSATION_SUMMARY {
        int id PK
        string conversation_id FK
        string user_id FK
        text summary
        int message_count
        datetime updated_at
    }
    MEMORY_ITEMS {
        int id PK
        string user_id FK
        string conversation_id FK
        string kind
        text content
        float salience
        text embedding_json
    }
    USER_CONFIG {
        int id PK
        string user_id FK
        string model
        float temperature
        bool speaches_enabled
        string speaches_tts_model
        string speaches_tts_voice
    }

Uwagi Operacyjne

Redis jest wspoldzielony przez sesje Flask, rate limiting, cache TTS i kolejke jarvis:memory:jobs.
memory-worker uzywa tego samego obrazu aplikacji co web, ale uruchamia modul backend.workers.memory_worker.
Domyslny lokalny extractor moze dzialac heurystycznie albo przez qwen2.5:3b-instruct za OpenAI-compatible endpointem Ollamy.
Postgres w Compose uzywa obrazu pgvector/pgvector:pg13, zeby przygotowac storage pod przyszle embeddings i vector search.
Sekrety pozostaja poza repozytorium. Pliki typu jarvis-dev.env sa lokalne i nie powinny byc kopiowane do dokumentacji ani logow.