---
url: 'https://docs.snapotter.com/api/ai.md'
---

# AI Engine Reference

The `@snapotter/ai` package bridges Node.js to a **persistent Python sidecar** for all ML operations, exposed through 14 AI tool routes. The dispatcher process stays alive between requests for fast warm-start performance. GPU is auto-detected at startup and used when available. All models run locally - no internet access is required after the initial model download.

## Architecture

```
Node.js Tool Route
  │
  ▼
@snapotter/ai bridge.ts
  │  (stdin/stdout JSON + stderr progress events)
  ▼
Python dispatcher (persistent process)
  │
  ├─ remove_bg.py        (rembg / BiRefNet)
  ├─ upscale.py          (RealESRGAN)
  ├─ inpaint.py          (LaMa ONNX)
  ├─ ocr.py              (PaddleOCR / Tesseract)
  ├─ detect_faces.py     (MediaPipe)
  ├─ face_landmarks.py   (MediaPipe landmarks)
  ├─ enhance_faces.py    (GFPGAN / CodeFormer)
  ├─ colorize.py         (DDColor)
  ├─ noise_removal.py    (tiered denoising)
  ├─ red_eye_removal.py  (landmark + color analysis)
  ├─ restore.py          (scratch repair + enhancement + denoising)
  └─ seam_carving        (Go caire binary - not Python)
```

**Timeouts:** 300 s default; OCR and BiRefNet background removal get 600 s.
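The bridge and dispatcher exchange JSON over stdin/stdout, with progress events on stderr. The exact wire schema is not documented on this page, so the field names below (`op`, `input`, `settings`, `progress`) are illustrative assumptions; a minimal sketch of framing a request and parsing a progress event:

```python
import json

def frame_request(op: str, input_path: str, settings: dict) -> str:
    """Serialize one request as a newline-terminated JSON line.
    Field names are assumptions, not the documented wire format."""
    return json.dumps({"op": op, "input": input_path, "settings": settings}) + "\n"

def parse_progress(stderr_line: str):
    """Progress events arrive on stderr; treat non-JSON lines as plain logs."""
    try:
        event = json.loads(stderr_line)
    except json.JSONDecodeError:
        return None
    return event.get("progress")  # e.g. a 0.0-1.0 fraction, as in the jobs table
```

Keeping the request a single line makes the protocol trivial to demultiplex on the Node.js side: one line in, one line out, and anything on stderr that fails to parse as JSON is just a log message.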
## Background Removal

**Function:** `removeBackground`\
**Tool route:** `remove-background`\
**Model:** rembg with BiRefNet (default) or U2-Net variants

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | string | `birefnet-general` | Model variant - see table below |
| `alphaMattingForeground` | number (1–255) | 240 | Foreground threshold for alpha matting |
| `alphaMattingBackground` | number (1–255) | 10 | Background threshold for alpha matting |
| `returnMask` | boolean | false | Return the mask instead of the cutout |
| `backgroundColor` | string | - | Fill removed area (hex color or "transparent") |

**Available models:**

| Model ID | Best for |
|----------|----------|
| `birefnet-general` | General purpose (default) |
| `birefnet-portrait` | People / portraits |
| `birefnet-dis` | Dichotomous Image Segmentation |
| `birefnet-hrsod` | High-resolution salient objects |
| `birefnet-cod` | Camouflaged objects |
| `u2net` | Fast general purpose |
| `u2net_human_seg` | Human segmentation |
| `isnet-general-use` | High quality general |

## Image Upscaling

**Function:** `upscale`\
**Tool route:** `upscale`\
**Model:** RealESRGAN (with Lanczos fallback on CPU-constrained systems)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `scale` | `2` \| `4` | `4` | Upscale factor |
| `model` | string | `realesrgan-x4plus` | Model variant |
| `faceEnhance` | boolean | false | Apply GFPGAN face enhancement pass |
| `denoise` | number (0–1) | 0.5 | Denoising strength |
| `format` | string | - | Output format override |
| `quality` | number | 95 | Output quality (for JPEG/WebP) |

## OCR / Text Extraction

**Function:** `extractText`\
**Tool route:** `ocr`\
**Models:** Tesseract (fast), PaddleOCR PP-OCRv5 (balanced), PaddleOCR-VL 1.5 (best)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `quality` | `fast` \| `balanced` \| `best` | `balanced` | Processing tier |
| `language` | string | `en` | Language code (ISO 639-1) |
| `enhance` | boolean | false | Pre-process image to improve OCR accuracy |

Returns structured results with bounding boxes, confidence scores, and extracted text blocks.

## Face / PII Blur

**Function:** `blurFaces`\
**Tool route:** `blur-faces`\
**Model:** MediaPipe face detection

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `blurRadius` | number | 30 | Gaussian blur radius |
| `sensitivity` | number (0–1) | 0.5 | Detection confidence threshold |

## Face Enhancement

**Function:** `enhanceFaces`\
**Tool route:** `enhance-faces`\
**Models:** GFPGAN, CodeFormer

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `gfpgan` \| `codeformer` | `gfpgan` | Enhancement model |
| `strength` | number (0–1) | 0.7 | Enhancement strength |
| `sensitivity` | number (0–1) | 0.5 | Face detection threshold |
| `centerFace` | boolean | false | Focus enhancement on center face only |

## AI Colorization

**Function:** `colorize`\
**Tool route:** `colorize`\
**Model:** DDColor (with OpenCV DNN fallback)

Converts black-and-white or grayscale photos to full color.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `intensity` | number (0–1) | 0.85 | Color saturation strength |
| `model` | string | `ddcolor` | Model variant |

## Noise Removal

**Function:** `noiseRemoval`\
**Tool route:** `noise-removal`

Three-tier denoising pipeline (fast: OpenCV bilateral filter; balanced: frequency-domain; best: deep learning model).
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `quality` | `fast` \| `balanced` \| `best` | `balanced` | Processing tier |
| `strength` | number (0–1) | 0.5 | Denoising strength |
| `preserveDetail` | boolean | true | Edge-preserving mode |
| `colorNoise` | boolean | false | Target color noise specifically |

## Red Eye Removal

**Function:** `removeRedEye`\
**Tool route:** `red-eye-removal`

Detects face landmarks, locates eye regions, and corrects red-channel oversaturation.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `sensitivity` | number (0–1) | 0.5 | Red pixel detection threshold |
| `strength` | number (0–1) | 0.9 | Correction strength |

## Photo Restoration

**Function:** `restorePhoto`\
**Tool route:** `restore-photo`

Multi-step pipeline for old or damaged photos: scratch/tear detection and repair → face enhancement → denoising → optional colorization.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mode` | `auto` \| `light` \| `heavy` | `auto` | Restoration intensity |
| `scratchRemoval` | boolean | true | Detect and repair scratches, tears |
| `faceEnhancement` | boolean | true | Apply face enhancement pass |
| `fidelity` | number (0–1) | 0.7 | Face enhancement strength |
| `denoise` | boolean | true | Apply denoising pass |
| `denoiseStrength` | number (0–100) | 40 | Denoising strength |
| `colorize` | boolean | false | Colorize after restoration |

## Passport Photo

**Function:** Uses `detectFaceLandmarks` + `removeBackground`\
**Tool route:** `passport-photo`\
**Model:** MediaPipe face landmarks

Generates government-compliant ID photos. Supports **37 countries** across 6 regions (Americas, Europe, Asia, Africa, Oceania, Middle East). Each spec includes physical dimensions, DPI, head-height ratio, eye-line position, and background color requirements.
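A country spec's physical dimensions, DPI, head-height ratio, and eye-line position together determine the pixel geometry of the output. The internal spec format is not documented here, so the field names and the sample values below are illustrative assumptions; a minimal sketch of the conversion:

```python
def passport_geometry(width_mm, height_mm, dpi, head_ratio, eye_line):
    """Derive pixel-level framing targets from a country spec.

    head_ratio: desired head height as a fraction of photo height.
    eye_line:   desired eye position as a fraction from the top edge.
    Field names are illustrative; the real spec format may differ.
    """
    mm_per_inch = 25.4
    width_px = round(width_mm / mm_per_inch * dpi)
    height_px = round(height_mm / mm_per_inch * dpi)
    return {
        "width": width_px,
        "height": height_px,
        "head_height": round(height_px * head_ratio),  # target head size in px
        "eye_y": round(height_px * eye_line),          # target eye row in px
    }
```

For a hypothetical 50.8 × 50.8 mm spec at 300 DPI with a 0.69 head ratio, this yields a 600 × 600 px canvas with a ~414 px head-height target; the crop is then solved so the detected landmarks hit those targets.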
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `country` | string | `us` | ISO country code (see list in UI) |
| `printLayout` | `4x6` \| `A4` \| `none` | `none` | Output as print sheet or standalone |
| `backgroundColor` | string | country default | Background fill color |

## Object Erasing (Inpainting)

**Function:** `inpaint`\
**Tool route:** `erase-object`\
**Model:** LaMa via ONNX Runtime

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `maskData` | string | Yes | Base64-encoded PNG mask (white = erase) |
| `maskThreshold` | number (0–255) | No | Threshold for mask binarization |

GPU-accelerated when an NVIDIA GPU is available.

## Smart Crop

**Function:** Uses MediaPipe + Sharp attention/entropy\
**Tool route:** `smart-crop`\
**Model:** MediaPipe face detection

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mode` | `subject` \| `face` \| `trim` | `subject` | Crop strategy |
| `width` | number | - | Output width |
| `height` | number | - | Output height |
| `facePreset` | string | - | Preset framing when `mode=face` |

**Face presets:**

| Preset | Head ratio | Best for |
|--------|-----------|----------|
| `close-up` | 1.8× face | Headshots |
| `head-and-shoulders` | 2.8× face | Profile photos |
| `upper-body` | 4.5× face | LinkedIn / formal |
| `half-body` | 7.0× face | Full upper body |

## Image Enhancement

**Function:** `analyzeImage` + `applyCorrections`\
**Tool route:** `image-enhancement`\
**Engine:** Analysis-based (Sharp histogram and statistics)

Analyzes the image and applies automatic corrections for exposure, contrast, white balance, saturation, sharpness, and noise. Supports scene-specific modes.
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `mode` | `auto` \| `portrait` \| `landscape` \| `low-light` \| `food` \| `document` | `auto` | Scene mode for tuning corrections |
| `intensity` | number (0–100) | 50 | Overall correction strength |
| `corrections.exposure` | boolean | true | Apply exposure correction |
| `corrections.contrast` | boolean | true | Apply contrast correction |
| `corrections.whiteBalance` | boolean | true | Apply white balance correction |
| `corrections.saturation` | boolean | true | Apply saturation correction |
| `corrections.sharpness` | boolean | true | Apply sharpness correction |
| `corrections.denoise` | boolean | true | Apply denoising |

An additional analysis endpoint is available at `POST /api/v1/tools/image-enhancement/analyze` which returns the detected corrections without applying them.

## Content-Aware Resize (Seam Carving)

**Function:** `seamCarve`\
**Tool route:** `content-aware-resize`\
**Engine:** Go `caire` binary (not Python - no GPU benefit)

Intelligently resizes images by removing or adding low-energy seams, preserving important content.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `width` | number | - | Target width |
| `height` | number | - | Target height |
| `protectFaces` | boolean | true | Protect detected face regions from seam removal |
| `blurRadius` | number | 0 | Pre-blur to reduce noise sensitivity |
| `sobelThreshold` | number | 10 | Edge sensitivity threshold |
| `square` | boolean | false | Force square output |

Max input edge before auto-downscaling: **1200 px**.

---

---
url: 'https://docs.snapotter.com/guide/architecture.md'
---

# Architecture

SnapOtter is a monorepo managed with pnpm workspaces and Turborepo. Everything ships as a single Docker container.
## Project structure

```
snapotter/
├── apps/
│   ├── api/            # Fastify backend
│   ├── web/            # React + Vite frontend
│   └── docs/           # This VitePress site
├── packages/
│   ├── image-engine/   # Sharp-based image operations
│   ├── ai/             # Python AI model bridge
│   └── shared/         # Types, constants, i18n
└── docker/             # Dockerfile and Compose config
```

## Packages

### `@snapotter/image-engine`

The core image processing library built on [Sharp](https://sharp.pixelplumbing.com/). It handles all non-AI operations: resize, crop, rotate, flip, convert, compress, strip metadata, and color adjustments (brightness, contrast, saturation, grayscale, sepia, invert, color channels). This package has no network dependencies and runs entirely in-process.

### `@snapotter/ai`

A bridge layer that calls Python scripts for ML operations. On first use, the bridge starts a persistent Python dispatcher process that pre-imports heavy libraries (PIL, NumPy, MediaPipe, rembg) so subsequent AI calls skip the import overhead. If the dispatcher is not yet ready, the bridge falls back to spawning a fresh Python subprocess per request.

**Models are not pre-loaded.** Each tool script loads its model weights from disk at request time and discards them when the request finishes. See [Resource footprint](#resource-footprint) for the full memory profile.

Supported operations: background removal (rembg/BiRefNet), upscaling (RealESRGAN), face blur (MediaPipe), face enhancement (GFPGAN/CodeFormer), object erasing (LaMa ONNX), OCR (PaddleOCR/Tesseract), colorization (DDColor), noise removal, red eye removal, photo restoration, passport photo generation, and content-aware resize (Go caire binary).

Python scripts live in `packages/ai/python/`. The Docker image pre-downloads all model weights during the build so the container works fully offline.

### `@snapotter/shared`

Shared TypeScript types, constants (like `APP_VERSION` and tool definitions), and i18n translation strings used by both the frontend and backend.
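The dispatcher pattern described above - one long-lived process that pre-imports heavy libraries and then serves requests over stdin/stdout - can be sketched as follows. The handler registry and message fields are illustrative assumptions, not the actual code in `packages/ai/python/`:

```python
import json
import sys

# In the real dispatcher, heavy imports (PIL, NumPy, MediaPipe, rembg)
# would happen once here, so per-request handlers skip the import cost.

HANDLERS = {}

def handler(op):
    """Register a tool handler under an operation name."""
    def register(fn):
        HANDLERS[op] = fn
        return fn
    return register

@handler("ping")
def ping(settings):
    return {"ok": True}

def handle_line(line: str) -> str:
    """Process one JSON request line and return one JSON response line."""
    request = json.loads(line)
    fn = HANDLERS.get(request["op"])
    if fn is None:
        return json.dumps({"error": f"unknown op: {request['op']}"})
    return json.dumps(fn(request.get("settings", {})))

def serve():
    """Block on stdin forever; the Node.js bridge keeps this process alive."""
    for line in sys.stdin:
        sys.stdout.write(handle_line(line) + "\n")
        sys.stdout.flush()
```

The fallback path is then simply: if this process has not yet signalled readiness, the bridge spawns `python tool_script.py` directly for that one request instead.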
## Applications

### API (`apps/api`)

A Fastify v5 server exposing 47 tool routes (33 standard image operations + 14 AI-powered) that handles:

* File uploads, temporary workspace management, and persistent file storage
* User file library with version chains (`user_files` table) -- each processed result links back to its source file and records which tool was applied, with auto-generated thumbnails for the Files page
* Tool execution (routes each tool request to the image engine or AI bridge)
* Pipeline orchestration (chaining multiple tools sequentially)
* Batch processing with concurrency control via p-queue
* User authentication, RBAC (admin/user roles with a full permission set), API key management, and rate limiting
* Teams management -- admin-only CRUD; users are assigned to a team via the `team` field on their profile
* Runtime settings -- a key-value store in the `settings` table that controls `disabledTools`, `enableExperimentalTools`, `loginAttemptLimit`, and other operational knobs without redeploying
* Custom branding -- logo upload endpoint; the uploaded image is stored at `data/branding/logo.png` and served to the frontend
* Swagger/OpenAPI documentation at `/api/docs`
* Serving the built frontend as a SPA in production

Key dependencies: Fastify, Drizzle ORM, better-sqlite3, Sharp, Piscina (worker thread pool), Zod for validation.

The server handles graceful shutdown on SIGTERM/SIGINT: it drains HTTP connections, stops the worker pool, shuts down the Python dispatcher, and closes the database.

### Web (`apps/web`)

A React 19 single-page app built with Vite. Uses Zustand for state management, Tailwind CSS v4 for styling, and Lucide for icons. Communicates with the API over REST and SSE (for progress tracking). Pages include a tool workspace, a Files page for managing persistent uploads and results, an automation/pipeline builder, and an admin settings panel.
The built frontend gets served by the Fastify backend in production, so there is no separate web server in the Docker container.

### Docs (`apps/docs`)

This VitePress site. Deployed to GitHub Pages automatically on push to `main`.

## How a request flows

1. The user picks a tool in the web UI and uploads an image.
2. The frontend sends a multipart POST to `/api/v1/tools/:toolId` with the file and settings.
3. The API route validates the input with Zod, then dispatches processing.
4. For standard tools, the request is offloaded to a Piscina worker thread pool so Sharp operations don't block the main event loop. The worker auto-orients the image based on EXIF metadata, runs the tool's process function, and returns the result. If the worker pool is unavailable, processing falls back to the main thread.
5. For AI tools, the TypeScript bridge sends a request to the persistent Python dispatcher (or spawns a fresh subprocess as fallback), waits for it to finish, and reads the output file.
6. Job progress is persisted to the `jobs` SQLite table so state survives container restarts. Real-time updates are delivered via SSE at `/api/v1/jobs/:jobId/progress`.
7. The API returns a `jobId` and `downloadUrl`. The user downloads the processed image from `/api/v1/download/:jobId/:filename`.

For pipelines, the API feeds the output of each step as input to the next, running them sequentially. For batch processing, the API uses p-queue with a configurable concurrency limit (`CONCURRENT_JOBS`) and returns a ZIP file with all processed images.

## Resource footprint

SnapOtter is designed for low idle memory use. Nothing is preloaded or kept warm at startup.

### At idle

Only the Node.js/Fastify process is running. Typical idle RAM is **~100-150 MB** (Node.js process + SQLite connection). No Python process, no worker threads, no model weights in memory.
### What starts, and when

| Component | Starts when | Memory while active |
|-----------|-------------|---------------------|
| Fastify server | Container start | ~100-150 MB |
| Piscina worker threads | First standard tool request | Spawned on demand, terminated after **30 s idle** |
| Python dispatcher | First AI tool request | Python interpreter + pre-imported libraries (PIL, NumPy, MediaPipe, rembg) - no model weights |
| AI model weights | During the specific tool's request | Loaded from disk, freed when the request finishes |

### Model loading

All model weight files (totalling several GB) sit on disk in `/opt/models/` at all times. Each AI tool script loads only its own model(s) into memory for the duration of a request, then releases them. Some scripts explicitly call `del model` and `torch.cuda.empty_cache()` after inference to ensure memory is returned immediately.

There is no model cache between requests. Running the same AI tool back-to-back reloads the model each time. This keeps idle memory near zero at the cost of a model-load delay on every AI request.

### First AI request cold start

The Python dispatcher is not running when the container starts. The first AI request triggers two things in parallel: the dispatcher starts warming up in the background, and the request itself falls back to a one-off Python subprocess spawn. Once the dispatcher signals ready, all subsequent AI requests use it directly and skip the subprocess spawn cost.

---

---
url: 'https://docs.snapotter.com/guide/configuration.md'
---

# Configuration

All configuration is done through environment variables. Every variable has a sensible default, so SnapOtter works out of the box without setting any of them.

## Environment variables

### Server

| Variable | Default | Description |
|---|---|---|
| `PORT` | `1349` | Port the server listens on. |
| `RATE_LIMIT_PER_MIN` | `0` (disabled) | Maximum requests per minute per IP. Set to 0 to disable rate limiting. |
| `CORS_ORIGIN` | (empty) | Comma-separated allowed origins for CORS, or empty for same-origin only. |
| `LOG_LEVEL` | `info` | Log verbosity. One of: `fatal`, `error`, `warn`, `info`, `debug`, `trace`. |
| `TRUST_PROXY` | `true` | Trust `X-Forwarded-For` headers from a reverse proxy. Set to `false` if not behind a proxy. |

### Authentication

| Variable | Default | Description |
|---|---|---|
| `AUTH_ENABLED` | `false` | Set to `true` to require login. The Docker image defaults to `true`. |
| `DEFAULT_USERNAME` | `admin` | Username for the initial admin account. Only used on first run. |
| `DEFAULT_PASSWORD` | `admin` | Password for the initial admin account. Change this after first login. |
| `MAX_USERS` | `0` (unlimited) | Maximum number of registered user accounts. Set to 0 for unlimited. |
| `SESSION_DURATION_HOURS` | `168` | Login session lifetime in hours (default is 7 days). |
| `SKIP_MUST_CHANGE_PASSWORD` | - | Set to any non-empty value to bypass the forced password-change prompt on first login. |

### Storage

| Variable | Default | Description |
|---|---|---|
| `STORAGE_MODE` | `local` | `local` or `s3`. Only local storage is currently implemented. |
| `DB_PATH` | `./data/snapotter.db` | Path to the SQLite database file. |
| `WORKSPACE_PATH` | `./tmp/workspace` | Directory for temporary files during processing. Cleaned up automatically. |
| `FILES_STORAGE_PATH` | `./data/files` | Directory for persistent user files (uploaded images, saved results). |

### Processing limits

| Variable | Default | Description |
|---|---|---|
| `MAX_UPLOAD_SIZE_MB` | `0` (unlimited) | Maximum file size per upload in megabytes. Set to 0 for unlimited. |
| `MAX_BATCH_SIZE` | `0` (unlimited) | Maximum number of files in a single batch request. Set to 0 for unlimited. |
| `CONCURRENT_JOBS` | `0` (auto) | Number of batch jobs that run in parallel. Set to 0 to auto-detect based on available CPU cores. |
| `MAX_MEGAPIXELS` | `0` (unlimited) | Maximum image resolution allowed in megapixels. Set to 0 for unlimited. |
| `MAX_WORKER_THREADS` | `0` (auto) | Maximum worker threads for image processing. Set to 0 to auto-detect based on available CPU cores. |
| `PROCESSING_TIMEOUT_S` | `0` (no limit) | Maximum processing time per request in seconds. Set to 0 for no timeout. |
| `MAX_PIPELINE_STEPS` | `0` (no limit) | Maximum number of steps in a pipeline. Set to 0 for no limit. |
| `MAX_CANVAS_PIXELS` | `0` (no limit) | Maximum canvas size in pixels for output images. Set to 0 for no limit. |
| `MAX_SVG_SIZE_MB` | `0` (unlimited) | Maximum SVG file size in megabytes. Set to 0 for unlimited. |
| `MAX_LOGO_SIZE_KB` | `500` | Maximum custom branding logo size in kilobytes. |
| `MAX_SPLIT_GRID` | `100` | Maximum grid dimension for the image split tool. |
| `MAX_PDF_PAGES` | `0` (unlimited) | Maximum number of PDF pages for PDF-to-image conversion. Set to 0 for unlimited. |

### Cleanup

| Variable | Default | Description |
|---|---|---|
| `FILE_MAX_AGE_HOURS` | `72` | How long temporary files are kept before automatic deletion. |
| `CLEANUP_INTERVAL_MINUTES` | `60` | How often the cleanup job runs. |

### Appearance

| Variable | Default | Description |
|---|---|---|
| `APP_NAME` | `SnapOtter` | Display name shown in the UI. |
| `DEFAULT_THEME` | `light` | Default theme for new sessions. `light` or `dark`. |
| `DEFAULT_LOCALE` | `en` | Default interface language. |

### Docker permissions

| Variable | Default | Description |
|---|---|---|
| `PUID` | `999` | Run the container process as this UID. Set to match your host user for bind mounts (`id -u`). |
| `PGID` | `999` | Run the container process as this GID. Set to match your host group for bind mounts (`id -g`). |

## Docker example

```yaml
services:
  SnapOtter:
    image: snapotterhq/snapotter:latest
    ports:
      - "1349:1349"
    volumes:
      - SnapOtter-data:/data
      - SnapOtter-workspace:/tmp/workspace
    environment:
      - AUTH_ENABLED=true
      - DEFAULT_USERNAME=admin
      - DEFAULT_PASSWORD=changeme
      - MAX_UPLOAD_SIZE_MB=200
      - CONCURRENT_JOBS=4
      - FILE_MAX_AGE_HOURS=12
    restart: unless-stopped

volumes:
  SnapOtter-data:
  SnapOtter-workspace:
```

## Volumes

The Docker container uses two volumes:

* `/data` -- Persistent storage for the SQLite database and user files. Mount this to keep users, API keys, saved pipelines, and uploaded images across container restarts.
* `/tmp/workspace` -- Temporary storage for images being processed. This can be ephemeral, but mounting it avoids filling up the container's writable layer.

---

---
url: 'https://docs.snapotter.com/guide/contributing.md'
---

# Contributing

Thanks for your interest in SnapOtter. Community feedback helps shape the project, and there are several ways to get involved.

## How to contribute

The best way to contribute is through [GitHub Issues](https://github.com/snapotter-hq/snapotter/issues):

* **Bug reports** - Found something broken? Open a bug report with steps to reproduce, your Docker setup, and what you expected to happen.
* **Feature requests** - Have an idea for a new tool or improvement? Describe the problem you want solved and why it matters to you.
* **Feedback** - Thoughts on the UI, workflow, documentation, or anything else? We want to hear it.

## Pull requests

We do not accept pull requests. All development is handled internally to maintain architectural consistency and code quality across the project.

If you have found a bug, open an issue describing it rather than submitting a fix. If you have a suggestion for how something should work, describe it in a feature request. Your input is valuable even without a code contribution.
## Forking

You are welcome to fork the project for your own use under the terms of the [AGPLv3 license](https://github.com/snapotter-hq/snapotter/blob/main/LICENSE). The [Developer Guide](/guide/developer) covers setup, architecture, and how to add new tools.

## Security

If you discover a security vulnerability, please report it privately through [GitHub Security Advisories](https://github.com/snapotter-hq/snapotter/security/advisories/new) rather than opening a public issue.

---

---
url: 'https://docs.snapotter.com/guide/database.md'
---

# Database

SnapOtter uses SQLite with [Drizzle ORM](https://orm.drizzle.team/) for data persistence. The schema is defined in `apps/api/src/db/schema.ts`.

The database file lives at the path set by `DB_PATH` (defaults to `./data/snapotter.db`). In Docker, mount the `/data` volume to persist it across container restarts.

## Tables

### users

Stores user accounts. Created automatically on first run from `DEFAULT_USERNAME` and `DEFAULT_PASSWORD`.

| Column | Type | Notes |
|---|---|---|
| `id` | integer | Primary key, auto-increment |
| `username` | text | Unique, required |
| `passwordHash` | text | bcrypt hash |
| `role` | text | `admin` or `user` |
| `mustChangePassword` | integer | Boolean flag for forced password reset |
| `createdAt` | text | ISO timestamp |
| `updatedAt` | text | ISO timestamp |

### sessions

Active login sessions. Each row ties a session token to a user.

| Column | Type | Notes |
|---|---|---|
| `id` | text | Primary key (session token) |
| `userId` | integer | Foreign key to `users.id` |
| `expiresAt` | text | ISO timestamp |
| `createdAt` | text | ISO timestamp |

### teams

Groups for organizing users. Admins can assign users to teams.

| Column | Type | Description |
|--------|------|-------------|
| `id` | text UUID | Primary key |
| `name` | text (unique, max 50 chars) | Team name |
| `createdAt` | integer | Unix timestamp |

### api\_keys

API keys for programmatic access. The raw key is shown once on creation; only the hash is stored.

| Column | Type | Notes |
|---|---|---|
| `id` | integer | Primary key, auto-increment |
| `userId` | integer | Foreign key to `users.id` |
| `keyHash` | text | SHA-256 hash of the key |
| `name` | text | User-provided label |
| `createdAt` | text | ISO timestamp |
| `lastUsedAt` | text | Updated on each authenticated request |

Keys are prefixed with `si_` followed by 96 hex characters (48 random bytes).

### pipelines

Saved tool chains that users create in the UI.

| Column | Type | Notes |
|---|---|---|
| `id` | integer | Primary key, auto-increment |
| `name` | text | Pipeline name |
| `description` | text | Optional description |
| `steps` | text | JSON array of `{ toolId, settings }` objects |
| `createdAt` | text | ISO timestamp |

### user\_files

Persistent file library with version chain tracking. Each processing step that saves a result creates a new row linked to its parent via `parentId`, forming a version tree.

| Column | Type | Description |
|--------|------|-------------|
| `id` | text UUID | Primary key |
| `userId` | text UUID | FK → users (CASCADE DELETE) |
| `originalName` | text | Original upload filename |
| `storedName` | text | Filename on disk |
| `mimeType` | text | MIME type |
| `size` | integer | File size in bytes |
| `width` | integer | Image width in px |
| `height` | integer | Image height in px |
| `version` | integer | Version number (1 = original) |
| `parentId` | text UUID \| null | FK → user\_files (parent version) |
| `toolChain` | text (JSON array) | Tool IDs applied in order to produce this version |
| `createdAt` | integer | Unix timestamp |

### jobs

Tracks processing jobs for progress reporting and cleanup.

| Column | Type | Notes |
|---|---|---|
| `id` | text | Primary key (UUID) |
| `type` | text | Tool or pipeline identifier |
| `status` | text | `queued`, `processing`, `completed`, or `failed` |
| `progress` | real | 0.0–1.0 fraction |
| `inputFiles` | text | JSON array of input file paths |
| `outputPath` | text | Path to the result file |
| `settings` | text | JSON of the tool settings used |
| `error` | text | Error message if failed |
| `createdAt` | text | ISO timestamp |
| `completedAt` | text | ISO timestamp |

### settings

Key-value store for server-wide settings that admins can change from the UI.

| Column | Type | Notes |
|---|---|---|
| `key` | text | Primary key |
| `value` | text | Setting value |
| `updatedAt` | text | ISO timestamp |

## Migrations

Drizzle handles schema migrations. The config is in `apps/api/drizzle.config.ts`. During development, run:

```bash
pnpm --filter @snapotter/api drizzle-kit push
```

In production, the schema is applied automatically on startup.

---

---
url: 'https://docs.snapotter.com/guide/deployment.md'
---

# Deployment

SnapOtter ships as a single Docker container. The image supports **linux/amd64** (with NVIDIA CUDA) and **linux/arm64** (CPU), so it runs natively on Intel/AMD servers, Apple Silicon Macs, and ARM devices like the Raspberry Pi 4/5.

See [Docker Image](./docker-tags) for GPU setup, Docker Compose examples, and version pinning.
## Quick Start (CPU)

```yaml
# docker-compose.yml — Copy this file and run: docker compose up -d
services:
  SnapOtter:
    image: snapotterhq/snapotter:latest    # or ghcr.io/snapotter-hq/snapotter:latest
    container_name: SnapOtter
    ports:
      - "1349:1349"                        # Web UI + API
    volumes:
      - SnapOtter-data:/data               # Database, AI models, user files (PERSISTENT)
      - SnapOtter-workspace:/tmp/workspace # Temp processing files (can be tmpfs)
    environment:
      # --- Authentication ---
      - AUTH_ENABLED=true                  # Set to false to disable login entirely
      - DEFAULT_USERNAME=admin             # First-run admin username
      - DEFAULT_PASSWORD=admin             # First-run admin password (you'll be forced to change it)
      # --- Limits (0 = unlimited) ---
      # - MAX_UPLOAD_SIZE_MB=0             # Per-file upload limit in MB
      # - MAX_BATCH_SIZE=0                 # Max files per batch request
      # - RATE_LIMIT_PER_MIN=0             # API rate limit (0 = disabled, 100 = recommended for public)
      # - MAX_USERS=0                      # Max user accounts
      # --- Networking ---
      # - TRUST_PROXY=true                 # Trust X-Forwarded-For headers (set false if not behind a proxy)
      # --- Bind mount permissions ---
      # - PUID=1000                        # Match your host user's UID (run: id -u)
      # - PGID=1000                        # Match your host user's GID (run: id -g)
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:1349/api/v1/health"]
      interval: 30s
      timeout: 5s
      start_period: 60s
      retries: 3
    shm_size: "2gb"                        # Needed for Python ML shared memory
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

volumes:
  SnapOtter-data:        # Named volume — Docker manages permissions automatically
  SnapOtter-workspace:
```

```bash
docker compose up -d
```

The app is then available at `http://localhost:1349`.

> **Docker Hub rate limits?** Replace `snapotterhq/snapotter:latest` with `ghcr.io/snapotter-hq/snapotter:latest` to pull from GitHub Container Registry instead. Both registries receive the same image on every release.
## Quick Start (GPU)

For NVIDIA GPU acceleration on AI tools (background removal, upscaling, face enhancement, OCR):

```yaml
# docker-compose-gpu.yml — Requires: NVIDIA GPU + nvidia-container-toolkit
# Install toolkit: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
services:
  SnapOtter:
    image: snapotterhq/snapotter:latest
    container_name: SnapOtter
    ports:
      - "1349:1349"
    volumes:
      - SnapOtter-data:/data
      - SnapOtter-workspace:/tmp/workspace
    environment:
      - AUTH_ENABLED=true
      - DEFAULT_USERNAME=admin
      - DEFAULT_PASSWORD=admin
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:1349/api/v1/health"]
      interval: 30s
      timeout: 5s
      start_period: 60s
      retries: 3
    shm_size: "2gb"            # Required for PyTorch CUDA shared memory
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all       # Or set to 1 for a specific GPU
              capabilities: [gpu]
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

volumes:
  SnapOtter-data:
  SnapOtter-workspace:
```

```bash
docker compose -f docker-compose-gpu.yml up -d
```

Check GPU detection in the logs:

```bash
docker logs SnapOtter 2>&1 | head -20
# Look for: [INFO] GPU detected — AI tools will use CUDA acceleration
```

## Hardware Requirements

These numbers come from benchmarks run across four systems (Apple M2 Max, AMD Ryzen 5 7500F + RTX 4070, Intel i7-7600U, Docker Desktop on Windows). See `docs/HARDWARE_RECOMMENDATIONS.md` in the repo for full methodology and raw data.
### Quick Reference

| Tier | Use Case | CPU | RAM | GPU | Storage |
|------|----------|-----|-----|-----|---------|
| Minimum | Core tools, single user | 1 core | 1 GB | None | 5 GB |
| Recommended | All tools + AI on CPU | 4 cores | 4 GB | None | 20 GB |
| Full | All tools + AI on GPU | 4+ cores | 8 GB | NVIDIA 8 GB+ | 30 GB |

### Minimum (core image tools)

| Resource | Requirement |
|---|---|
| CPU | 1 core |
| RAM | 1 GB |
| Disk | 3 GB (image) + 1 GB (data volume) |
| GPU | Not required |

All 35 non-AI tools (resize, crop, rotate, convert, compress, watermark, collage, etc.) run on any hardware. Most operations complete in under 1 second even on a single core. The exception is AVIF encoding, which takes ~27s on 1 core but drops to ~5s on 4 cores.

```yaml
deploy:
  resources:
    limits:
      cpus: '1'
      memory: 1G
```

### Recommended (AI tools on CPU)

| Resource | Requirement |
|---|---|
| CPU | 4 cores |
| RAM | 4 GB |
| Disk | 3 GB (image) + 14 GB (AI models) + workspace |
| GPU | Not required (CPU fallback) |

AI tools work on CPU but are significantly slower. Some tools are practical on CPU, others are not:

| AI Tool | CPU Time | Usable? |
|---|---|---|
| blur-faces, smart-crop, red-eye-removal | 2-5s | Yes |
| remove-background | 37-41s | Marginal (long wait) |
| upscale (small image) | 22s | Marginal |
| upscale (large image) | 241s | No |
| enhance-faces, colorize, noise-removal | 30-90s | Marginal to No |

AI model download sizes:

| Bundle | Disk Size |
|---|---|
| Background removal | 3-4 GB |
| Upscale + Face enhance + Noise removal | 4-5 GB |
| Face detection | 200-300 MB |
| Object eraser + Colorize | 1-2 GB |
| OCR | 3-4 GB |
| Photo restoration | 800 MB - 1 GB |
| **All bundles** | **~14 GB** |

```yaml
deploy:
  resources:
    limits:
      cpus: '4'
      memory: 4G
```

### Full (AI tools on GPU)

| Resource | Requirement |
|---|---|
| CPU | 4+ cores |
| RAM | 8 GB |
| GPU | NVIDIA with 8+ GB VRAM (12 GB recommended) |
| Disk | 30 GB total |

GPU acceleration gives a 3x to 13,000x speedup depending on the operation. Measured on an RTX 4070 vs an Intel i7-7600U:

| AI Tool | GPU Time | CPU Time | Speedup |
|---|---|---|---|
| noise-removal (quick) | 17ms | 228s | 13,400x |
| blur-faces | 0.27s | 27s | 100x |
| upscale 2x | 6.3s | >300s (timeout) | 47x+ |
| enhance-faces (GFPGAN) | 2.3s | 28s | 12x |
| remove-background | 5-10s | 21-41s | 3-8x |
| OCR (best) | 70s | 243s | 3.5x |
| restore-photo | 31s | 90s | 2.9x |
| colorize | 10s | 13s | 1.3x |

Peak VRAM usage reaches 7.5 GB during upscale with face enhancement. A 6 GB GPU works for most AI tools individually but will fail on upscale. 8-12 GB VRAM handles everything.

```yaml
deploy:
  resources:
    limits:
      cpus: '4'
      memory: 8G
    reservations:
      devices:
        - driver: nvidia
          count: all
          capabilities: [gpu]
```

### Concurrent Users

Benchmarked with parallel resize requests on a large image (Mac M2 Max, 10 Docker CPUs):

| Concurrent Users | Avg Response Time | Errors |
|---|---|---|
| 1 | 0.28s | 0 |
| 5 | 0.54s | 0 |
| 10 | 1.08s | 0 |
| 20 | 2.10s | 0 |

The server scales linearly, with no errors or crashes, up to 20 concurrent requests.
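A comparable measurement is easy to reproduce against your own deployment. Below is a minimal sketch, assuming Node 18+; `avgLatencyMs` and the target URL are illustrative, not part of SnapOtter:

```typescript
// Fire n requests in parallel and average each request's individual latency.
// avgLatencyMs is an illustrative helper, not part of SnapOtter's codebase.
async function avgLatencyMs(n: number, task: () => Promise<unknown>): Promise<number> {
  const times = await Promise.all(
    Array.from({ length: n }, async () => {
      const t0 = performance.now();
      await task();
      return performance.now() - t0;
    }),
  );
  return times.reduce((a, b) => a + b, 0) / n;
}

// Example: probe a running instance (adjust host/port to your deployment).
// const avg = await avgLatencyMs(10, () => fetch("http://localhost:1349/api/v1/health"));
```

Note that this measures per-request latency under contention, not total throughput, which is what the table above reports.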
### Supported Image Formats

| Format | Read | Write | Notes |
|---|---|---|---|
| JPEG | Yes | Yes | |
| PNG | Yes | Yes | |
| WebP | Yes | Yes | |
| AVIF | Yes | Yes | Encode is CPU-intensive (~5s on 4 cores for a large image) |
| GIF | Yes | Yes | Animated GIF supported |
| TIFF | Yes | Yes | Multi-page supported |
| SVG | Yes | No | Rasterized on input, sanitized for security |
| HEIC | Yes | No | Decoded via heif-dec (~0.4s) |
| HEIF | Yes | No | Very slow decode (~15s) |
| DNG (RAW) | Yes (Linux) | No | Decoded via dcraw, not available on macOS |
| PSD | Yes | No | Decoded via ImageMagick |
| HDR | Yes | No | Tone-mapped on decode |
| TGA | Yes | No | Decoded via ImageMagick |
| ICO | Yes | Yes | Via favicon tool |
| PDF | Yes | Yes | Via pdf-to-image / image-to-pdf tools |

Not supported: BMP (V4/V5 headers), JPEG XL (JXL), EXR (missing decode delegate in Docker image).

### Known Limitations

* **Content-aware resize** crashes on large images (>5 MP) due to a limitation in the caire binary. Works fine with smaller images.
* **HEIF decode** takes 13-23 seconds. HEIC (Apple's variant) is much faster at 0.3-0.9 seconds.
* **OCR Japanese** fails on CPU due to a PaddlePaddle MKLDNN bug. Works on GPU.
* **Upscale** times out on CPU for anything beyond small images. GPU required for practical use.
* **CodeFormer** face enhancement is significantly slower than GFPGAN (53s vs 2s on GPU). GFPGAN is recommended for most use cases.

## Volumes

| Mount | Purpose | Required? |
|---|---|---|
| `/data` | SQLite database, AI models, Python venv, user files | **Yes** — data loss without it |
| `/tmp/workspace` | Temporary processing files (auto-cleaned) | Recommended |

### Bind mounts vs. named volumes

**Named volumes** (recommended) — Docker manages permissions automatically:

```yaml
volumes:
  - SnapOtter-data:/data
```

**Bind mounts** — You manage permissions.
Set `PUID`/`PGID` to match your host user:

```yaml
volumes:
  - ./SnapOtter-data:/data
environment:
  - PUID=1000 # Your host UID (run: id -u)
  - PGID=1000 # Your host GID (run: id -g)
```

## Environment Variables

| Variable | Default | Description |
|---|---|---|
| `AUTH_ENABLED` | `true` | Enable/disable login requirement |
| `DEFAULT_USERNAME` | `admin` | Initial admin username |
| `DEFAULT_PASSWORD` | `admin` | Initial admin password (forced change on first login) |
| `MAX_UPLOAD_SIZE_MB` | `0` (unlimited) | Per-file upload limit |
| `MAX_BATCH_SIZE` | `0` (unlimited) | Max files per batch request |
| `RATE_LIMIT_PER_MIN` | `0` (disabled) | API requests per minute per IP |
| `MAX_USERS` | `0` (unlimited) | Maximum user accounts |
| `TRUST_PROXY` | `true` | Trust X-Forwarded-For headers from reverse proxy |
| `PUID` | `999` | Run as this UID (for bind mount permissions) |
| `PGID` | `999` | Run as this GID (for bind mount permissions) |
| `LOG_LEVEL` | `info` | Log verbosity: fatal, error, warn, info, debug, trace |
| `CONCURRENT_JOBS` | `0` (auto) | Max parallel AI processing jobs |
| `SESSION_DURATION_HOURS` | `168` | Login session lifetime (7 days) |
| `CORS_ORIGIN` | (empty) | Comma-separated allowed origins, or empty for same-origin |

## Health Check

The container includes a built-in health check:

```bash
# Check container health status
docker inspect --format='{{.State.Health.Status}}' SnapOtter

# Manual health check
curl http://localhost:1349/api/v1/health
# {"status":"healthy","version":"1.15.9"}
```

## Reverse Proxy

SnapOtter sets `TRUST_PROXY=true` by default so rate limiting and logging use the real client IP from `X-Forwarded-For` headers.
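When the proxy is trusted, the client IP comes from the left-most `X-Forwarded-For` entry (the original client), with later entries added by intermediate proxies. A minimal sketch of that resolution logic; `pickClientIp` is a hypothetical helper for illustration, not SnapOtter's implementation:

```typescript
// Resolve the client IP behind a trusted reverse proxy.
// X-Forwarded-For is "client, proxy1, proxy2, ..." — take the left-most entry.
// pickClientIp is illustrative, not SnapOtter's actual code.
function pickClientIp(xff: string | undefined, socketAddr: string): string {
  if (!xff) return socketAddr; // no proxy header → use the TCP peer address
  const first = xff.split(",")[0].trim();
  return first || socketAddr;
}
```

Only trust this header behind a proxy you control: a client that can reach the container directly can forge `X-Forwarded-For`, which is why `TRUST_PROXY` is a switch rather than always-on.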
### Nginx

```nginx
server {
    listen 80;
    server_name images.example.com;

    # Match MAX_UPLOAD_SIZE_MB (0 = nginx default 1M, so set high for unlimited)
    client_max_body_size 500M;

    location / {
        proxy_pass http://localhost:1349;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # SSE support (batch progress, feature install progress)
        proxy_buffering off;
        proxy_read_timeout 300s;
    }
}
```

### Nginx Proxy Manager

1. Add a new Proxy Host
2. Set Domain Name to your domain
3. Set Scheme to `http`, Forward Hostname to `SnapOtter` (or your container IP), Forward Port to `1349`
4. Enable WebSocket support
5. Under Advanced, add: `client_max_body_size 500M;` and `proxy_buffering off;`

### Traefik

```yaml
# Add these labels to the SnapOtter service in docker-compose.yml
labels:
  - "traefik.enable=true"
  - "traefik.http.routers.snapotter.rule=Host(`images.example.com`)"
  - "traefik.http.routers.snapotter.entrypoints=websecure"
  - "traefik.http.routers.snapotter.tls.certresolver=letsencrypt"
  - "traefik.http.services.snapotter.loadbalancer.server.port=1349"
  # Increase upload limit (default 2MB is too low)
  - "traefik.http.middlewares.snapotter-body.buffering.maxRequestBodyBytes=524288000"
  - "traefik.http.routers.snapotter.middlewares=snapotter-body"
```

### Cloudflare Tunnels

```bash
cloudflared tunnel --url http://localhost:1349
```

Note: Cloudflare has a 100 MB upload limit on free plans. Set `MAX_UPLOAD_SIZE_MB=100` to match.

## CI/CD

The GitHub repository has three workflows:

* **ci.yml** -- Runs automatically on every push and PR. Lints, typechecks, tests, builds, and validates the Docker image (without pushing).
* **release.yml** -- Triggered manually via `workflow_dispatch`.
Runs semantic-release to create a version tag and GitHub release, then builds a multi-arch Docker image (amd64 + arm64) and pushes to Docker Hub (`snapotterhq/snapotter`) and GitHub Container Registry (`ghcr.io/snapotter-hq/snapotter`).
* **deploy-docs.yml** -- Builds this documentation site and deploys it to GitHub Pages on push to `main`.

To create a release, go to **Actions > Release > Run workflow** in the GitHub UI, or run:

```bash
gh workflow run release.yml
```

Semantic-release determines the version from commit history. The `latest` Docker tag always points to the most recent release.

---

---
url: 'https://docs.snapotter.com/guide/developer.md'
---

# Developer guide

How to set up a local development environment and contribute code to SnapOtter.

## Prerequisites

* [Node.js](https://nodejs.org/) 22+
* [pnpm](https://pnpm.io/) 9+ (`corepack enable && corepack prepare pnpm@latest --activate`)
* [Docker](https://www.docker.com/) (for container builds and AI features)
* Git

Python 3.10+ is only needed if you are working on the AI/ML sidecar (background removal, upscaling, OCR).

## Setup

```bash
git clone https://github.com/snapotter-hq/snapotter.git
cd snapotter
pnpm install
pnpm dev
```

This starts two dev servers:

| Service  | URL                    | Notes                            |
|----------|------------------------|----------------------------------|
| Frontend | http://localhost:1349  | Vite dev server, proxies /api    |
| Backend  | http://localhost:13490 | Fastify API (accessed via proxy) |

Open http://localhost:1349 in your browser. Log in with `admin` / `admin`. You will be prompted to change the password on first login.
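The `/api` proxying in the table above is a standard Vite dev-server proxy. A sketch of what such a config looks like is below; this is an assumption about the shape, not SnapOtter's actual `vite.config.ts`:

```typescript
import { defineConfig } from "vite";

// Illustrative only — mirrors the table above, not SnapOtter's real config.
// The Vite dev server on :1349 forwards /api requests to Fastify on :13490.
export default defineConfig({
  server: {
    port: 1349,
    proxy: {
      "/api": {
        target: "http://localhost:13490",
        changeOrigin: true,
      },
    },
  },
});
```

With this arrangement the browser only ever talks to :1349, so there are no CORS concerns during development.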
## Project structure

```
apps/
  api/            Fastify backend
  web/            Vite + React frontend
  docs/           VitePress documentation (this site)
packages/
  shared/         Constants, types, i18n strings
  image-engine/   Sharp-based image operations
  ai/             Python sidecar bridge for ML models
tests/
  unit/           Vitest unit tests
  integration/    Vitest integration tests (full API)
  e2e/            Playwright end-to-end specs
  fixtures/       Small test images
```

## Commands

```bash
pnpm dev              # start frontend + backend
pnpm build            # build all workspaces
pnpm typecheck        # TypeScript check across monorepo
pnpm lint             # Biome lint + format check
pnpm lint:fix         # auto-fix lint + format
pnpm test             # unit + integration tests
pnpm test:unit        # unit tests only
pnpm test:integration # integration tests only
pnpm test:e2e         # Playwright e2e tests
pnpm test:coverage    # tests with coverage report
```

## Code conventions

* Double quotes, semicolons, 2-space indentation (enforced by Biome)
* ES modules in all workspaces
* [Conventional commits](https://www.conventionalcommits.org/) for semantic-release
* Zod for all API input validation
* No modifications to Biome, TypeScript, or editor config files. Fix the code, not the linter.

## Database

SQLite via Drizzle ORM. The database file lives at `./data/snapotter.db` by default.

```bash
cd apps/api
npx drizzle-kit generate # generate a migration from schema changes
npx drizzle-kit migrate  # apply pending migrations
```

Schema is defined in `apps/api/src/db/schema.ts`. Tables: users, sessions, settings, jobs, apiKeys, pipelines, teams, userFiles.

## Adding a new tool

Every tool follows the same pattern. Here is a minimal example.

### 1. Backend route

Create `apps/api/src/routes/tools/my-tool.ts`:

```ts
import { z } from "zod";
import type { FastifyInstance } from "fastify";
import { createToolRoute } from "../tool-factory.js";

const settingsSchema = z.object({
  intensity: z.number().min(0).max(100).default(50),
});

export function registerMyTool(app: FastifyInstance) {
  createToolRoute(app, {
    toolId: "my-tool",
    settingsSchema,
    async process(inputBuffer, settings, filename) {
      // Use sharp or other libraries to process the image
      const sharp = (await import("sharp")).default;
      const result = await sharp(inputBuffer)
        // ... your processing logic
        .toBuffer();
      return {
        buffer: result,
        filename: filename.replace(/\.[^.]+$/, ".png"),
        contentType: "image/png",
      };
    },
  });
}
```

Then register it in `apps/api/src/routes/tools/index.ts`.

### 2. Frontend settings component

Create `apps/web/src/components/tools/my-tool-settings.tsx`:

```tsx
import { useState } from "react";
import { useToolProcessor } from "@/hooks/use-tool-processor";
import { useFileStore } from "@/stores/file-store";

export function MyToolSettings() {
  const { files } = useFileStore();
  const { processFiles, processing, error, downloadUrl } = useToolProcessor("my-tool");
  const [intensity, setIntensity] = useState(50);

  const handleProcess = () => {
    processFiles(files, { intensity });
  };

  return (