Architecture

How GemMate Works

Smart routing between on-device and local GPU inference

📱
GemMate App
Chat · Flashcards · Quizzes · Mind Maps
🔀
Smart Router
Auto-selects the best AI model
🧠
On-Device
Gemma 4 E2B
Runs via LiteRT-LM. Offline, 3-8s.
💻
LAN GPU
Gemma 4 E4B
Via Ollama on laptop. <1s response.

Smart Router

Condition                 | Model                   | Latency
WiFi + laptop available   | Gemma 4 E4B via Ollama  | < 1 s
No WiFi, model installed  | Gemma 4 E2B on-device   | 3-8 s
WiFi + no laptop          | Gemma 4 E2B on-device   | 3-8 s
No WiFi, no model         | Prompt to download      | n/a
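
The routing rules above can be read as a small decision function. The following is a minimal sketch in Python, not GemMate's actual code: the names `DeviceState`, `Route`, and `choose_route` are hypothetical, and it assumes that the "WiFi + no laptop" case falls back to the download prompt when no on-device model is installed, which the table does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    """Possible outcomes of the routing decision (labels taken from the table)."""
    LAN_GPU = "Gemma 4 E4B via Ollama"
    ON_DEVICE = "Gemma 4 E2B on-device"
    PROMPT_DOWNLOAD = "Prompt to download"


@dataclass(frozen=True)
class DeviceState:
    """Hypothetical snapshot of the conditions the router checks."""
    wifi_connected: bool
    laptop_reachable: bool
    model_installed: bool


def choose_route(state: DeviceState) -> Route:
    # Prefer the faster LAN GPU whenever WiFi is up and the laptop answers.
    if state.wifi_connected and state.laptop_reachable:
        return Route.LAN_GPU
    # Otherwise fall back to the on-device model if it is installed.
    if state.model_installed:
        return Route.ON_DEVICE
    # Assumption: with no usable model anywhere, ask the user to download one.
    return Route.PROMPT_DOWNLOAD
```

Checking the LAN GPU first mirrors the table's ordering: the laptop is used whenever it is reachable, and the on-device model covers every other case where it exists.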
Tech Stack

Built With Modern Tools

🤖

AI Model

Gemma 4 E2B / E4B

⚡

Runtime

LiteRT-LM (on-device) + Ollama (GPU)
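
For the Ollama side, a request to the laptop might look roughly like the sketch below. It uses Ollama's `/api/generate` HTTP endpoint on the default port 11434 with streaming turned off. The host address and the model tag `gemma4:e4b` are placeholders assumed for illustration, not values from GemMate, and the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_PORT = 11434  # Ollama's default HTTP port


def build_generate_request(host: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming /api/generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"http://{host}:{OLLAMA_PORT}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(host: str, model: str, prompt: str, timeout: float = 10.0) -> str:
    """Send the request and return the generated text from the JSON reply."""
    request = build_generate_request(host, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (placeholder host and model tag, requires a running Ollama server):
# text = generate("192.168.1.20", "gemma4:e4b", "Explain photosynthesis in one line.")
```

A short timeout matters here: if the laptop stops answering, the app can fall back to the on-device model instead of hanging.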

💎

Framework

Flutter 3.41 / Dart

🧮

Study Algorithm

SM-2 Spaced Repetition
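
As an illustration of SM-2 scheduling, here is a minimal Python sketch of one common formulation of the algorithm. How GemMate applies SM-2 internally is not described here, so the names `CardState` and `review` are hypothetical. The sketch also assumes that the interval is computed with the ease factor from before the current review and that a failed review (quality below 3) leaves the ease factor unchanged.

```python
from dataclasses import dataclass

MIN_EASE = 1.3  # SM-2 never lets the ease factor drop below this value


@dataclass(frozen=True)
class CardState:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 starting ease factor


def review(card: CardState, quality: int) -> CardState:
    """Return the card's next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    # A failed recall restarts the repetition sequence; ease is left as-is.
    if quality < 3:
        return CardState(repetitions=0, interval_days=1, ease=card.ease)

    # Successful recall: 1 day, then 6 days, then previous interval times ease.
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)

    # Standard SM-2 ease update, clamped at the minimum.
    miss = 5 - quality
    ease = max(MIN_EASE, card.ease + 0.1 - miss * (0.08 + miss * 0.02))
    return CardState(card.repetitions + 1, interval, ease)
```

Three perfect reviews in a row space a new card out to 1, 6, and then 16 days, while a single failed review brings it back to a 1-day interval.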

👁️

OCR / Vision

ML Kit (offline) + Gemma 4 multimodal

💾

Storage

SharedPreferences + JSON
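
A key-value store such as SharedPreferences holds simple values, so structured data like a deck of flashcards is typically encoded as a JSON string under a single key. The Python sketch below shows that round trip, with a plain dictionary standing in for the store. The `Flashcard` fields, the key name `flashcards`, and the helper functions are assumptions for illustration, not GemMate's actual schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Flashcard:
    front: str
    back: str
    interval_days: int = 0


def save_cards(store: dict, key: str, cards: list) -> None:
    """Encode the cards as one JSON string and write it under `key`."""
    store[key] = json.dumps([asdict(card) for card in cards])


def load_cards(store: dict, key: str) -> list:
    """Read the JSON string back into Flashcard objects (empty list if absent)."""
    raw = store.get(key)
    if raw is None:
        return []
    return [Flashcard(**item) for item in json.loads(raw)]


# Usage: a dict plays the role of the key-value store.
store = {}
save_cards(store, "flashcards", [Flashcard("Mitochondria", "Powerhouse of the cell")])
restored = load_cards(store, "flashcards")
```

Keeping everything in one JSON string is simple and fully local, at the cost of rewriting the whole deck on every save.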

🛡️

Privacy by Design

Your data stays on hardware you control: inference runs on your device or, when available, on your own laptop over the local network. No cloud. No tracking. No API keys. Open source under Apache 2.0.

✓ Zero data collection: all processing stays on your device or your own laptop
✓ No API keys needed: Gemma 4 runs locally
✓ Open source: audit the code yourself on GitHub