Long-running side project
Telegram bot: from local chat utility to agentic system
The first commit in this repository was on July 16, 2015. It started as a collection of practical features for local chats and turned into an agentic production system: async AWS workers, a reply gate, model fallback, tool execution, memory, metrics, live stats and a separate UI.
Why it matters
The interesting part is not that it calls an LLM. The useful part is the system around the model: routing, context construction, tool safety, memory, telemetry, UI feedback and production boundaries. Those are the patterns I keep running into when turning AI features into real product workflows.
- Takes a messy real-world workflow and separates it into deployable production boundaries.
- Balances user experience, cost, latency and reliability in an AI-heavy product surface.
- Turns model behavior into concrete routing, fallback, telemetry and tool-execution decisions.
- Keeps improving the system through operational feedback instead of treating the prototype as the finish line.
Architecture
The webhook is intentionally small. It classifies incoming Telegram updates and dispatches asynchronous Lambda invocations to the workers that own the real work: replies, stats fanout, search and PNG rendering.
Agent loop
The agent path is built as a product workflow, not as one giant prompt. Each stage has a clear responsibility and an observable failure mode.
Available tools include web search, image search, image generation, voice generation, weather, code execution, history lookup, memory updates, randomization, GIF/video search and dynamic command creation.
How it evolved
2015: local chat utility
The repository started as a practical set of small features for local Telegram chats: helpers, jokes, commands, currency, weather and search-like utilities.
Command platform
The bot accumulated deterministic commands and integrations, then needed a cleaner command registry and shared helpers instead of one growing handler.
Production split
The webhook became a thin ingress Lambda. Slow work moved into async workers so that Telegram requests return quickly and each concern can scale independently.
Stats and UI
Chat events moved into DynamoDB. A WebSocket runtime, search endpoint and PNG renderer made statistics, activity charts and currency cards visible in Telegram and the companion UI.
Agentic layer
The bot gained a reply gate, model loop, tools, memory, recent history, multimodal context, dynamic commands and fallback behavior for model calls.
Engineering decisions
Thin webhook ingress
The Telegram-facing Lambda only routes updates and invokes workers. It does not wait for statistics, model calls, image generation or WebSocket fanout.
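A minimal sketch of this ingress pattern, not the repository's actual handler. The worker names, the classification rules and the `invoke` callable are assumptions made for illustration. The only real API used is boto3's Lambda `invoke` with `InvocationType="Event"`, which queues the call and returns without waiting for the worker.

```python
import json
from typing import Callable

# Hypothetical worker names; the real function names are not given in the text.
REPLY_WORKER = "reply-worker"
STATS_WORKER = "stats-worker"

# Anything that can start a worker: (function_name, payload) -> None.
Invoke = Callable[[str, bytes], None]


def classify(update: dict) -> list[str]:
    """Return the workers that should receive this Telegram update (assumed rules)."""
    message = update.get("message")
    if not message:
        return []  # update types this sketch does not handle are dropped
    targets = [STATS_WORKER]  # assumption: every chat message feeds statistics
    if message.get("text"):
        targets.append(REPLY_WORKER)
    return targets


def handle_webhook(event: dict, invoke: Invoke) -> dict:
    """Classify the update, fire the workers, and return immediately."""
    update = json.loads(event.get("body") or "{}")
    payload = json.dumps(update).encode("utf-8")
    for worker in classify(update):
        invoke(worker, payload)  # fire-and-forget; no result is awaited
    return {"statusCode": 200, "body": "ok"}


def boto3_invoke(function_name: str, payload: bytes) -> None:
    """Asynchronous Lambda invocation through boto3."""
    import boto3  # imported lazily so the sketch runs without AWS dependencies

    boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: do not wait for the worker
        Payload=payload,
    )
```

In production the handler would be called as `handle_webhook(event, boto3_invoke)`. Passing the invoker in keeps the routing logic testable without AWS.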
Reply gate before the agent
Group chats need a default-ignore policy. Deterministic address checks run first, then a structured model decision determines whether the bot should engage.
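One way to sketch the two-stage gate. The specific checks (private chat, reply to the bot, @-mention), the `Message` and `GateDecision` shapes and the bot username are illustrative assumptions. The model stage is passed in as a plain callable, so no particular provider API is implied.

```python
from dataclasses import dataclass
from typing import Callable, Optional

BOT_USERNAME = "example_bot"  # hypothetical username


@dataclass
class Message:
    text: str
    chat_type: str  # "private", "group" or "supergroup"
    reply_to_bot: bool = False


@dataclass
class GateDecision:
    engage: bool
    reason: str


def deterministic_check(msg: Message) -> Optional[GateDecision]:
    """Cheap address checks; None means the message is not clearly addressed to the bot."""
    if msg.chat_type == "private":
        return GateDecision(True, "private chat")
    if msg.reply_to_bot:
        return GateDecision(True, "reply to bot")
    if f"@{BOT_USERNAME}" in msg.text.lower():
        return GateDecision(True, "mention")
    return None


def reply_gate(msg: Message, model_decide: Callable[[Message], GateDecision]) -> GateDecision:
    """Run deterministic checks first, then ask the model; ignore on any gate failure."""
    decided = deterministic_check(msg)
    if decided is not None:
        return decided  # the model is never called for clearly addressed messages
    try:
        return model_decide(msg)
    except Exception:
        return GateDecision(False, "gate error, default ignore")
```

Failing closed matches the default-ignore policy: a broken gate makes the bot quieter rather than noisier in a group chat.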
Tool loop with guardrails
Tools are exposed through a typed registry. Rate-limited tools can run sequentially, content tools are deferred until data-gathering tools finish, and every tool call has a timeout.
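These guardrails could be sketched with `asyncio` as follows. The `Tool` fields, the `"data"`/`"content"` split and the error strings are assumptions for illustration, not the project's registry. Each call is wrapped in `asyncio.wait_for`, rate-limited tools are awaited one at a time, and content tools only start after the data phase has finished.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

ToolFn = Callable[[dict], Awaitable[str]]


@dataclass(frozen=True)
class Tool:
    name: str
    run: ToolFn
    kind: str = "data"  # "data" gathers information, "content" produces output
    rate_limited: bool = False
    timeout_s: float = 10.0


async def call_with_timeout(tool: Tool, args: dict) -> tuple[str, str]:
    """Run one tool call; a timeout or exception becomes an error result."""
    try:
        return tool.name, await asyncio.wait_for(tool.run(args), timeout=tool.timeout_s)
    except asyncio.TimeoutError:
        return tool.name, "error: timeout"
    except Exception as exc:
        return tool.name, f"error: {exc}"


async def run_phase(calls: list[tuple[Tool, dict]]) -> list[tuple[str, str]]:
    """Run unrestricted tools concurrently, then rate-limited tools one by one."""
    parallel = [c for c in calls if not c[0].rate_limited]
    serial = [c for c in calls if c[0].rate_limited]
    results = list(await asyncio.gather(*(call_with_timeout(t, a) for t, a in parallel)))
    for tool, args in serial:
        results.append(await call_with_timeout(tool, args))
    return results


async def run_tool_calls(
    registry: dict[str, Tool], requested: list[tuple[str, dict]]
) -> list[tuple[str, str]]:
    """Execute requested calls: data-gathering tools first, content tools afterwards."""
    results = [(name, "error: unknown tool") for name, _ in requested if name not in registry]
    calls = [(registry[name], args) for name, args in requested if name in registry]
    results += await run_phase([c for c in calls if c[0].kind == "data"])
    results += await run_phase([c for c in calls if c[0].kind == "content"])
    return results
```

Returning errors as ordinary results, instead of raising, lets the model see that a tool failed and decide what to do next.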
Provider-aware model fallback
Model calls record success, timeout and error states. The main chat model can fall back to another provider/model when the primary path fails.
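A minimal sketch of the fallback chain, assuming each provider is wrapped in a callable that either returns text or raises (`TimeoutError` for timeouts). The model names, `Attempt` and `ModelResult` are hypothetical. The point is that every attempt is recorded with its status before the next provider is tried.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Attempt:
    model: str
    status: str  # "success", "timeout" or "error"
    latency_ms: float


@dataclass
class ModelResult:
    text: Optional[str]
    model: Optional[str]  # the model that produced the text, if any
    attempts: list = field(default_factory=list)


def call_with_fallback(
    prompt: str, chain: list[tuple[str, Callable[[str], str]]]
) -> ModelResult:
    """Try each (model_name, call) pair in order until one succeeds."""
    attempts = []
    for model, call in chain:
        start = time.monotonic()
        text = None
        try:
            text = call(prompt)
            status = "success"
        except TimeoutError:
            status = "timeout"
        except Exception:
            status = "error"
        attempts.append(Attempt(model, status, (time.monotonic() - start) * 1000))
        if status == "success":
            return ModelResult(text, model, attempts)
    return ModelResult(None, None, attempts)  # every provider failed
```

For example, `call_with_fallback("ping", [("primary-model", primary), ("backup-model", backup)])` returns the backup's answer when `primary` raises, and `attempts` shows both outcomes.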
Memory and history as product primitives
Recent chat history, media attachments and scoped memory are loaded only after the reply gate confirms that a response is useful.
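The ordering can be sketched in a few lines. The function names and the message shape are assumptions. What matters is that the expensive context load sits behind the gate, so ignored messages cost no history or memory reads.

```python
from typing import Callable, Optional


def handle_message(
    message: dict,
    should_reply: Callable[[dict], bool],
    load_context: Callable[[int], dict],
    run_agent: Callable[[dict, dict], str],
) -> Optional[str]:
    """Load history, media and memory only after the gate says a reply is useful."""
    if not should_reply(message):
        return None  # ignored messages never touch the context stores
    context = load_context(message["chat_id"])
    return run_agent(message, context)
```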
Metrics over guesses
Model calls and tool calls write time-series metrics with status, latency, model name and fallback source. That makes failure modes visible.
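One possible shape for such a metric point, written as a context manager around any model or tool call. The field names and the `sink` callable are assumptions. In a real deployment the sink would write to whatever time-series store the project uses, which the text does not name.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class MetricPoint:
    timestamp: float
    kind: str  # "model_call" or "tool_call"
    name: str  # model or tool name
    status: str  # "success", "timeout" or "error"
    latency_ms: float
    fallback_from: Optional[str] = None  # set when this call replaced a failed primary


@contextmanager
def measured(
    kind: str,
    name: str,
    sink: Callable[[MetricPoint], None],
    fallback_from: Optional[str] = None,
) -> Iterator[None]:
    """Record one metric point for the wrapped call, including when it fails."""
    start = time.monotonic()
    status = "success"
    try:
        yield
    except TimeoutError:
        status = "timeout"
        raise
    except Exception:
        status = "error"
        raise
    finally:
        latency_ms = (time.monotonic() - start) * 1000
        sink(MetricPoint(time.time(), kind, name, status, latency_ms, fallback_from))
```

Usage is a single `with measured("tool_call", "weather", sink):` around the call. The exception still propagates, so recording a metric never hides a failure.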
What I would improve next
The system already has tests and runtime metrics. The next useful step is stronger evals: replay known conversations, preserve the failure modes, and make model or prompt changes measurable before they reach users.
- Build a replay/evaluation harness from real conversations with redacted fixtures.
- Track answer quality, refusal quality, tool success and latency by chat and feature.
- Add admin review flows for dynamic tools and memory changes.
- Promote the strongest patterns into reusable templates for other bots or agent systems.
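As a starting point, a replay harness could look roughly like this. Everything here is hypothetical, since the harness does not exist yet: a fixture carries a redacted conversation plus the expected gate outcome and tools, and the agent under test is any callable that returns what it actually did.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Fixture:
    name: str
    messages: list  # redacted conversation turns
    must_engage: bool  # expected reply-gate outcome
    expected_tools: set = field(default_factory=set)


@dataclass
class Outcome:
    engaged: bool
    tools_used: set = field(default_factory=set)


def replay(fixtures: list, agent: Callable[[list], Outcome]) -> dict:
    """Replay every fixture through the agent and summarize mismatches."""
    failures = []
    for fx in fixtures:
        outcome = agent(fx.messages)
        if outcome.engaged != fx.must_engage:
            failures.append((fx.name, "gate"))
        elif not fx.expected_tools <= outcome.tools_used:
            failures.append((fx.name, "tools"))  # an expected tool was never called
    total = len(fixtures)
    return {"total": total, "passed": total - len(failures), "failures": failures}
```

Running the same fixtures before and after a prompt or model change turns "it feels better" into a pass count and a list of regressions.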