How Zeabur shipped a RAG-powered forum on InsForge

Forum traffic was outpacing the team. Zeabur shipped a RAG-based system on InsForge without standing up a separate vector database or wiring up a model provider.

About Zeabur

Zeabur helps teams deploy and manage apps without having to do day-to-day DevOps work. Developers can ship frontend and backend services, provision core resources like Postgres and Redis, and manage environments such as preview, staging, and production from one place.

The Challenge

Forum traffic was rising faster than the team could keep up. The same questions repeated, but the answers were buried across docs and old threads. Each unanswered post created more support load and more duplicated work.

Figure: Zeabur Forum "Latest Posts" list showing recurring deployment and infrastructure questions

The fix was obvious in theory: a RAG-powered answer system grounded in Zeabur's own docs and previously accepted forum answers. The team wanted to evaluate it quickly, but in practice even a "simple" RAG evaluation takes significant time. You have to set up a vector database, build an embeddings pipeline, configure an LLM provider with proper key management, wire everything together end to end, and then iterate through multiple rounds of testing (chunking strategies, retrieval methods, reranking, and prompting) just to get something that works.

"Our forum question volume has surged, and the importance of RAG became obvious. With it, answering forum questions is going to be far more efficient. We wanted to work with a team that could quickly help us deploy a RAG that's smart and useful enough to actually help us."

Yuanlin Lin, Founder & CEO, Zeabur

Zeabur did not want to spend more engineering time evaluating infrastructure options or tuning the RAG algorithm. They needed an accurate RAG pipeline as soon as possible to reduce support load.

Why InsForge

When Zeabur evaluated RAG, InsForge was the fastest way to get to a credible test.

1. Everything in one place (embeddings + Postgres + pgvector) for agents

Embeddings, chat completions, and reranking run through InsForge's AI Model Gateway, and vector search runs in the same managed Postgres via pgvector. Zeabur didn't need to sign up for multiple providers or stand up a separate vector database just to run an evaluation. Once they connected Claude Code to InsForge, InsForge could spin up the end-to-end pipeline in one go.

2. Agent-friendly iteration (schemas + experiments)

InsForge exposes the full surface area to agents through Skills + CLI, so an agent can set up the pipeline end-to-end and quickly try different schemas and retrieval setups. That made it easy to run multiple experiments (chunk sizes, top-K, reranking, and prompts) and converge on what worked.
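The kind of experiment loop described above can be sketched as a small grid sweep. This is an illustrative sketch, not Zeabur's or InsForge's tooling: `recall_at_k`, `sweep`, and the `evaluate` callback are hypothetical names, and the metric (recall@K against a hand-labeled set of relevant chunk IDs) is an assumed choice, since the article does not say how runs were scored.

```python
from itertools import product
from typing import Callable


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant chunk IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def sweep(
    chunk_sizes: list[int],
    top_ks: list[int],
    rerank_options: list[bool],
    evaluate: Callable[[int, int, bool], float],
) -> list[tuple[dict, float]]:
    """Score every combination of settings and return them best-first.

    `evaluate` is a hypothetical callback that re-indexes with the given
    chunk size, runs the eval questions, and returns an average score.
    """
    results = []
    for size, k, rerank in product(chunk_sizes, top_ks, rerank_options):
        config = {"chunk_size": size, "top_k": k, "rerank": rerank}
        results.append((config, evaluate(size, k, rerank)))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

In use, `evaluate` would wrap whatever the agent set up for indexing and retrieval; the sweep itself only enumerates configurations and ranks them, which keeps each experiment cheap to add or drop.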

"Staying on one platform was the unlock. We could iterate and evaluate on RAG without coordinating four separate systems. All it takes is a couple of prompts."

Yuanlin Lin, Founder & CEO, Zeabur

The Solution

Zeabur built the forum's RAG layer directly on InsForge and integrated it into their existing forum agent, so the agent can pull the best context (docs and accepted answers) before responding.

Threads, replies, users, and tags live in standard Postgres tables. Docs pages and accepted forum answers are chunked, embedded through the AI Model Gateway, and stored in a vector column alongside the source row. For each new question, the forum runs a single SQL query to retrieve the top-K relevant chunks from the same Postgres database that backs the forum.
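The storage and retrieval flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Zeabur's actual schema or code: the table name `kb_chunks`, its columns, the 1536-dimension vector, the word-based chunker, and the psycopg-style named placeholders are all hypothetical. The only pieces taken from pgvector itself are the `vector` column type, the `[x,y,...]` text format, and the `<=>` cosine-distance operator.

```python
# Hypothetical schema: one row per chunk, with the embedding stored next to its text.
CREATE_TABLE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS kb_chunks (
    id         bigserial PRIMARY KEY,
    source     text NOT NULL,   -- e.g. 'forum', 'docs', 'blog', 'changelog'
    source_id  text NOT NULL,   -- id of the docs page or accepted answer
    content    text NOT NULL,
    embedding  vector(1536)     -- dimension depends on the embedding model (assumed)
);
"""

# A single query for top-K retrieval; <=> is pgvector's cosine-distance operator.
TOP_K_SQL = """
SELECT id, source, content,
       embedding <=> %(query_embedding)s::vector AS distance
FROM kb_chunks
ORDER BY embedding <=> %(query_embedding)s::vector
LIMIT %(k)s;
"""


def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks of `size` words, overlapping by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def to_pgvector_literal(embedding: list[float]) -> str:
    """Format an embedding in pgvector's text form, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```

Because the chunks live in the same database as threads and replies, the retrieval query can be joined or filtered against forum tables (for example, by tag or source) without a second system in the loop.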

When a user posts a question, the forum retrieves the most relevant chunks and feeds them into Zeabur's agent. Grounded in that context (and patterns from previously resolved threads), the agent can answer questions more accurately and consistently.
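How retrieved chunks might be handed to the agent can be sketched like this. The prompt wording, the `RetrievedChunk` type, the `build_prompt` helper, and the character budget are assumptions made for illustration; the article does not describe Zeabur's actual prompt or agent interface.

```python
from dataclasses import dataclass


@dataclass
class RetrievedChunk:
    source: str      # e.g. 'docs' or 'forum' (hypothetical labels)
    content: str
    distance: float  # smaller means closer to the question


def build_prompt(
    question: str,
    chunks: list[RetrievedChunk],
    max_context_chars: int = 4000,
) -> str:
    """Assemble a grounded prompt from the closest chunks, within a size budget."""
    parts = []
    used = 0
    # Closest chunks first, numbered so an answer can point back to its sources.
    for i, chunk in enumerate(sorted(chunks, key=lambda c: c.distance), start=1):
        entry = f"[{i}] ({chunk.source}) {chunk.content}"
        if used + len(entry) > max_context_chars:
            break
        parts.append(entry)
        used += len(entry)
    context = "\n".join(parts) if parts else "(no relevant context found)"
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Ordering by distance and capping the context size are design choices in this sketch: they keep the most relevant material in the prompt when the retrieved set is larger than the model's practical context budget.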

"People get accurate answers immediately, and our team can focus on development work instead of repeating the same replies."

Yuanlin Lin, Founder & CEO, Zeabur

Figure: Zeabur Agent answering a service support question with grounded diagnostic context

By the Numbers

4,500 queries. 7,197 chunks. One agent-native backend.

Thirty days after launch, the InsForge-backed RAG stack is doing real production work:

~4,500 RAG queries served in 30 days. Every Zeabur Agent answer grounded in the live knowledge base, not a stale model snapshot.
150 queries/day on average, 205 at peak. Steady production traffic, not a demo.
7,197 indexed chunks across four sources: 3,004 from ~2,700 forum posts, 2,614 from docs, 956 from blog, 623 from changelog. All sitting in a single vector column.
82% of queries from Zeabur Agent (user-facing), 15% from Claude Code (internal dev). The same retrieval stack serves Zeabur's customers and Zeabur's own engineers building the product.
Zero side-car vector DBs. Postgres, embeddings, model gateway, and auth all running on InsForge.

What's Next

Zeabur is continuing to improve the data behind the system: better chunking and embeddings, cleaner sources, and faster re-indexing so the agent gets stronger context on every question. Next, they plan to have agents handle more than Q&A: triage, drafting, and follow-ups, all built on InsForge with coding agents.