Build a Real-Time Semantic Chat App with InsForge & pgvector

Most chat applications stop at keyword search. Developers now need semantic retrieval, real-time synchronization, and AI native infrastructure without stitching multiple providers together. Traditional stacks require manual setup for database, real-time messaging, vector search, hosting, and model integration. This slows iteration and reduces reliability when working with AI coding agents.

InsForge connects coding agents directly to a production backend through MCP. Benchmarks show measurable improvements. Agents complete backend tasks 1.6x faster, use 2.4x fewer tokens.

This tutorial builds a real-time semantic chat application using Gemini embeddings, pgvector, and InsForge from development to production.

What We Are Building

The Use Case: This application enables semantic messaging, where users can search conversations by meaning instead of exact keywords. It turns chat history into structured, vector-searchable memory using Gemini embeddings and pgvector. The same architecture can power team collaboration tools, support systems, and AI native knowledge platforms.

Here is what we will be building:

We are building a room based realtime chat application using InsForge as the backend platform.
Messages propagate instantly across connected clients through InsForge Realtime using WebSocket subscriptions.
Each message is converted into a 768-dimensional Gemini embedding and stored in PostgreSQL using the pgvector extension.
Semantic search retrieves contextually similar messages within a room using cosine distance over vector embeddings.
The backend infrastructure is agent operable through MCP, which allows AI coding agents to inspect and modify the database schema, extensions, triggers, and configuration directly from the IDE.
We use the InsForge CLI to link the local Next.js project to the InsForge backend, manage environment configuration, and deploy the full-stack application to InsForge hosting.

Step 1: Connect InsForge MCP

Create a new InsForge project from the dashboard.

Install InsForge MCP in the IDE and connect it to the project.

Confirm the MCP Connected indicator before proceeding.

This enables direct agent-level control over:

Database schema
Extensions
Realtime configuration
Functions and triggers

No manual SQL console work is required.

You can also use the InsForge CLI to link the local project. This connects the local codebase to the backend project for inspection and deployment.

Step 2: Enable pgvector

pgvector enables native vector storage and similarity search inside PostgreSQL.

MCP Prompt

text

Enable the PostgreSQL vector extension in this project.

Expected SQL:

sql

CREATE EXTENSION IF NOT EXISTS vector;

Step 3: Create Messages Table

Each message stores content and a Gemini embedding vector.

Gemini model used: embedding-001

Vector dimension: 768

MCP Prompt

text

Create a table called messages with:

- id uuid primary key default gen_random_uuid()
- room_id text not null
- content text not null
- embedding vector(768)
- created_at timestamptz default now()

Verify Schema

text

Show me the messages table schema.

Step 4: Configure Realtime Channel

Realtime channels define how messages are scoped per room.

The pattern room:% enables dynamic room subscriptions.

MCP Prompt

text

Insert a realtime channel pattern:

pattern: 'room:%'
description: 'Room based chat events'
enabled: true

Step 5: Create Realtime Trigger Function

PostgreSQL must publish events when a message is inserted.

This function calls realtime.publish().

MCP Prompt

text

Create a PostgreSQL function called notify_new_message.

It should call realtime.publish(
  'room:' || NEW.room_id,
  'NEW_message',
  jsonb_build_object(
    'id', NEW.id,
    'room_id', NEW.room_id,
    'content', NEW.content,
    'created_at', NEW.created_at
  )
)

Language plpgsql.
Security definer.
Return NEW.

Step 6: Attach AFTER INSERT Trigger

This connects database writes to realtime broadcast.

MCP Prompt

text

Create an AFTER INSERT trigger on messages table
that executes notify_new_message.

Step 7: Validate Realtime Pipeline

Insert a test message directly via MCP.

MCP Prompt

text

Insert a test message into messages table with:

room_id = 'test-room'
content = 'Hello world'
embedding = null

Then verify:

text

Show last 5 rows from realtime.messages.

If NEW_message appears, the trigger and realtime pipeline are active.

Step 8: Create Semantic Similarity Function

Semantic search requires cosine similarity over vectors.

Distance operator used: <=>

Metric: Cosine distance

Model: Gemini embedding-001

Dimension: 768

MCP Prompt

text

Create a SQL function called match_messages.

Parameters:
- query_embedding vector(768)
- match_room text

It should:
- Return id, content, created_at
- Filter by room_id = match_room
- Order by embedding <=> query_embedding
- Limit 5

Step 9: Add HNSW Index

HNSW enables fast approximate nearest neighbor search.

Operator class used: vector_cosine_ops

MCP Prompt

text

Create an HNSW index on messages.embedding
using vector_cosine_ops.

Inspecting with CLI (Optional)

The InsForge CLI can be used to:

Inspect schema
Check triggers
Verify indexes
Manage environment configuration
Deploy the application

Examples:

bash

# Run any SQL query
insforge db query "SELECT * FROM posts ORDER BY created_at DESC LIMIT 10"

# Explore your schema
insforge db tables
insforge db indexes
insforge db policies
insforge db triggers
insforge db functions

Refer detailed CLI commands here.

Step 10: Build the Application with Next.js

Use an AI coding agent to generate the Next.js application. The agent creates a Next.js 15 project with the App Router and TypeScript and connects it to the InsForge backend.

The frontend UI lives in page.tsx as a client component, while all embedding logic runs inside server-side API routes.

Key structure:

page.tsx handles UI, room switching, and WebSocket subscription.
/api/send generates a Gemini embedding using embedding-001 and inserts the message into PostgreSQL.
/api/search embeds the query and calls match_messages for cosine similarity ranking.
/api/messages/[roomId] fetches the latest 50 messages for a room.

Create a .env.local file with:

text

GEMINI_API_KEY=
NEXT_PUBLIC_INSFORGE_URL=
NEXT_PUBLIC_INSFORGE_ANON_KEY=

The Gemini key stays server-side.

The NEXT_PUBLIC_ keys allow the browser to connect to InsForge Realtime.

Refer to the repository for the full source code and structure.

Step 11: Deploy with InsForge CLI

Link the local project to the InsForge backend:

bash

insforge login
insforge link

Deploy the full-stack application:

bash

insforge deployments deploy

This deploys:

Next.js frontend
API routes
Environment configuration

The application becomes accessible via the InsForge-hosted domain.

Step 12: Test the Application

Normal Flow Test

Open the deployed URL.
Join the default room general.
Send a message.
The message appears instantly in the chat panel.
The message is not manually appended in the UI. It arrives through the WebSocket NEW_message event.

This confirms:

Gemini embedding executed
Message inserted into PostgreSQL
Trigger fired
Real-time broadcast delivered

Two-Window Realtime Test

Open the app in two browser windows.
Join the same room in both.
Send a message from one window.
The second window receives the message immediately without refresh.

This validates the room-scoped channel pattern room:<roomId> and InsForge Realtime WebSocket subscription.

Semantic Search Test

Send multiple messages with different topics.
Enter a semantic query in the search panel.
Results appear ranked with similarity scores between 0.0 and 1.0.

The query flow:

Query embedding generated with Gemini embedding-001
match_messages RPC executed
Cosine similarity computed using <=>
Results returned ordered by relevance

Inspecting in InsForge Dashboard

Open the InsForge dashboard and inspect:

messages table to confirm embeddings are stored as vector(768)
realtime.messages to verify NEW_message events

SQL editor to run cosine similarity queries manually

Functions tab to confirm match_messages
Indexes to confirm HNSW index exists

This verifies the full pipeline from embedding to broadcast to semantic retrieval.

Access the deployed live demo here.

Recap

MCP enables AI agents to control the backend directly from the IDE, including schema changes and infrastructure configuration.
Realtime messaging is database-driven, using PostgreSQL as the source of truth and InsForge for WebSocket delivery.
pgvector provides native semantic search through cosine similarity over 768-dimensional embeddings.
InsForge consolidates database, real-time, hosting, and deployment into a single operational platform.
The result is a prompt to production workflow optimized for AI-assisted development.

Conclusion

We built a real-time semantic chat application that combines PostgreSQL, pgvector, Gemini embeddings, and InsForge Realtime into a single cohesive system. We stored vector embeddings directly in Postgres, executed cosine similarity search inside the database, and broadcast messages instantly using database-driven real-time.

We managed schema, configuration, and deployment through MCP and the InsForge CLI without manual backend setup. This project demonstrates how AI coding agents and InsForge together reduce friction between idea and production.

If you are building AI native applications, try InsForge. It provides managed PostgreSQL, pgvector support, real-time messaging, authentication, storage, edge functions, MCP-based agent integration, and integrated hosting in one platform.

Connect your coding agent, define your backend in natural language, and deploy with the CLI in minutes. Start building real-time AI-powered applications without stitching multiple services together.

Try InsForge
Quickstart guide here