Debugging on InsForge: New CLI Diagnostics & Agent-Powered Troubleshooting

04 Apr 2026 · 3 minute read
Junwen Feng

Engineer

InsForge CLI

Over the past three months, we've been closely following the issues developers share in our Discord community. We noticed a recurring pattern: many problems — slow queries, deployment failures, auth errors, mysterious 500s — could be diagnosed faster if developers (and their AI agents) had better visibility into what's happening on the backend.

So we built it. This post covers four new capabilities we've shipped to help you debug faster:

  1. Infrastructure observability API — real-time EC2 instance metrics
  2. Backend Advisor — automated daily health scans
  3. diagnose CLI command — one-stop diagnostics from your terminal
  4. insforge-debug skill — agent-native troubleshooting for AI coding assistants

1. Infrastructure Observability

Every InsForge project runs on a dedicated EC2 instance. Previously, you had no way to see what was happening at the infrastructure level — if your app slowed down, you were guessing.

Now, we expose real-time instance metrics through our API, and the CLI makes them accessible in one command:

```bash
npx @insforge/cli diagnose metrics --range 1h
```

This gives you:

  • CPU utilization — is your instance compute-bound?
  • Memory usage — are you running out of RAM?
  • Disk I/O — is storage a bottleneck?
  • Network throughput — are you saturating bandwidth?

You can adjust the time range (1h, 6h, 24h) to spot trends. For example, if your app gets slow every day at 2pm, --range 24h will show you the pattern.
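
To make the "slow every day at 2pm" case concrete, here is a minimal Python sketch that finds the busiest hour in a day of CPU samples. This post does not show the output format of `diagnose metrics`, so the `(timestamp, cpu_percent)` pairs, the sample values, and the `busiest_hour` helper are all assumptions for illustration, not the CLI's actual schema.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def busiest_hour(samples):
    """Return (hour, mean_cpu) for the hour of day with the highest mean CPU.

    `samples` is an iterable of (iso_timestamp, cpu_percent) pairs. This shape
    is hypothetical; adapt it to whatever the metrics output really contains.
    """
    by_hour = defaultdict(list)
    for timestamp, cpu_percent in samples:
        hour = datetime.fromisoformat(timestamp).hour
        by_hour[hour].append(cpu_percent)
    if not by_hour:
        raise ValueError("no samples provided")
    averages = {hour: mean(values) for hour, values in by_hour.items()}
    peak = max(averages, key=averages.get)
    return peak, averages[peak]


# Made-up samples: CPU is quiet in the morning and spikes around 14:00.
samples = [
    ("2026-04-03T09:00:00", 22.0),
    ("2026-04-03T09:30:00", 25.0),
    ("2026-04-03T14:00:00", 91.0),
    ("2026-04-03T14:30:00", 88.0),
    ("2026-04-03T20:00:00", 30.0),
]

hour, avg = busiest_hour(samples)
print(f"Peak hour: {hour}:00, mean CPU {avg:.1f}%")
```

Grouping by hour of day rather than by absolute time is what lets a recurring daily spike stand out once you have more than one day of data.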


2. Backend Advisor

We've added a Backend Advisor that automatically scans all active projects daily across three dimensions:

  • Security — exposed credentials, missing RLS policies, permissive auth configs
  • Performance — unindexed queries, table bloat, connection pool pressure
  • Health — disk usage, service availability, configuration drift

Each scan produces actionable findings with severity levels (critical / warning / info), so you know what to fix first.

```json
// $ npx @insforge/cli diagnose advisor --json
{
  "issues": [
    {
      "id": "3177a9de-2df7-4828-9548-10382a30a467",
      "ruleId": "security/dangerous-function",
      "severity": "critical",
      "category": "security",
      "title": "Dangerous SECURITY DEFINER function",
      "description": "Function 'schedules.encrypt_headers(p_headers jsonb)' is SECURITY DEFINER and callable by: public. It executes with the owner's privileges, which could escalate access.",
      "affectedObject": "schedules.encrypt_headers(p_headers jsonb)",
      "recommendation": "-- Revoke access from dangerous roles:\nREVOKE EXECUTE ON FUNCTION schedules.encrypt_headers(p_headers jsonb) FROM public;\n-- If the function must remain SECURITY DEFINER, lock down search_path to prevent hijacking:\nALTER FUNCTION schedules.encrypt_headers(p_headers jsonb) SET search_path = '';\n-- Consider converting to SECURITY INVOKER if owner privileges are not needed:\nALTER FUNCTION schedules.encrypt_headers(p_headers jsonb) SECURITY INVOKER;",
      "isResolved": false
    },
    {
      "id": "a4fbceee-6c65-432e-97ef-3d7b008f1dec",
      "ruleId": "performance/missing-fk-index",
      "severity": "warning",
      "category": "performance",
      "title": "Foreign key column missing index",
      "description": "Column 'user_id' on table 'todos' is a foreign key (fk_user_id_auth_users_id) but has no index. JOINs will require full table scans, and ON DELETE CASCADE will acquire a full table lock while scanning for referencing rows — blocking all writes to the table.",
      "affectedObject": "todos.user_id",
      "recommendation": "CREATE INDEX CONCURRENTLY idx_todos_user_id ON todos(user_id);",
      "isResolved": false
    }
  ]
}
```

Query advisor results from the CLI:

```bash
# Show all critical issues
npx @insforge/cli diagnose advisor --severity critical

# Filter by category
npx @insforge/cli diagnose advisor --category security

# Full scan results
npx @insforge/cli diagnose advisor
```
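
Because each finding carries a `severity` and an `isResolved` flag, the JSON output is straightforward to triage in a script. The Python sketch below orders unresolved findings so that critical ones come first. It relies only on field names visible in the sample output above; the `triage` helper and the trimmed sample data are illustrative and not part of the CLI.

```python
import json

# Lower rank = fix sooner. The three levels come from the advisor output.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def triage(report):
    """Return unresolved issues, most severe first."""
    open_issues = [i for i in report.get("issues", []) if not i.get("isResolved")]
    # Unknown severities sort last instead of raising.
    return sorted(
        open_issues,
        key=lambda issue: SEVERITY_RANK.get(issue.get("severity"), len(SEVERITY_RANK)),
    )


# Trimmed copy of the sample output, deliberately listed out of order.
raw = """
{
  "issues": [
    {"ruleId": "performance/missing-fk-index", "severity": "warning",
     "title": "Foreign key column missing index", "isResolved": false},
    {"ruleId": "security/dangerous-function", "severity": "critical",
     "title": "Dangerous SECURITY DEFINER function", "isResolved": false}
  ]
}
"""

ordered = triage(json.loads(raw))
for issue in ordered:
    print(f"[{issue['severity']}] {issue['ruleId']}: {issue['title']}")
```

In practice the input would be the saved output of `npx @insforge/cli diagnose advisor --json` rather than an inline string.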

3. The CLI Command

The new diagnose command brings all diagnostic capabilities together in one place:

```text
$ npx @insforge/cli diagnose --help

Usage: insforge diagnose [options] [command]

Backend diagnostics — run with no subcommand for a full health report

Options:
  -h, --help         display help for command

Commands:
  metrics [options]  Display EC2 instance metrics (CPU, memory, disk, network)
  advisor [options]  Display latest advisor scan results and issues
  db [options]       Run database health checks (connections, bloat, index usage, etc.)
  logs [options]     Aggregate error-level logs from all backend sources
```

Database Health Checks

Beyond metrics and advisor scans, diagnose db gives you real-time database health information:

```bash
npx @insforge/cli diagnose db
```

This runs checks across multiple dimensions:

| Check | What it tells you |
|---|---|
| connections | Active/idle connections, pool utilization |
| slow-queries | Currently running queries that exceed duration thresholds |
| bloat | Table/index bloat from dead tuples |
| size | Database and table sizes |
| index-usage | Tables with low index hit rates (sequential scan heavy) |
| locks | Blocked queries waiting on locks |
| cache-hit | Buffer cache hit ratio (low = too many disk reads) |

You can run specific checks if you already know where to look:

```bash
# Just check for slow queries and lock contention
npx @insforge/cli diagnose db --check slow-queries,locks

# Check if your indexes are being used
npx @insforge/cli diagnose db --check index-usage,cache-hit
```
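
If you script these checks, it can help to validate check names before shelling out. Here is a minimal Python sketch that builds the argument list for `diagnose db --check`. The check names are taken from the table above; the `db_check_command` helper itself is hypothetical and only assembles the command, it does not run it.

```python
# Check names as listed in the table above.
KNOWN_CHECKS = (
    "connections", "slow-queries", "bloat", "size",
    "index-usage", "locks", "cache-hit",
)


def db_check_command(checks=()):
    """Build the argv for `diagnose db`, optionally limited to some checks."""
    unknown = [name for name in checks if name not in KNOWN_CHECKS]
    if unknown:
        raise ValueError(f"unknown check(s): {', '.join(unknown)}")
    argv = ["npx", "@insforge/cli", "diagnose", "db"]
    if checks:
        # The CLI takes a comma-separated list, e.g. slow-queries,locks.
        argv += ["--check", ",".join(checks)]
    return argv


print(" ".join(db_check_command(["slow-queries", "locks"])))
```

The returned list can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting issues.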

Aggregated Error Logs

diagnose logs pulls error-level logs from all backend sources into a single view:

```bash
npx @insforge/cli diagnose logs
```

Instead of checking insforge.logs, postgREST.logs, postgres.logs, and function.logs one by one, this command aggregates errors across all of them — useful when you don't know where the problem is.
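
Conceptually, this is a merge of several time-ordered streams, filtered to error level. The Python sketch below illustrates that idea only. It is not how the CLI is implemented, and the `(timestamp, source, level, message)` record shape and the sample lines are invented for the example.

```python
import heapq

# Hypothetical records: (iso_timestamp, source, level, message).
# Each source is assumed to be sorted by timestamp already.
insforge_logs = [
    ("2026-04-03T14:00:01", "insforge.logs", "info", "request started"),
    ("2026-04-03T14:00:05", "insforge.logs", "error", "upstream returned 500"),
]
postgres_logs = [
    ("2026-04-03T14:00:03", "postgres.logs", "error", "deadlock detected"),
]
function_logs = [
    ("2026-04-03T14:00:04", "function.logs", "error", "function timed out"),
    ("2026-04-03T14:00:06", "function.logs", "warning", "slow cold start"),
]


def aggregate_errors(*sources):
    """Merge sorted log streams into one timeline, keeping only errors."""
    merged = heapq.merge(*sources, key=lambda record: record[0])
    return [record for record in merged if record[2] == "error"]


errors = aggregate_errors(insforge_logs, postgres_logs, function_logs)
for timestamp, source, _level, message in errors:
    print(f"{timestamp} {source}: {message}")
```

Seeing errors from different services interleaved on one timeline is what makes cause and effect visible, for example a database error appearing a second before an API failure.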


4. Agent-Powered Debugging with the insforge-debug Skill

CLI commands are powerful, but an AI coding agent can do something humans find tedious: systematically work through a diagnostic checklist, cross-reference multiple log sources, and trace errors across the full stack — from the frontend SDK call all the way to the backend logs.

We've distilled the patterns from three months of community support into an insforge-debug skill that teaches AI agents how to diagnose InsForge projects. The skill covers 9 common scenarios:

| Scenario | When it triggers |
|---|---|
| SDK error objects | { data: null, error: { code, message } } |
| HTTP status anomalies | 400 / 401 / 403 / 404 / 429 / 500 |
| Edge function failure | Runtime errors, timeouts |
| Slow database queries | Query hangs, connection pool exhaustion |
| Auth/authorization issues | Login fails, JWT expired, RLS denied |
| Realtime channel problems | WebSocket disconnect, messages lost |
| Backend performance degradation | High CPU/memory, all responses slow |
| Edge function deploy failure | Deploy command fails |
| Frontend deploy failure | Vercel build/deploy errors |

For each scenario, the skill tells the agent exactly which commands to run, in what order, and what information to look for. The agent doesn't need to guess — it follows a proven diagnostic path.
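
One way to picture such a diagnostic path is as a lookup from scenario to an ordered list of commands. The Python sketch below is an assumption about the general shape, not the skill's real contents: the scenario names come from the table above and the commands from this post, but the pairing of commands to scenarios is invented here.

```python
# Hypothetical playbook: scenario -> ordered commands to run.
PLAYBOOK = {
    "Slow database queries": [
        "npx @insforge/cli diagnose db --check slow-queries,locks",
        "npx @insforge/cli diagnose db --check index-usage,cache-hit",
    ],
    "Backend performance degradation": [
        "npx @insforge/cli diagnose metrics --range 1h",
        "npx @insforge/cli diagnose db --check connections",
        "npx @insforge/cli diagnose logs",
    ],
}

# Fallback when the scenario is not recognized: start with the full report.
DEFAULT_STEPS = ["npx @insforge/cli diagnose"]


def steps_for(scenario):
    """Return the ordered commands for a scenario, or the default report."""
    return PLAYBOOK.get(scenario, DEFAULT_STEPS)


for step in steps_for("Backend performance degradation"):
    print(step)
```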

Real-World Example

Here's a real case from yesterday. A developer hit this error while using the MiniMax M2.1 model:

The AI provider returned an empty response. Verify the configured model is available.

The agent loaded insforge-debug, followed the diagnostic path, and traced the issue in under a minute:

```text
> I'm getting "The AI provider returned an empty response" when using minimax-m2.1.
> Help me figure out why.

[Agent loads insforge-debug skill]

[Agent checks insforge.logs]
> Found: "No allowed providers are available for the selected model."
>   available_providers: [deepinfra, atlas-cloud, novita, nebius, fireworks, minimax, ...]
>   requested_providers: [xai, openai, alibaba, deepseek, anthropic, ...]

[Agent checks ai.configs table]
> minimax/minimax-m2.1 provider is set to "openrouter"

[Agent reads the routing code]
> chat-service.ts:527 — throws when AI stream returns empty content

Diagnosis:
  This is NOT a frontend code issue. The root cause is in the
  InsForge backend AI Gateway: the provider routing for
  minimax/minimax-m2.1 is misconfigured. In the ai.configs table
  this model's provider is set to "openrouter", but the Gateway's
  internal routing rules don't dispatch the request to a provider
  that actually serves the model — so the upstream returns nothing,
  and the frontend throws on the empty stream.

  Fix: This needs to be handled on the InsForge side — please
  contact InsForge support to correct the gateway routing for
  minimax-m2.1, or switch to a different model in the meantime.
```

The agent went from error message to root cause — across frontend code, backend logs, and database config — without any manual guidance. This is what the insforge-debug skill enables: the agent already knows the diagnostic playbook.


Getting Started

All of these features are available today. Use the latest CLI and try:

```bash
# Full health report
npx @insforge/cli diagnose

# Check specific areas
npx @insforge/cli diagnose metrics --range 24h
npx @insforge/cli diagnose advisor --severity critical
npx @insforge/cli diagnose db --check slow-queries,connections
npx @insforge/cli diagnose logs
```

For AI-assisted debugging, the insforge-debug skill is published and ready to use. Point your agent at it, describe the problem, and let it run the diagnostic playbook.

We built these tools because debugging shouldn't require guesswork. Whether you're running commands yourself or letting an agent do it, the information is now there — logs, metrics, health checks, and advisor scans — all one command away.

If you run into issues or have feedback, drop by our Discord community. The problems you share there are exactly what drives improvements like this one.