AI Gateway: Advanced Features and New Models

07 Mar 2026 · 4 minute read
Junwen Feng

Engineer

InsForge AI Gateway has always made it easy to call language models from your backend — a single API endpoint, no provider SDK required. Today we're shipping a major upgrade: five advanced features and a wave of new frontier models.

New Advanced Features

Web Search

Let your AI responses reference live information from the web. Enable webSearch in any chat completion request, and the gateway fetches relevant results before generating a response.

```bash
curl -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Tell me about SpaceX launches"}],
    "webSearch": {
      "enabled": true,
      "maxResults": 3,
      "searchPrompt": "Here are some relevant web search results about SpaceX:"
    }
  }'
```

You control the number of results and can customize the search prompt injected into the context.
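If the gateway returns an OpenAI-compatible completion body (an assumption here, not something this post states), you can pull the assistant's reply straight out of the response with jq:

```shell
# Sketch: send a web-search-enabled request and extract the reply.
# Assumes an OpenAI-style response shape ({"choices": [{"message": ...}]}).
curl -s -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Tell me about SpaceX launches"}],
    "webSearch": {"enabled": true, "maxResults": 3}
  }' | jq -r '.choices[0].message.content'
```

The `-s` flag silences curl's progress output so only the model's text reaches stdout.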

File Parser

Send PDFs, Word documents, and other files directly in your chat messages. The gateway extracts text content and passes it to the model — no preprocessing on your end.

```bash
curl -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Please summarize the content of this PDF document."},
        {"type": "file", "file": {"filename": "sample.pdf", "file_data": "https://example.com/sample.pdf"}}
      ]
    }],
    "fileParser": {
      "enabled": true,
      "pdf": {"engine": "pdf-text"}
    }
  }'
```

Tool Calling

Define functions that the model can invoke during a conversation. The gateway handles the tool-call protocol across providers — same request format whether you're using OpenAI, Anthropic, or Google models.

```bash
curl -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "Your prompt here"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
```
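Tool calling is a round trip: the model responds with a tool call, your code executes the function, and you send the result back for a final answer. A sketch of that second request, assuming the gateway follows the OpenAI-style tool-call protocol (the `id` value and weather payload below are illustrative, not real output):

```shell
# Hypothetical follow-up request after the model asked to call get_weather.
# The assistant's tool_calls entry is echoed back, followed by a "tool"
# message carrying your function's result, keyed by tool_call_id.
curl -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"},
      {"role": "assistant", "tool_calls": [{
        "id": "call_123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"location\": \"Paris\"}"}
      }]},
      {"role": "tool", "tool_call_id": "call_123", "content": "{\"temp_c\": 18, \"condition\": \"cloudy\"}"}
    ]
  }'
```

The model then produces a natural-language answer grounded in the tool result.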

Audio Input

Pass audio data directly in your messages for transcription, analysis, or voice-driven interactions — no separate transcription step needed.
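A minimal sketch of an audio request, assuming the gateway accepts OpenAI-style `input_audio` content parts with base64-encoded data — check the gateway docs for the exact schema your version expects:

```shell
# Sketch only: base64-encode a local WAV file and embed it in the message.
# (-w0 disables line wrapping on GNU base64; omit it on macOS.)
curl -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Transcribe this clip."},
        {"type": "input_audio", "input_audio": {"data": "'"$(base64 -w0 clip.wav)"'", "format": "wav"}}
      ]
    }]
  }'
```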

Embeddings

Generate vector embeddings for text inputs. Pass a single string or an array of strings and get back embedding vectors — ready for semantic search, clustering, or RAG pipelines.

```bash
curl -X POST "https://your-app.region.insforge.app/api/ai/embeddings" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": ["Hello, world!", "How are you?", "This is a test."]
  }'
```

New Models

We've added support for the latest frontier models across four providers:

| Provider  | Models                             |
| --------- | ---------------------------------- |
| OpenAI    | gpt-5.4, gpt-5.2                   |
| Anthropic | claude-opus-4.6, claude-sonnet-4.6 |
| Google    | gemini-3.1-pro                     |
| xAI       | grok-4                             |

All models are available through the same unified API — just change the model parameter. No SDK swaps, no config changes.
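For example, pointing the tool-calling request at Claude instead of GPT is a one-field change (the slug below follows the provider/model pattern used throughout this post):

```shell
# Same request shape as before — only the "model" field differs.
curl -X POST "https://your-app.region.insforge.app/api/ai/chat/completion" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Your prompt here"}]
  }'
```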

How It Works

Every feature works through the same /api/ai/chat/completion endpoint you already use. Add the feature-specific fields to your request body and the gateway handles the rest:

  • "webSearch": {"enabled": true} — attach live search results
  • "fileParser": {"enabled": true} — extract and pass file content
  • "tools": [...] — enable function calling
  • Use /api/ai/embeddings for vector embeddings

All of these features are fully supported in our TypeScript, Kotlin, and Swift SDKs — so you can use them natively in your web, Android, or iOS app without dropping down to raw HTTP calls. All requests are authenticated with your existing JWT token or anon key, and usage is tracked and billed through your project's AI credits.