LLM APIs in Practice: The Complete Path from Model Selection to Launch

Why Call the API Yourself

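AI chat products (ChatGPT, Kimi, Doubao) are easy to use, but if you want to build any of the following:

In-product intelligent Q&A
A code review bot
Automatic documentation generation
Multi-model comparison and routing

you need to call the API directly rather than rely on someone else's chat interface.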

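Comparing the Mainstream LLM APIs

Model	Provider	Price (input/output per 1M tokens)	Chinese	Coding	Notes
Claude 4 Sonnet	Anthropic	$3/$15	⭐⭐⭐	⭐⭐⭐⭐⭐	Strongest at code, long context
GPT-5	OpenAI	$5/$15	⭐⭐⭐	⭐⭐⭐⭐	Best all-rounder, mature ecosystem
DeepSeek V3	DeepSeek	¥1/¥2	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	Best value for money, top Chinese ability
Qwen 2.5 (Tongyi Qianwen)	Alibaba	¥2/¥6	⭐⭐⭐⭐	⭐⭐⭐	Alibaba ecosystem, enterprise friendly
ERNIE Bot 4.0 (Wenxin Yiyan)	Baidu	¥30/¥60	⭐⭐⭐⭐	⭐⭐⭐	Baidu ecosystem (on the expensive side)
Gemini 2.5 Pro	Google	$2.5/$10	⭐⭐⭐	⭐⭐⭐⭐	Strong multimodal, generous free tier

Prices are as of May 2026; every platform changes its pricing frequently.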

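Practice 1: OpenAI-Compatible Interfaces

Most Chinese-developed models are now compatible with OpenAI's API format, which means one set of code can switch between different models:

// ai-client.ts: an OpenAI-compatible client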
const AI_CLIENTS = {
  openai: {
    baseURL: 'https://api.openai.com/v1',
    apiKey: process.env.OPENAI_API_KEY,
  },
  deepseek: {
    baseURL: 'https://api.deepseek.com/v1',
    apiKey: process.env.DEEPSEEK_API_KEY,
  },
  qwen: {
    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
    apiKey: process.env.QWEN_API_KEY,
  },
}

async function chat(
  provider: keyof typeof AI_CLIENTS,
  model: string,
  messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>,
) {
  const client = AI_CLIENTS[provider]
  const res = await fetch(`${client.baseURL}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${client.apiKey}`,
    },
    body: JSON.stringify({
      model,
      messages,
      temperature: 0.7,
      max_tokens: 2048,
    }),
  })
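  // fetch does not reject on HTTP error statuses, so surface them explicitly
  if (!res.ok) throw new Error(`${provider} returned HTTP ${res.status}`)
  return res.json()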
}

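// Usage example
const reply = await chat('deepseek', 'deepseek-chat', [
  { role: 'system', content: 'You are a Flutter development assistant.' },
  { role: 'user', content: 'How do I implement a file download progress bar?' },
])

Switching models only requires changing the provider and model arguments.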
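
The `chat` function above returns the raw JSON body. As a minimal sketch, assuming the provider follows the OpenAI chat completions response shape (`choices[0].message.content`, an optional `usage` object, and an `error.message` field on failure), a small helper can pull out the reply text and token counts. The names `ChatCompletionResponse` and `extractReply` are hypothetical and introduced only for illustration.

```typescript
// Subset of the OpenAI-compatible chat completions response used here (assumed shape)
interface ChatCompletionResponse {
  choices?: Array<{ message?: { role: string; content: string | null } }>
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number }
  error?: { message: string }
}

// Hypothetical helper: extract the reply text and token counts, or throw
function extractReply(response: ChatCompletionResponse) {
  if (response.error) {
    throw new Error(`API error: ${response.error.message}`)
  }
  const text = response.choices?.[0]?.message?.content
  if (typeof text !== 'string') {
    throw new Error('Response contains no message content')
  }
  return {
    text,
    // usage may be missing on some responses; fall back to 0
    promptTokens: response.usage?.prompt_tokens ?? 0,
    completionTokens: response.usage?.completion_tokens ?? 0,
  }
}
```

With this in place, `extractReply(reply).text` gives the assistant's answer regardless of which compatible provider produced it.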

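Practice 2: Streaming Output

Users will not wait 10 seconds for a complete reply to appear, so stream the response as it is generated: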

async function chatStream(
  provider: keyof typeof AI_CLIENTS,
  model: string,
  messages: Array<{ role: string; content: string }>,
  onChunk: (text: string) => void,
) {
  const client = AI_CLIENTS[provider]
  const res = await fetch(`${client.baseURL}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${client.apiKey}`,
    },
    body: JSON.stringify({
      model,
      messages,
      stream: true,  // the key parameter
    }),
  })

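  if (!res.ok || !res.body) throw new Error(`${provider} returned HTTP ${res.status}`)
  const reader = res.body.getReader()
  const decoder = new TextDecoder()
  let buffer = ''  // holds a partial line when an SSE event is split across chunks

  while (true) {
    const { done, value } = await reader.read()
    if (done) break

    // { stream: true } keeps multi-byte characters intact across chunk boundaries
    buffer += decoder.decode(value, { stream: true })
    const lines = buffer.split('\n')
    buffer = lines.pop() ?? ''  // the last piece may be incomplete; keep it for the next read

    for (const line of lines) {
      if (!line.startsWith('data: ')) continue
      const data = line.slice('data: '.length).trim()
      if (data === '[DONE]') return
      const json = JSON.parse(data)
      const content = json.choices[0]?.delta?.content
      if (content) onChunk(content)
    }
  }
}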

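// Usage
await chatStream('deepseek', 'deepseek-chat', messages, (chunk) => {
  process.stdout.write(chunk)  // print each chunk as it arrives
})

On the frontend you can use EventSource or fetch + ReadableStream. EventSource only issues GET requests, so it suits a backend endpoint of your own that relays the stream as SSE (Server-Sent Events), while fetch + ReadableStream also works with POST. Either way, the reply appears incrementally and the experience improves dramatically.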
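
For the EventSource route, your own backend has to re-emit the model's chunks in SSE framing. The helpers below are a minimal sketch of that framing; `toSSEEvent`, `SSE_DONE`, and `fromSSEData` are hypothetical names, and the `[DONE]` sentinel simply mirrors the convention of the upstream API. JSON-encoding each chunk is a design choice that keeps newlines in the model output from breaking the one-line `data:` format.

```typescript
// Encode one text chunk as a Server-Sent Events message.
// An SSE message is one or more "data:" lines terminated by a blank line.
function toSSEEvent(chunk: string): string {
  // JSON.stringify escapes newlines, so the payload always fits on a single line
  return `data: ${JSON.stringify({ content: chunk })}\n\n`
}

// Sentinel message telling the client that the stream has finished
const SSE_DONE = 'data: [DONE]\n\n'

// Decode the payload of one message on the receiving side.
// Returns null when the sentinel is seen.
function fromSSEData(data: string): string | null {
  if (data === '[DONE]') return null
  return (JSON.parse(data) as { content: string }).content
}
```

In the browser, an `EventSource` `onmessage` handler receives the payload as `event.data`, which can be passed to `fromSSEData`. Close the source when it returns `null`, because EventSource reconnects automatically once the server ends the connection.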

Practice 3: Multi-Model Routing

A single model can fall flat on certain tasks, so add a simple router:

// Simple routing: pick a model by task type
function routeModel(task: string) {
  if (task === 'code_review' || task === 'debug') {
    return { provider: 'deepseek', model: 'deepseek-chat' }
  }
  if (task === 'creative_writing' || task === 'article') {
    return { provider: 'deepseek', model: 'deepseek-reasoner' }
  }
  if (task === 'translation') {
    return { provider: 'deepseek', model: 'deepseek-chat' }
  }
  // Default
  return { provider: 'deepseek', model: 'deepseek-chat' }
}

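// Advanced version: automatic fallback on failure
// Each provider expects its own model name rather than a shared 'auto' placeholder
const FALLBACK_CHAIN = [
  { provider: 'deepseek', model: 'deepseek-chat' },
  { provider: 'qwen', model: 'qwen-turbo' },
  { provider: 'openai', model: 'gpt-5' },
] as const

async function smartChat(
  messages: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>,
) {
  for (const { provider, model } of FALLBACK_CHAIN) {
    try {
      return await chat(provider, model, messages)
    } catch (err) {
      console.warn(`${provider} failed, trying the next one`, err)
    }
  }

  throw new Error('All models are unavailable')
}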
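
As the task list grows, the if-chain in `routeModel` can be replaced by a lookup table. The sketch below is one possible shape, not the only one: the `Task` union, `ROUTES` table, and `routeModelByTable` are hypothetical names, and the model assignments just copy the ones used in `routeModel`. Typing `provider` as a union also lets the result be passed to a function that expects a specific provider key.

```typescript
type Provider = 'openai' | 'deepseek' | 'qwen'
type Route = { provider: Provider; model: string }
type Task = 'code_review' | 'debug' | 'creative_writing' | 'article' | 'translation'

// Route used for any task that is not listed in the table
const DEFAULT_ROUTE: Route = { provider: 'deepseek', model: 'deepseek-chat' }

// One row per task; adding a task means adding a row, not another branch
const ROUTES: Record<Task, Route> = {
  code_review:      { provider: 'deepseek', model: 'deepseek-chat' },
  debug:            { provider: 'deepseek', model: 'deepseek-chat' },
  creative_writing: { provider: 'deepseek', model: 'deepseek-reasoner' },
  article:          { provider: 'deepseek', model: 'deepseek-reasoner' },
  translation:      { provider: 'deepseek', model: 'deepseek-chat' },
}

function routeModelByTable(task: string): Route {
  // Own-property check so that names like 'toString' do not hit the prototype
  const known = Object.prototype.hasOwnProperty.call(ROUTES, task)
  return known ? ROUTES[task as Task] : DEFAULT_ROUTE
}
```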

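Practice 4: Token Counting and Cost Control

Every call is billed, and without controls the month-end bill can be alarming:

// Estimate with the tiktoken or gpt-tokenizer library
// (gpt-tokenizer uses OpenAI encodings, so counts for other models are approximate)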
import { encode } from 'gpt-tokenizer'

function estimateCost(model: string, messages: Array<{ content: string }>) {
  const inputTokens = messages.reduce(
    (sum, m) => sum + encode(m.content).length, 0
  )

  const prices: Record<string, { input: number; output: number }> = {
    'deepseek-chat':    { input: 1, output: 2 },    // CNY per 1M tokens
    'qwen-turbo':       { input: 2, output: 6 },
    'gpt-5':            { input: 35, output: 105 },  // converted to CNY
  }

  const price = prices[model]
  if (!price) throw new Error(`No price configured for model: ${model}`)
  const inputCost = (inputTokens / 1_000_000) * price.input
  // Estimated output tokens (rough heuristic: 1.5x the input; tune for your workload)
  const outputCost = (inputTokens * 1.5 / 1_000_000) * price.output

  return {
    inputTokens,
    estimatedOutputTokens: inputTokens * 1.5,
    estimatedCost: inputCost + outputCost,
  }
}

// Estimate before calling
const cost = estimateCost('deepseek-chat', messages)
if (cost.estimatedCost > 0.5) {
  console.warn(`Estimated cost of this call: ¥${cost.estimatedCost.toFixed(3)}. Please confirm.`)
}
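
Pre-call estimates are only guesses. OpenAI-compatible responses normally also report actual consumption in a `usage` field (`prompt_tokens`, `completion_tokens`), so real spend can be tracked against a budget. The sketch below assumes that response shape; `BudgetTracker` and its method names are hypothetical, and the prices passed in should come from your own price table, such as the one in `estimateCost`.

```typescript
type Usage = { prompt_tokens: number; completion_tokens: number }
type Price = { input: number; output: number } // CNY per 1M tokens

// Hypothetical helper: accumulates actual spend and compares it to a limit
class BudgetTracker {
  private spent = 0
  private readonly limit: number
  private readonly prices: Record<string, Price>

  constructor(limit: number, prices: Record<string, Price>) {
    this.limit = limit
    this.prices = prices
  }

  // Record the actual usage of one finished call and return its cost
  record(model: string, usage: Usage): number {
    const price = this.prices[model]
    if (!price) throw new Error(`No price configured for model: ${model}`)
    const cost =
      (usage.prompt_tokens / 1_000_000) * price.input +
      (usage.completion_tokens / 1_000_000) * price.output
    this.spent += cost
    return cost
  }

  // Budget left, never negative
  get remaining(): number {
    return Math.max(0, this.limit - this.spent)
  }

  // True once total spend has reached the limit
  get exhausted(): boolean {
    return this.spent >= this.limit
  }
}
```

One way to use it is to check `exhausted` before each call and then switch to a cheaper model or reject the request. Setting `max_tokens` on the request, as `chat` already does, additionally caps the output side of each call.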