api.ollama.com

免费大模型额度 · 免费模型 · 社区口碑 · 风险评估

网站类型官方站(免费层)
状态可用
平台openai-compatible
口碑分78 / 100
风险等级
置信度82%
发现来源mnfst/awesome-free-llm-apis

网站介绍

Ollama 是知名的本地大模型运行时(开源 CLI/桌面端);api.ollama.com 指向其官方 Ollama Cloud 服务,把大模型卸载到 Ollama 自有云端运行,提供 Free/Pro($20)/Max($100) 三档,免费档可用但额度很小。

免费额度

免费档 $0,仅限轻度使用、1 个并发云模型;会话限额每 5 小时重置、周限额每 7 天重置,按 GPU 时间而非固定 token 计量。社区反馈免费额度很小(约相当于 25 万 token 输入级别即可能暂停),仅够小规模试用;付费 Pro($20/月) 给约 50 倍用量、3 并发,Max($100/月)10 并发。注意 Ollama 已多次下调/调整限额。

免费模型

gpt-oss:20bgpt-oss:120bdeepseek-v3.1:671bqwen3-coder:480bkimi-k2glm-4.6minimax-m2gemma3 系列

优点

缺点

风险点

社区口碑综述

Ollama 是广受认可的官方一方品牌,Ollama Cloud 为其付费云服务并附带很小的免费档,定位清晰、文档与定价公开透明。社区主要批评集中在免费档限额偏紧且多次收紧、云模型 token 上限争议,以及对部分模型(Qwen3.5)经阿里端点路由是否符合其零数据保留承诺的隐私质疑(官方未明确回应)。整体属低风险的大厂服务,但免费层对追求免费大模型的用户价值有限。

使用建议

作为官方一方服务可放心试用,适合用 OpenAI 兼容 API 临时体验云端大模型;但免费档额度很小,仅够小规模验证,长期/重度使用需升级到 $20 Pro。处理敏感数据前留意上游路由(如 Qwen3.5 经阿里)的隐私争议;以官网 ollama.com/pricing 实时限额为准。

最近探测

可达是 (HTTP 200)
延迟1431 ms
平台openai-compatible
模型端点35(公开)

社区提及 (9)

Hacker News (id=45774248) · negative
ollama cloud is likely to increase prices and/or decrease usage limit levels soon; the free offering has more restrictions than their $20/mo tier. Free tier gives roughly 250k tokens input before usage pauses — enough to try something very small.
GitHub issue #14279 · negative
Alibaba is currently the only endpoint available for Qwen3.5, and Alibaba retains prompts and responses — are Ollama Cloud users being routed to Alibaba APIs despite the privacy claims? (无官方明确回应)
Ollama Blog - Cloud models · positive
Cloud models (preview, 2025-09-19): qwen3-coder:480b-cloud, gpt-oss:120b-cloud, gpt-oss:20b-cloud, deepseek-v3.1:671b-cloud. Ollama's cloud does not retain your data.
Ollama Pricing (官方) · neutral
Free $0 (Light usage, 1 concurrent cloud model), Pro $20/mo (50x more than Free, 3 concurrent), Max $100/mo (10 concurrent). Session limits reset every 5h, weekly limits every 7 days.
Hacker News (id=45774248) · negative
ollama cloud is likely to increase prices and/or decrease usage limit levels soon; the free offering has more restrictions than their $20/mo tier. Free tier gives roughly 250k tokens input before usage pauses.
GitHub issue #14279 · negative
Alibaba is the only endpoint for Qwen3.5 and Alibaba retains prompts and responses — are Ollama Cloud users being routed to Alibaba APIs despite the privacy claims? (no clear official response)
GitHub issue #13089 · negative
New and frustrating very limiting max tokens on the cloud models to only 16,384.
Ollama Pricing (官方) · neutral
Free $0 (light usage, 1 concurrent cloud model), Pro $20/mo (50x more than Free, 3 concurrent), Max $100/mo (10 concurrent). Session limits reset every 5h, weekly limits every 7 days.
Ollama Blog - Cloud models · positive
Cloud models (preview): qwen3-coder:480b-cloud, gpt-oss:120b-cloud, gpt-oss:20b-cloud, deepseek-v3.1:671b-cloud. Ollama's cloud does not retain your data.

参考来源

In English

Summary: Ollama is the well-known local LLM runtime (open-source CLI/desktop app); api.ollama.com points to its official Ollama Cloud service, which offloads large models to Ollama's own cloud. It offers three tiers — Free / Pro ($20) / Max ($100) — and the free tier is usable but with very small allowances.

Free quota: Free tier $0, limited to light usage and 1 concurrent cloud model; session limits reset every 5 hours and weekly limits every 7 days, metered by GPU time rather than a fixed token count. Community feedback says the free allowance is very small (roughly a 250k-input-token level before it pauses), enough only for small trials; paid Pro ($20/mo) gives ~50x usage and 3 concurrency, Max ($100/mo) 10 concurrency. Note Ollama has revised/tightened limits several times.