Penny-Pincher Provider

A curated list of affordable (or almost free) LLM providers for people who are unwilling to pay premium prices for AI.

Note: Claude is arguably the best AI coding assistant out there — but it’s expensive. If you want to try it before committing, you can use a guest pass for 1 week of free Claude Pro (includes Claude Code).

Guest passes are limited and available on a first-come, first-served basis. New users only — you’ll need to enter payment info to activate, but you can cancel before the trial ends to avoid charges. Learn more.

My passes:

Friend’s passes:

Got a spare Claude Code guest pass? You can help others try Claude Code for free by sharing your pass here — open a pull request adding your link under the Friend’s passes section, or open an issue and I’ll add it for you. Passes are first-come, first-served, so the more we pool together, the better.

If you know of other providers, feel free to create a pull request or open an issue here. I will review and add them when possible. Thank you!

If you find this helpful, you can support me by donating or registering with my referral links. Thank you!

Providers with Coding Plans

These providers offer coding-plan subscriptions: you prepay a monthly fee and use their LLM APIs within usage limits.

Z.ai

Offers three plans (starting at $18/month):

  1. Lite: 3× the usage of the Claude Pro plan
  2. Pro: 5× Lite usage
  3. Max: 4× Pro usage

Prices vary based on plan duration (monthly, quarterly, or yearly) and occasional promotional offers, so check the website for current pricing.

Notice (Apr 30, 2026): Auto-renewal on legacy plans (the version without weekly limits) is being disabled. Affected users receive a two-month gift of the equivalent new plan.

Models: GLM-5.1, GLM-5-Turbo. OpenAI-compatible API + Anthropic-compatible endpoint.
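Dual compatibility means the same models are reachable with either request shape. A minimal sketch of the two payload formats (helper names are illustrative; consult docs.z.ai for the actual endpoint paths):

```python
def openai_style_body(model: str, prompt: str) -> dict:
    """Chat payload in the OpenAI Chat Completions shape."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def anthropic_style_body(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Chat payload in the Anthropic Messages shape (max_tokens is required)."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

oai = openai_style_body("glm-5.1", "hello")
ant = anthropic_style_body("glm-5.1", "hello")
```

The practical upshot: tools built against the OpenAI API and tools built against the Anthropic API (such as Claude Code) can both point at the same subscription.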

Source: https://z.ai/subscribe, https://docs.z.ai/devpack/overview 1

Homepage: https://docs.z.ai/devpack/overview

My referral:

🚀 You’ve been invited to join the GLM Coding Plan! Enjoy full support for Claude Code, Cline, and 20+ top coding tools — starting at just $18/month. Subscribe now and grab the limited-time deal! Link: https://z.ai/subscribe?ic=PLKIAYEIPW

MiniMax

Offers a Token Plan — a unified subscription for multimodal AI (text, speech, video, image, music). Pricing is based on API calls, not tokens — very generous!

Plans (monthly):

Includes access to M2.7 language model, Speech 2.8, Image-01, Hailuo video, and Music-2.5. Yearly plans available with ~17% discount.

Source: https://platform.minimax.io/docs/token-plan/intro, https://platform.minimax.io/docs/guides/pricing-token-plan 2

Homepage: https://platform.minimax.io

My referral:

🎁 MiniMax Token Plan New Year Mega Offer! Invite friends and you both earn rewards: referred users get 10% off their subscription and join the dev ambassador community; referrers earn 10% back in API vouchers per paid referral (usable across all MiniMax models), plus priority access to events and model previews. The Token Plan Referral Program ends May 1, 2026. 👉 Get your referral link: https://platform.minimax.io/subscribe/token-plan?code=CAQ5sxHAq6&source=link

Kimi Code

A coding-focused perk included with Kimi membership — drops into any dev workflow (terminal, IDE, or Kimi CLI) and is backed by Moonshot’s Kimi K-series models, which are sharply priced per token.

Tiers:

Usage quotas are tracked over a rolling 5-hour window and scale with membership tier. A pay-as-you-go API is also available via platform.moonshot.ai.

Source: https://www.kimi.com/code, https://www.kimi.com/membership/pricing 3

Homepage: https://www.kimi.com/code

Alibaba Cloud Model Studio — Coding Plan

Monthly subscription for AI coding tools — top Qwen/Kimi/GLM/MiniMax models at fixed, predictable pricing.

Pro plan: $50/month

Models include qwen3.5-plus, qwen3-max, qwen3-coder, kimi-k2.5, glm-5, and MiniMax-M2.5.

Supported tools: Claude Code, Cursor, Cline (VS Code), OpenCode, Qwen Code, Kilo Code, Kilo CLI, OpenClaw, Codex, and more.

Note: the Lite plan stopped accepting new subscriptions on Mar 20, 2026.

Source: https://www.alibabacloud.com/help/en/model-studio/coding-plan 4

Homepage: https://www.alibabacloud.com/product/modelstudio

BytePlus ModelArk — Coding Plan

ByteDance’s ModelArk coding subscription — flat monthly fee, works with mainstream coding tools, models swappable per task.

Standard plans:

Models include latest ByteDance-Seed-2.0-pro/lite, DeepSeek-V3.2, GLM-5.1, GLM-4.7, Kimi-K2.5, and GPT-OSS variants.

Supported tools: Claude Code, Cursor, Cline (VS Code), Kilo Code, Roo Code, OpenCode, TRAE, and more.

Note: new-user first-purchase promo pricing was suspended on Mar 17, 2026 — everyone now pays the list price.

Source: https://www.byteplus.com/en/activity/codingplan, https://docs.byteplus.com/en/docs/ModelArk/1925114 5

Homepage: https://console.byteplus.com/ark

opencode — Go

A subscription tier for the open-source opencode CLI that pools access to ~10 open-source coding models behind one flat price — aimed at developers who want generous request limits without premium-provider fees.

Pricing:

Models include GLM-5.1, GLM-5, Kimi K2.6 (3× quotas through Apr 27), Kimi K2.5, MiMo-V2-Pro/Omni, Qwen3.5/3.6 Plus, MiniMax M2.5/M2.7.

Per-5-hour request limits vary by model tier (≈200 to 10,200).

API portability: The opencode-go API key is portable — works with Claude Code (via oc-go-cc or LiteLLM proxy), Cline (OpenAI-compatible), and any OpenAI-API-compatible client. Model IDs use opencode-go/<model-id> format.
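A tiny helper for the `opencode-go/<model-id>` naming convention (the function name is made up; the prefix format is from the docs above):

```python
PREFIX = "opencode-go/"

def qualify(model_id: str) -> str:
    """Prefix a bare model ID into the opencode-go/<model-id> form.

    Idempotent: already-qualified IDs pass through unchanged.
    """
    return model_id if model_id.startswith(PREFIX) else PREFIX + model_id
```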

Source: https://opencode.ai/go, https://opencode.ai/docs/go/ 6

Homepage: https://opencode.ai

Synthetic

Runs open-source AI models in private, secure datacenters.

Privacy-first inference: Synthetic never trains on your data and doesn’t store API prompts or completions.

Pricing:

Models include Kimi K2.5, MiniMax M2.5, GLM 5.1, GLM 4.7 Flash, plus any vLLM-compatible open-source LLM.

OpenAI-compatible — works with Roo, Cline, Octofriend, and any other OpenAI-API-compatible client.

Source: https://synthetic.new/ 7

Homepage: https://synthetic.new

Free Providers

Sorted by attractiveness — biggest recurring free quota, model quality, and lowest friction first.

LongCat AI

Meituan’s open-source LongCat models. The API platform is in public beta — no paid tier yet.

Free quota (resets daily 00:00 Beijing Time, no rollover):

Both OpenAI-compatible (https://api.longcat.chat/openai) and Anthropic-compatible (https://api.longcat.chat/anthropic) endpoints. 256K context on most models.
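Since the quota resets at a fixed Beijing-time midnight, a sketch of computing the next reset instant (Beijing observes no DST, so a fixed UTC+8 offset is exact year-round):

```python
from datetime import datetime, timedelta, timezone

# China Standard Time has no daylight saving, so UTC+8 is always correct.
BEIJING = timezone(timedelta(hours=8))

def next_quota_reset(now_utc: datetime) -> datetime:
    """Next 00:00 Beijing Time strictly after now_utc (when the free quota resets)."""
    local = now_utc.astimezone(BEIJING)
    midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)

# 10:00 UTC on Apr 28 is 18:00 in Beijing, so the next reset is Apr 29, 00:00 CST.
reset = next_quota_reset(datetime(2026, 4, 28, 10, 0, tzinfo=timezone.utc))
```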

Source: https://longcat.chat/platform/docs/ 8

Homepage: https://longcat.chat/platform

Mistral La Plateforme

Mistral’s developer platform. Free Experiment plan — full model access with rate-limited prototyping quotas.

Free tier:

OpenAI-compatible. Upgrade to Scale plan for production rate limits.

Source: https://docs.mistral.ai/deployment/ai-studio/tier, https://mistral.ai/pricing 9

Homepage: https://console.mistral.ai

Cerebras Cloud

World’s fastest LLM inference (wafer-scale chip), OpenAI-compatible. Free tier — no credit card required.

Free tier limits:

Pay-as-you-go tier removes daily and per-minute caps for higher throughput.

Source: https://inference-docs.cerebras.ai/support/rate-limits 10

Homepage: https://www.cerebras.ai/inference

xAI Grok API

xAI’s frontier Grok models — Grok 4, Grok 4.1 Fast (2M context), Grok Code Fast. OpenAI + Anthropic compatible at https://api.x.ai/v1.

Free tier (combined up to $175 in month one):

Pricing (Grok 4.1 Fast): $0.20/M input tokens, $0.50/M output tokens. Server-side tools (web search, code execution) cost an additional $5 per 1K calls.
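At those rates, a quick back-of-the-envelope estimator (illustrative helper; the constants are copied from the figures quoted above):

```python
# Grok 4.1 Fast list prices, as quoted above.
INPUT_PER_M = 0.20       # dollars per million input tokens
OUTPUT_PER_M = 0.50      # dollars per million output tokens
TOOL_CALL_PER_1K = 5.00  # server-side tools (web search, code execution)

def estimate_cost(input_tokens: int, output_tokens: int, tool_calls: int = 0) -> float:
    """Dollar cost of one workload at the listed rates."""
    return (
        input_tokens / 1_000_000 * INPUT_PER_M
        + output_tokens / 1_000_000 * OUTPUT_PER_M
        + tool_calls / 1_000 * TOOL_CALL_PER_1K
    )
```

For example, a million tokens each way costs about $0.70, which puts the $175 first-month free allowance in perspective.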

⚠️ Privacy caveat: Data Sharing opt-in lets xAI train future models on your prompts and responses. Opt-in is irreversible at the team level.

Source: https://docs.x.ai/developers/models, https://x.ai/api 11

Homepage: https://x.ai/api

OpenRouter

Free usage limits: If you’re using a free model variant (with an ID ending in :free), you can make up to 20 requests per minute. The following per-day limits apply:

  • If you have purchased fewer than 10 credits, you’re limited to 50 :free model requests per day.
  • If you have purchased at least 10 credits, your limit increases to 1,000 :free model requests per day.
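The two rules above as a small sketch (helper names are made up; the thresholds match the limits quoted):

```python
def free_model_daily_limit(credits_purchased: float) -> int:
    """Daily request cap for :free model variants, per OpenRouter's tiers."""
    return 1000 if credits_purchased >= 10 else 50

def is_free_variant(model_id: str) -> bool:
    """Free variants are identified by an ID ending in ':free'."""
    return model_id.endswith(":free")
```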

Source: https://openrouter.ai/docs/api/reference/limits 12

Homepage: https://openrouter.ai

Groq

Fast LPU inference, OpenAI-compatible. Free tier with no credit card.

Free tier (per-model, organization-level):

Upgrade to Developer plan for higher RPM/TPD, Batch and Flex processing.

Source: https://console.groq.com/docs/rate-limits 13

Homepage: https://groq.com

GitHub Models

Single-API gateway to OpenAI, Anthropic, Llama, Mistral, DeepSeek, Grok, Phi, and more — free for any GitHub account. OpenAI-compatible at https://models.github.ai/inference.

Free tier:

⚠️ Note: Copilot Pro/Pro+ migrate to usage-based billing on Jun 1, 2026, and new Copilot Pro/Pro+ signups paused since Apr 20, 2026 — monitor changes if relying on Copilot tier limits.

Source: https://docs.github.com/github-models/prototyping-with-ai-models, https://docs.github.com/billing/managing-billing-for-your-products/about-billing-for-github-models 14

Homepage: https://github.com/marketplace/models

NVIDIA NIM

NVIDIA-hosted inference for 50+ open models — free for NVIDIA Developer Program members, no credit card required. OpenAI-compatible API at https://integrate.api.nvidia.com/v1 works out of the box with Cline, Roo, OpenCode, and any OpenAI-compatible client.
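A sketch of building (but not sending) a chat-completions request against that endpoint with only the standard library; the model name and API key are placeholders:

```python
import json
import urllib.request

BASE_URL = "https://integrate.api.nvidia.com/v1"

def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Construct a POST to the OpenAI-style /chat/completions route."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("nvapi-...", "some/model", "hello")
# Sending would be: urllib.request.urlopen(req)
```

Because the route and payload follow the OpenAI convention, any OpenAI-compatible SDK works the same way by pointing its base URL at BASE_URL.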

Free access:

Paid self-hosted NIM containers and pay-as-you-go API are available for higher throughput; the hosted free tier is fine for evaluation and light coding use.

Source: https://build.nvidia.com, https://developer.nvidia.com/nim 15

Homepage: https://build.nvidia.com

Cloudflare Workers AI

Serverless inference on Cloudflare’s global edge network. Free tier on both Free and Paid Workers plans.

Free allowance:

Requires Cloudflare account API token + Account ID.
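Workers AI is invoked per account: the REST path embeds your Account ID and the model name. A sketch of the URL construction (the pattern follows Cloudflare's v4 API; the account ID below is a placeholder):

```python
API_BASE = "https://api.cloudflare.com/client/v4"

def run_url(account_id: str, model: str) -> str:
    """REST endpoint for invoking a Workers AI model for one account."""
    return f"{API_BASE}/accounts/{account_id}/ai/run/{model}"

url = run_url("YOUR_ACCOUNT_ID", "@cf/meta/llama-3.1-8b-instruct")
```

The API token goes in an `Authorization: Bearer <token>` header on the POST.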

Source: https://developers.cloudflare.com/workers-ai/platform/pricing/ 16

Homepage: https://developers.cloudflare.com/workers-ai/

Hugging Face Inference Providers

Routes requests across multiple inference backends (Together, Fireworks, Novita, Cerebras, Replicate, DeepInfra, Scaleway, etc.) behind a single API. OpenAI-compatible.

Free tier:

Sign-up free, no credit card. Other tasks (text-to-image, embeddings, speech) use HF inference clients.

Source: https://huggingface.co/docs/inference-providers/pricing, https://huggingface.co/changelog/inference-providers-openai-compatible 17

Homepage: https://huggingface.co/docs/inference-providers

Google Cloud Vertex AI (free trial credits)

Not a free-forever tier — but new GCP customers get $300 in free credits valid for 90 days, usable across Vertex AI for Gemini 3 Pro/Flash, Anthropic Claude on Vertex, and Vertex Partner models (DeepSeek, GLM, Qwen via MaaS).

Setup:

Vertex AI Express Mode (no billing required): New users can sign up for an Express-mode account with limited free quotas — no credit card needed for evaluation. See https://docs.cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview.

Practical for short-term heavy evaluation; not a long-term free path.

Source: https://cloud.google.com/free, https://cloud.google.com/vertex-ai/generative-ai/pricing 18

Homepage: https://console.cloud.google.com/vertex-ai

DeepSeek Platform

DeepSeek’s official API — flagship V4 / V3.2 / R1 models direct from the source. Notoriously cheap, no credit card to sign up.

Free tier:

Pricing (PAYG):

OpenAI + Anthropic compatible at https://api.deepseek.com.

Source: https://api-docs.deepseek.com/quick_start/pricing 19

Homepage: https://platform.deepseek.com

Scaleway Generative APIs

EU/GDPR-compliant inference hosted in Paris, France. Privacy-first — Scaleway does not log or train on inputs/outputs.

Free tier:

Source: https://www.scaleway.com/en/generative-apis/, https://www.scaleway.com/en/pricing/model-as-a-service/ 20

Homepage: https://console.scaleway.com

Kilo Code — Gateway

Open-source agentic coding extension for VS Code, JetBrains, and CLI. Its built-in Kilo Gateway routes LLM requests to any provider and ships with a genuine free path — no subscription required.

Free access:

Paid Kilo Pass tiers are available for higher throughput on premium models (Starter $19, Pro $49, Expert $199/month), but the free path covers most casual coding use.

Source: https://kilo.ai/pricing, https://kilo.ai/docs/getting-started/using-kilo-for-free, https://kilo.ai/docs/gateway/usage-and-billing 21

Homepage: https://kilo.ai

Pollinations AI

Open-source Gen-AI platform (Berlin) for text, image, audio, and video generation. OpenAI-compatible endpoints.

Free access (post-2026 key migration):

Source: https://github.com/pollinations/pollinations 22

Homepage: https://pollinations.ai

Together AI

Serverless inference for 200+ open-source models (Llama, Qwen, DeepSeek, Mixtral, etc.). OpenAI-compatible — drop-in via base URL change.
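The drop-in pattern in practice: only `base_url` and `api_key` change between OpenAI-compatible providers. Together's base URL below is an assumption (verify against its docs); the others are quoted in this list's other entries:

```python
OPENAI_COMPATIBLE_BASES = {
    "together": "https://api.together.xyz/v1",  # assumption; check Together's docs
    "deepinfra": "https://api.deepinfra.com/v1/openai",
    "deepseek": "https://api.deepseek.com",
    "xai": "https://api.x.ai/v1",
    "github-models": "https://models.github.ai/inference",
    "longcat": "https://api.longcat.chat/openai",
}

def client_config(provider: str, api_key: str) -> dict:
    """The only settings an OpenAI-compatible client needs to switch providers."""
    return {"base_url": OPENAI_COMPATIBLE_BASES[provider], "api_key": api_key}

cfg = client_config("deepseek", "sk-test")
```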

Free access:

Source: https://www.together.ai, https://www.together.ai/startup-accelerator 23

Homepage: https://www.together.ai

DeepInfra

Pay-per-token inference for 100+ open-source models. OpenAI-compatible endpoint at api.deepinfra.com/v1/openai.

Free access:

Best for low-cost production traffic, not free-forever.

Source: https://deepinfra.com/docs/deep_infra_api 24

Homepage: https://deepinfra.com

Fireworks AI

Fast inference for 50+ open-source models, plus tooling (function calling, MCP support, response API). OpenAI-compatible.

Free access:

Source: https://fireworks.ai/pricing, https://docs.fireworks.ai/tools-sdks/openai-compatibility 25

Homepage: https://fireworks.ai

Modal

Serverless GPU platform for deploying your own LLMs (vLLM, TGI, custom models). A different paradigm: this is not a pre-hosted LLM API — you bring and deploy the model yourself.

Free tier (Starter plan):

Use case: deploy any open-source LLM as your own OpenAI-compatible endpoint, full control over model + privacy.

Source: https://modal.com/pricing, https://modal.com/blog/how-to-deploy-vllm 26

Homepage: https://modal.com

  1. Checked on Apr 30, 2026 

  2. Checked on Apr 30, 2026 

  3. Checked on Apr 30, 2026 

  4. Checked on Apr 30, 2026 

  5. Checked on Apr 30, 2026 

  6. Checked on Apr 30, 2026 

  7. Checked on Apr 22, 2026 

  8. Checked on Apr 28, 2026 

  9. Checked on Apr 30, 2026 

  10. Checked on Apr 28, 2026 

  11. Checked on Apr 30, 2026 

  12. Checked on Mar 25, 2026 

  13. Checked on Apr 28, 2026 

  14. Checked on Apr 30, 2026 

  15. Checked on Apr 25, 2026 

  16. Checked on Apr 28, 2026 

  17. Checked on Apr 30, 2026 

  18. Checked on Apr 30, 2026 

  19. Checked on Apr 30, 2026 

  20. Checked on Apr 28, 2026 

  21. Checked on Apr 30, 2026 

  22. Checked on Apr 28, 2026 

  23. Checked on Apr 30, 2026 

  24. Checked on Apr 30, 2026 

  25. Checked on Apr 30, 2026 

  26. Checked on Apr 30, 2026