No Such Thing as a Free Lunch
- May 5
- 6 min read
For the last couple of years, AI coding tools have felt like an unusually good deal. For a relatively small monthly cost, developers could use powerful models, large context windows, coding agents, code review tools, and high-reasoning workflows. Even the more expensive models often felt cheap compared to the value they could produce.
That pricing model shaped behavior. When the cost difference was small enough to ignore, the rational move was obvious: use the best model, turn reasoning up, give it plenty of context, and ask it to do the thing. That pricing cushion is getting thinner, and the habits that made sense under cheap access are going to need a second look.
AI companies are moving more pricing toward actual usage. GitHub Copilot is moving to usage-based billing with AI Credits. OpenAI has moved Codex pricing toward API token usage. Microsoft is offering more pay-as-you-go Copilot options. Across the market, the direction is clear: AI usage is being metered more closely to the compute it consumes.
AI will still be a strong investment. But engineering leaders need to start paying attention now to how AI is being used inside development teams.

The subsidy phase trained expensive habits
Pricing is only part of the shift. Developers learned AI during a period where the cost signal was weak.
If the most capable model was available and the bill did not change much, why would a developer choose a smaller model? If high reasoning produced better answers and did not feel expensive, why turn it down? If the tool could accept a large context window, why spend extra time narrowing the context?
Those choices made sense under the old economics. Under usage-based pricing, the same habits can get expensive.
Small choices around prompts, agents, context, and model selection can add up quickly once usage is metered. The tricky part is that waste often looks like productivity while the developer is in flow.
Better usage beats blanket restriction
A bad management response would be telling developers to stop using AI so much. High AI usage can be a good sign. A developer with heavy AI usage may be moving faster, generating more tests, reviewing more code, debugging more effectively, and shipping more valuable work. In that case, the usage may be worth every dollar.
Unexplained usage is where managers should pay attention. One developer may be using a more expensive model because they are working through a difficult migration. Another may be burning tokens because they are asking vague questions, retrying prompts, and handing the agent too much context. Those situations should not be treated the same way.
Before asking teams to cut usage, leaders need to understand what normal usage looks like.
Start by figuring out what you actually have
Start with the current state. That sounds obvious, but many organizations skipped this step because AI started as a small subscription problem. A few Copilot licenses. A few Cursor users. Some ChatGPT seats. Maybe Claude. Maybe API usage in Azure OpenAI or another cloud account. Maybe an agent pilot running inside one team.
Then, quietly, it became part of the software delivery process.
The first move is inventory: identify which AI coding tools are in use, who owns them, how many seats are active, which teams use each tool, and the total monthly spend. Is the spend centralized, or is it scattered across procurement, cloud accounts, team budgets, personal subscriptions, and expense reports?
Do the right people even know how much the company is spending on AI-assisted development? That is where I would start. You cannot build an AI usage strategy from vibes and invoice fragments.
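Pulling those invoice fragments into one view does not require anything fancy. A minimal sketch, with entirely made-up tools, teams, and amounts, might roll scattered line items up by tool and by team:

```python
from collections import defaultdict

# Hypothetical invoice fragments pulled from different billing sources.
# Tool names, teams, and dollar amounts are illustrative only.
invoices = [
    {"source": "procurement", "tool": "Copilot",      "team": "platform", "usd": 1900.0},
    {"source": "cloud",       "tool": "Azure OpenAI", "team": "platform", "usd": 740.5},
    {"source": "expense",     "tool": "Cursor",       "team": "mobile",   "usd": 320.0},
    {"source": "expense",     "tool": "ChatGPT",      "team": "mobile",   "usd": 60.0},
]

def spend_by(key, rows):
    """Roll scattered line items up by one dimension (tool, team, or source)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["usd"]
    return dict(totals)

by_tool = spend_by("tool", invoices)
by_team = spend_by("team", invoices)
```

The point is not the code; it is that until someone can produce those two rollups for real, the company does not know what it is spending.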
Establish a baseline before making policy
That inventory should lead directly into a baseline: normal usage by team, role, project, or workflow. What does a typical month look like? Which tools are actually being used? Which ones are paid for but mostly idle? Where are the outliers?
Outliers are where the baseline becomes useful. Maybe one developer has high usage because they are extremely effective with AI. Maybe a team has high usage because they are deep in a test-generation push or a legacy modernization effort. Maybe another team has high usage because their workflow is poorly defined and their agents are wandering.
You will not know until you look. This is similar to cloud cost management: teams get better results when they understand usage patterns and connect spend to outcomes.
AI development cost is becoming an engineering operations problem, not just a procurement problem.
Tool sprawl will make this harder
There is another issue managers will need to deal with: every developer wants a different tool.
Some people prefer Copilot because it sits naturally in their IDE. Some prefer Cursor. Some like Claude for planning. Some like ChatGPT for debugging. Some want the latest agent tool because it feels more powerful for their workflow. Those preferences can be valid, but company-level tool sprawl creates real problems.
It fragments billing. It fragments governance. It makes training harder. It makes usage harder to compare. It makes security and data handling harder to reason about. It also makes it nearly impossible to answer a basic question: what are we actually getting for what we are spending?
At some point, companies will have to make a call. Pick the primary tool or platform that makes the most sense for the organization. Standardize where you can. Train people well. Accept that not everyone will be happy.
Developer preference matters, but it cannot be the only input. The company also has to care about security, supportability, usage visibility, procurement, model access, and cost control.
Many of these tools are converging around the same core patterns: chat, autocomplete, code edits, multi-file changes, repository-aware context, pull request help, and agentic execution. The differences that matter tend to be practical: model access, IDE integration, admin controls, data handling, reporting, and how well the tool fits the company’s workflow.
Companies should be careful about paying for five overlapping tools just because every developer has a personal favorite.
Developers need AI usage training
The next step is practical training, not generic “AI is the future” material or prompt engineering theater.
Developers need to understand model choice. They need to know when a smaller model is enough and when a frontier model is worth the cost. Right now, a lot of developers do not trust smaller models. Bigger feels safer. Under flat pricing, that was easy to accept. Under usage-based pricing, it becomes expensive.
A simple model-selection guide can help. Use smaller or cheaper models for simple explanations, boilerplate, documentation drafts, basic unit tests, formatting, syntax conversion, and summarizing small pieces of code. Use stronger models for ambiguous debugging, architecture tradeoffs, multi-file refactors, unfamiliar frameworks, performance-sensitive work, security-sensitive review, and tasks where being wrong is expensive.
The same applies to reasoning. Higher reasoning is useful when the problem actually requires deeper analysis: comparing approaches, finding edge cases, planning a migration, debugging a weird failure, or reviewing a design for failure modes. It is probably not needed to rename a method, summarize a function, write a simple fixture, or format a response.
Model selection needs to become part of engineering judgment.
Context is not free either
Context discipline is another habit teams will need to build. Large context windows are useful, but they can hide waste.
If a developer gives the model an entire repo when the task only needs a file, the model has to process a lot of irrelevant material. That can increase cost, slow responses, and sometimes make the answer worse because the model has more noise to sort through.
Start with the files, error messages, and constraints the model actually needs. Add context when the answer shows it needs more. That might mean pointing it to the relevant files, explaining the specific failure, including the error message, naming the constraints, and telling it what not to change.
A clear prompt is often cheaper than a vague one because it reduces exploration, retries, and unnecessary output. A vague “do the thing” prompt is convenient to type, but it gets expensive when the model has to infer the task, explore the repo, and retry.
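One way to make that habit concrete is to assemble prompts from parts instead of pasting a repo. The helper and example below are hypothetical, a minimal sketch of the idea rather than any tool's API:

```python
def build_focused_prompt(task, files, error=None, constraints=(), do_not_touch=()):
    """Assemble a prompt from only what the model needs: the task, the
    relevant files (path -> contents), the observed failure, and the
    boundaries. Everything else stays out of context."""
    parts = [f"Task: {task}"]
    if error:
        parts.append(f"Observed error:\n{error}")
    for rule in constraints:
        parts.append(f"Constraint: {rule}")
    for path in do_not_touch:
        parts.append(f"Do not modify: {path}")
    for path, contents in files.items():
        parts.append(f"--- {path} ---\n{contents}")
    return "\n\n".join(parts)

# Illustrative usage; file names and error text are made up.
example = build_focused_prompt(
    task="Fix the failing date parser",
    files={"utils/dates.py": "def parse_date(s): ..."},
    error="ValueError: unknown date format",
    constraints=["keep the public function signature"],
    do_not_touch=["utils/legacy.py"],
)
```

The act of filling in those fields is the discipline: if you cannot name the relevant files, the failure, and the constraints, the model is about to do that exploration for you, on the meter.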
Managers do not need to micromanage every prompt
This is where the message can get distorted. Managers do not need to review every prompt, turn AI usage into a surveillance exercise, or hold developers’ feet to the fire. That would be a fast way to make developers hate the whole program.
AI usage needs the same operating discipline teams already apply to other areas of their software development lifecycle (SDLC). Set a baseline. Monitor cost. Look at outliers. Understand which usage is producing value and which usage is waste. Create guidance for model choice, reasoning level, context size, and agent workflows. Reduce tool sprawl where it makes sense. Teach developers how pricing works.
There is going to be a lot of figuring it out. The best practices are still forming. Different teams will have different usage patterns. Some expensive workflows will be worth it. Some will not.
The teams that handle this well will avoid the two easy extremes: banning the most expensive models outright, or letting every developer use every tool however they want with no visibility.
Takeaway
AI coding tools are becoming part of how software gets built. That is exactly why the cost model matters.
So start with the basics: know what tools are in use, who is using them, what they cost, and where the outliers are.
Once you have that visibility, the next moves get a lot easier: set a baseline, train the team, and make model choice part of normal engineering judgment.
✌️Steven


