BitterMill

00 / Real Macs. Real models.

See what a Mac can really run.

BitterMill gives you on-demand access to Apple-silicon machines running serious open models, so you can decide what feels fast enough, what fits in memory, and whether local inference is worth it before you buy hardware.

Choose a Mac. Choose a model. Start a session.

01 / Why BitterMill

Benchmarks tell you speed. BitterMill tells you whether you would actually want to use the model on that machine.

Named Macs

Try models on actual Apple-silicon machine classes instead of abstract benchmark rows.

Choose the Mac you care about and see what it is really like to use.

Live behavior

See cold starts, warm starts, queueing, and responsiveness instead of just a tokens-per-second claim.

The difference between “it runs” and “it feels good” is the whole story.

Clear economics

Pay separately for model load, warm hold, and generation instead of one vague price that hides everything.

You can see exactly what convenience and low latency cost.

02 / Use it

From curiosity to conviction in one session.

Run it live

Open the model on the Mac you actually care about.

Choose a model and machine class, then use it directly instead of guessing from secondhand charts.

Compare machines

See what changes when you move up in memory.

Run the same model across 64 GB and 128 GB Macs and see the difference in load, headroom, and feel.

Keep it ready

Hold a good model warm when low latency matters.

A fast warm session feels different from a cold one. BitterMill makes that tradeoff visible and rentable.

03 / Fleet

Choose the class of Mac you want to understand.

Available now

M4 Max Mac Studio

64 GB unified memory

The practical desktop test: how far can a serious Mac go before you need more memory or a different class of machine?

Available now

M5 MacBook

128 GB unified memory

The high-headroom test: best for larger open models, more ambitious workloads, and finding out what happens when memory stops being the first limit.

Coming next

More Apple silicon classes

The fleet expands over time, so you can test more of the Apple-silicon range without owning every machine yourself.

04 / Pricing

Pay for the part you actually use.

BitterMill separates setup cost from runtime cost so the economics stay legible.

Load

Spin up the model you want on the machine you chose.

Warm hold

Keep a useful model resident when low latency matters.

Generation

Pay for the actual run, not for folklore around the infrastructure.

Priority

Move faster when you need guaranteed access to scarce machine memory.

05 / Access

Request early access.

Tell us what you want to run and what Mac question you need answered.

Access is approved manually while the fleet grows.