so anthropic dropped the batch processing pricing and i've been running numbers on it vs what we're paying now for sonnet calls. turns out if you're not in a hurry it's stupid cheap. like, stupid cheap
i'm in the middle of a k8s migration and we're using claude to parse logs, validate configs, and generate manifests. nothing that needs sub-second latency. we've been calling sonnet through the regular api and the bill keeps getting worse even after we supposedly optimized things
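for the curious, here's roughly what one of our log-analysis submissions looks like before it goes to the batch endpoint. the `custom_id` + `params` shape matches anthropic's Message Batches API, but the helper name (`build_log_requests`), the prompt wording, and the model id are all my own placeholders — swap in whatever current sonnet model id you're actually using:

```python
# sketch: package log chunks into Message Batches API request dicts.
# the custom_id/params shape follows anthropic's batches endpoint;
# build_log_requests, the prompt, and the model id are placeholders.

def build_log_requests(log_chunks, model="claude-3-5-sonnet-latest"):
    """Turn raw log chunks into a list of batch request dicts."""
    requests = []
    for i, chunk in enumerate(log_chunks):
        requests.append({
            "custom_id": f"log-{i}",  # lets us match results back to chunks later
            "params": {
                "model": model,
                "max_tokens": 1024,
                "messages": [{
                    "role": "user",
                    "content": f"summarize errors and anomalies in this log:\n\n{chunk}",
                }],
            },
        })
    return requests
```

the list this returns is what you'd hand to the batches create call (or dump to a file, depending on how you submit).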
the batch api is a flat 50% off standard api rates, and it stacks with prompt caching discounts. so for our log analysis and config validation work we're looking at maybe 1/4 of what we pay now. the catch is results can take up to 24 hours to come back, but that's fine for most of what we do
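the math is simple enough to sanity-check yourself. the rates below are the claude sonnet per-million-token prices as i last saw them ($3 in / $15 out, batch halving both) — double-check the current pricing page before you trust any numbers out of this:

```python
# back-of-envelope cost comparison, standard vs batch api.
# in_rate/out_rate are dollars per million tokens (sonnet pricing as i
# last saw it: $3 in / $15 out); batch applies a flat 50% discount.

def monthly_cost(input_mtok, output_mtok, in_rate=3.00, out_rate=15.00, batch=False):
    """Dollar cost for a month of usage, measured in millions of tokens."""
    discount = 0.5 if batch else 1.0
    return (input_mtok * in_rate + output_mtok * out_rate) * discount

# e.g. 200M input / 20M output tokens a month:
standard = monthly_cost(200, 20)              # 200*3 + 20*15 = 900.0
batched = monthly_cost(200, 20, batch=True)   # 450.0
```

note this only models the batch discount — the 1/4 figure above includes other cleanup on our side, so don't expect the function alone to get you there.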
the thing that got me though is how many teams probably don't even know this exists or think it's only for like, massive data processing jobs. nope. it's great for any async work. we could spin up a batch job at 5pm, results come back by 9am next day, feeds into our pipeline. literally saves thousands a month on the infrastructure side alone
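the "feeds into our pipeline" part is just parsing the results file in the morning. results come back as a .jsonl where each line carries your `custom_id` and a `result` object whose type is succeeded/errored/canceled/expired — that shape is per anthropic's docs as i understand them, but the sample in the test is hand-made, so verify against a real results file:

```python
import json

# sketch: fold batch results back into the pipeline. each line of the
# results .jsonl has a custom_id plus a result whose type is "succeeded",
# "errored", "canceled", or "expired". succeeded results carry a message
# whose content is a list of blocks; we keep only the text blocks.

def collect_results(jsonl_lines):
    """Map custom_id -> model text for succeeded requests, None otherwise."""
    out = {}
    for line in jsonl_lines:
        rec = json.loads(line)
        if rec["result"]["type"] == "succeeded":
            blocks = rec["result"]["message"]["content"]
            out[rec["custom_id"]] = "".join(
                b["text"] for b in blocks if b["type"] == "text"
            )
        else:
            out[rec["custom_id"]] = None  # retry or alert on these
    return out
```

keying everything off `custom_id` is the whole trick — it's the only link between what you submitted and what comes back, so make the ids deterministic.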
i think a lot of people are still using llms like they're paying per-second of thinking time. they're not. if you can batch it, your costs drop off a cliff. we switched 80% of our validation work to batch last week and the impact is already showing up
the timing is kind of funny because our aws bill keeps creeping up even after i thought we'd nailed everything. turns out the ai calls were like 15% of the monthly damage and i wasn't even tracking them properly. now that we're routing everything that doesn't need immediate results through batch, it's actually getting better
anyway if you're doing ops work or infrastructure stuff and you're not already using batch apis, go check it out. doesn't have to be anthropic but the pricing on theirs is honestly pretty aggressive right now