Currently, downloading the model used for inference can be expensive. For example, downloading Llama 3.3 70B has already taken over an hour (it is still running, so it may end up taking multiple hours). It would be great if preparation steps like this weren't billed, and only the actual compute time was.
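For scale, a rough back-of-envelope estimate shows why a download like this can run over an hour. The figures here are assumptions, not from the report above: roughly 70B parameters at 2 bytes each (bf16) is about 140 GB of weights, and a sustained download speed of around 30 MB/s is assumed for illustration:

```python
# Rough estimate of download time for a 70B-parameter model.
# Assumed values (for illustration only): bf16 weights (2 bytes/param)
# and a sustained ~30 MB/s download speed.
PARAMS = 70e9                # ~70 billion parameters
BYTES_PER_PARAM = 2          # bf16 precision
BANDWIDTH = 30e6             # bytes per second, assumed

size_bytes = PARAMS * BYTES_PER_PARAM
hours = size_bytes / BANDWIDTH / 3600
print(f"~{size_bytes / 1e9:.0f} GB, ~{hours:.1f} hours")  # ~140 GB, ~1.3 hours
```

At faster link speeds the wall-clock time drops, but for a model this size the download is still long enough that billing it the same as GPU compute is noticeable.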