- We charge you for the build process where we set up your model environment. In this step, we set up a Python environment according to your parameters before downloading and installing the required apt packages, Conda and Python packages, as well as any model files you require.
You are only charged for a build if we need to rebuild your environment, i.e. you have run a build or deploy command and have changed your requirements, parameters or code. Note that we cache each of the steps in a build, so subsequent builds will cost substantially less than the first.
- The model runtime. This is the amount of time it takes your code to run from start to finish on each request. There are three costs to consider here:
- Cold start: This is the amount of time it takes to spin up servers, load your environment, connect storage, etc. This is part of the Cerebrium service and something we are working on every day to get as low as possible. We do not charge you for this!
- Model initialization: This part of your code sits outside of the predict function and only runs when your model incurs a cold start. You are charged for the amount of time it takes for this code to run. Typically, this is loading a model into GPU RAM.
- Predict runtime: This is the code inside your predict function, and it runs every time a request hits your endpoint.
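The billed runtime per request described above can be sketched as follows. This is an illustrative helper (the function name and parameters are our own, not part of any Cerebrium API): cold-start spin-up itself is free, initialization is billed only on a cold start, and predict time is billed on every request.

```python
def billed_seconds(init_seconds: float, predict_seconds: float, cold_start: bool) -> float:
    """Seconds of compute billed for a single request.

    - Cold-start spin-up time is free, so it never appears here.
    - Model initialization is billed only when a cold start occurs.
    - Predict runtime is billed on every request.
    """
    return (init_seconds if cold_start else 0.0) + predict_seconds

# Warm request: only the predict time is billed.
billed_seconds(8.0, 0.5, cold_start=False)  # -> 0.5
# Cold request: initialization plus predict time.
billed_seconds(8.0, 0.5, cold_start=True)   # -> 8.5
```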
- 24GB VRAM (A5000): $0.000356 per second
- 2 CPU cores: 2 * $0.0000532 per second
- 20GB memory: 20 * $0.00000659 per second
- 10GB persistent storage: 10 * $0.30 per month
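Putting the per-second rates above together, the compute cost of a workload can be estimated as a simple sum of rates times billed seconds. The sketch below uses the rates listed in this section; the function name and the example workload (1,000 requests at 0.5s each) are illustrative assumptions, not Cerebrium tooling.

```python
# Per-second rates from the pricing list above.
GPU_A5000_PER_SEC = 0.000356       # 24GB VRAM (A5000)
CPU_CORE_PER_SEC = 0.0000532       # per CPU core
MEMORY_GB_PER_SEC = 0.00000659     # per GB of memory
STORAGE_GB_PER_MONTH = 0.30        # per GB of persistent storage (billed monthly)

def compute_cost(billed_seconds: float, cpu_cores: int = 2, memory_gb: int = 20) -> float:
    """Estimated compute cost in dollars for a given number of billed seconds."""
    rate_per_sec = (
        GPU_A5000_PER_SEC
        + cpu_cores * CPU_CORE_PER_SEC
        + memory_gb * MEMORY_GB_PER_SEC
    )
    return billed_seconds * rate_per_sec

# Hypothetical workload: 1,000 requests at 0.5s predict time each,
# on the 2-core / 20GB configuration above.
compute_cost(1000 * 0.5)
```

Note that the persistent-storage rate is billed per month rather than per second, so it is kept separate from the per-request compute estimate.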