How Snowflake Cortex Billing Works
Unlike warehouse compute, which bills by the second, Cortex AI functions bill by the number of tokens processed. A token is a chunk of text, roughly 4 characters or 0.75 words of English. A single AI_COMPLETE call on one paragraph may process 200-500 tokens, so running it across millions of rows adds up quickly. Credit rates also differ by function and by model.
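The arithmetic above can be sketched as a rough estimator. This is a minimal sketch, not a billing calculator: the 4-characters-per-token figure is only the rule of thumb quoted above, the function names are hypothetical, and the credit rate is a placeholder that should be replaced with the per-million-token rate published for the model in use.

```python
import math

# Rule of thumb from the text; real tokenizers vary by model and language.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4 characters per token heuristic."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_credits(total_tokens: int, credits_per_million_tokens: float) -> float:
    """Convert a token count to credits at a given per-million-token rate."""
    return total_tokens / 1_000_000 * credits_per_million_tokens


# Hypothetical workload: one call per row, about 350 tokens per call.
tokens_per_call = 350
rows = 2_000_000
placeholder_rate = 1.0  # ASSUMPTION: not a real Snowflake rate

total_tokens = tokens_per_call * rows
estimated_credits = estimate_credits(total_tokens, placeholder_rate)
print(f"{total_tokens:,} tokens, about {estimated_credits:,.1f} credits")
```

Even at a modest per-call token count, the total is driven almost entirely by row count, which is why per-row calls on large tables deserve an estimate before they run.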
Cortex AI Functions and Their Cost Implications
Snowflake offers several Cortex functions, each with distinct cost characteristics. AI_COMPLETE / CORTEX.COMPLETE is the most commonly used LLM inference function; its cost depends on model choice and prompt length. AI_EMBED / CORTEX.EMBED_TEXT_768 (and EMBED_TEXT_1024) generates vector embeddings for semantic search; its cost scales with the volume of text embedded, so it grows with both document count and document length. Cortex Search charges per search request and for indexing. Cortex Analyst translates natural language to SQL, consuming one or more LLM calls per question. Document AI extracts structured data from unstructured documents and carries a high per-document credit cost.
Where Cortex Costs Spike Unexpectedly
The most common sources of unexpected Cortex spend are embedding pipelines that re-process the entire dataset on every run, AI_COMPLETE calls that run once per row across a large table, Cortex Search indexes that rebuild too frequently, and experimental workloads that were never cleaned up after the prototype phase.
How to Track Cortex Credit Consumption
Query SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY to see Cortex credit consumption by function, model, and time period. To attribute spend to users and to the queries that triggered the Cortex calls, join the per-query view CORTEX_FUNCTIONS_QUERY_USAGE_HISTORY with QUERY_HISTORY on the query ID. Set up a weekly monitoring query that compares Cortex spend week-over-week and flags large increases.
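The week-over-week comparison can be sketched as follows. This is an illustrative sketch only: it assumes daily credit totals have already been fetched from the usage view into (date, credits) pairs, and the function names, the sample figures, and the 25% alert threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(rows):
    """Sum credits per week, keyed by the Monday that starts each week."""
    totals = defaultdict(float)
    for day, credits in rows:
        week_start = day - timedelta(days=day.weekday())
        totals[week_start] += credits
    return dict(sorted(totals.items()))


def week_over_week(totals):
    """Return (week_start, credits, pct_change) tuples in week order.

    pct_change is None for the first week or when the prior week was zero.
    Weeks with no usage rows are simply absent, so a gap is compared
    against the last week that had data.
    """
    report = []
    previous = None
    for week_start, credits in totals.items():
        if previous is None or previous == 0:
            change = None
        else:
            change = (credits - previous) / previous * 100
        report.append((week_start, credits, change))
        previous = credits
    return report


def flag_spikes(report, threshold_pct=25.0):
    """List the weeks whose spend grew by more than threshold_pct."""
    return [week for week, _, change in report
            if change is not None and change > threshold_pct]


# Made-up sample data: (usage date, Cortex credits consumed that day).
sample_rows = [
    (date(2024, 1, 1), 10.0),
    (date(2024, 1, 3), 10.0),
    (date(2024, 1, 8), 15.0),
    (date(2024, 1, 10), 15.0),
]
report = week_over_week(weekly_totals(sample_rows))
spikes = flag_spikes(report)
print(report)
print(spikes)
```

The same grouping could be done directly in SQL; keeping the comparison in a small function makes the alert threshold easy to tune per team.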
Governance Patterns for Cortex AI Spend
Establish these controls before scaling Cortex workloads: require team approval before switching to a larger LLM, store embeddings in a table and re-embed only the records that changed, batch AI calls during off-peak hours instead of calling functions inline with user queries, run AI experiments on dedicated warehouses with resource monitors, and require a cost impact analysis before any Cortex workload moves to production.
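The "re-embed only the records that changed" control can be sketched with a content hash per record. This is a minimal sketch under stated assumptions: the hashes from the previous run are assumed to be stored alongside the embeddings, embed_fn is a hypothetical stand-in for whatever call actually produces the embedding, and the in-memory dictionaries stand in for tables.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a record's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def records_to_embed(records, stored_hashes):
    """Return only the records that are new or whose text has changed.

    records: dict of record id -> current text
    stored_hashes: dict of record id -> hash saved on the previous run
    """
    return {
        record_id: text
        for record_id, text in records.items()
        if stored_hashes.get(record_id) != content_hash(text)
    }


def run_incremental(records, stored_hashes, embed_fn):
    """Embed changed records only, then update the stored hashes in place."""
    new_embeddings = {}
    for record_id, text in records_to_embed(records, stored_hashes).items():
        new_embeddings[record_id] = embed_fn(text)  # the only billable step
        stored_hashes[record_id] = content_hash(text)
    return new_embeddings


# Usage with a fake embedder that records how often it is called.
calls = []

def fake_embed(text):
    calls.append(text)
    return [float(len(text))]  # placeholder vector, not a real embedding

docs = {"a": "first document", "b": "second document"}
hashes = {}
run_incremental(docs, hashes, fake_embed)   # first run embeds both records
docs["b"] = "second document, edited"
updated = run_incremental(docs, hashes, fake_embed)  # second run embeds only "b"
print(sorted(updated), len(calls))
```

Hashing is cheap relative to an embedding call, so the comparison step costs little even when every record is checked on every run.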
Monitor Cortex AI spend automatically with Anavsan
APEX detects Cortex credit anomalies, attributes spend by function and team, and alerts before AI experimentation costs spiral out of control.