snowflake-platform by jezweb/claude-skills
npx skills add https://github.com/jezweb/claude-skills --skill snowflake-platform
Build and deploy applications on Snowflake's AI Data Cloud using the snow CLI, Cortex AI functions, Native Apps, and Snowpark.
pip install snowflake-cli
snow --version # Should show 3.14.0+
# Interactive setup
snow connection add
# Or create ~/.snowflake/config.toml manually
[connections.default]
account = "orgname-accountname"
user = "USERNAME"
authenticator = "SNOWFLAKE_JWT"
private_key_path = "~/.snowflake/rsa_key.p8"
snow connection test -c default
snow sql -q "SELECT CURRENT_USER(), CURRENT_ACCOUNT()"
Use when: building on Snowflake with the snow CLI, Cortex AI functions, Native Apps, or Snowpark.
Don't use when: building pure Streamlit-in-Snowflake apps (use the streamlit-snowflake skill).

Snowflake Cortex provides LLM capabilities directly in SQL. Functions are in the SNOWFLAKE.CORTEX schema.
| Function | Purpose | GA Status |
|---|---|---|
| COMPLETE / AI_COMPLETE | Text generation from prompt | GA Nov 2025 |
| SUMMARIZE / AI_SUMMARIZE | Summarize text | GA |
| TRANSLATE / AI_TRANSLATE | Translate between languages | GA Sep 2025 |
| SENTIMENT / AI_SENTIMENT | Sentiment analysis | GA Jul 2025 |
| AI_FILTER | Natural language filtering | GA Nov 2025 |
| AI_CLASSIFY | Categorize text/images | GA Nov 2025 |
| AI_AGG | Aggregate insights across rows | GA Nov 2025 |
-- Simple prompt
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'llama3.1-70b',
'Explain quantum computing in one sentence'
) AS response;
-- With conversation history
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'llama3.1-70b',
[
{'role': 'system', 'content': 'You are a helpful assistant'},
{'role': 'user', 'content': 'What is Snowflake?'}
]
) AS response;
-- With options
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'mistral-large2',
'Summarize this document',
{'temperature': 0.3, 'max_tokens': 500}
) AS response;
Available Models:

- llama3.1-70b, llama3.1-8b, llama3.2-3b
- mistral-large2, mistral-7b
- snowflake-arctic
- gemma-7b
- claude-3-5-sonnet (200K context)

Model Context Windows (Updated 2025):
| Model | Context Window | Best For |
|---|---|---|
| Claude 3.5 Sonnet | 200,000 tokens | Large documents, long conversations |
| Llama3.1-70b | 128,000 tokens | Complex reasoning, medium documents |
| Llama3.1-8b | 8,000 tokens | Simple tasks, short text |
| Llama3.2-3b | 8,000 tokens | Fast inference, minimal text |
| Mistral-large2 | Variable | Check current docs |
| Snowflake Arctic | Variable | Check current docs |
Token Math : ~4 characters = 1 token. A 32,000 character document ≈ 8,000 tokens.
Error : Input exceeds context window limit → Use smaller model or chunk your input.
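The 4-characters-per-token rule above can be turned into a quick pre-flight check before calling COMPLETE. A minimal sketch, assuming the window sizes from the table above; `estimate_tokens` is the rough heuristic, not the model's real tokenizer:

```python
# Rough token estimate: ~4 characters per token (heuristic, not a real tokenizer)
def estimate_tokens(text: str) -> int:
    return len(text) // 4

# Context windows from the table above (tokens)
CONTEXT_WINDOWS = {
    "claude-3-5-sonnet": 200_000,
    "llama3.1-70b": 128_000,
    "llama3.1-8b": 8_000,
    "llama3.2-3b": 8_000,
}

def fits_context(text: str, model: str, reserve_for_output: int = 500) -> bool:
    """Check whether a prompt is likely to fit a model's context window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOWS[model]

def chunk_text(text: str, max_tokens: int) -> list[str]:
    """Split text into chunks that each fit under max_tokens (by the heuristic)."""
    max_chars = max_tokens * 4
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

doc = "x" * 32_000                        # 32,000-character document
print(estimate_tokens(doc))               # 8000 tokens
print(fits_context(doc, "llama3.1-8b"))   # False - needs a bigger model or chunking
print(len(chunk_text(doc, 4_000)))        # 2 chunks for an 8K-window model
```

Chunking and summarizing per chunk (then summarizing the summaries) is a common workaround when a document exceeds every available window.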
-- Single text
SELECT SNOWFLAKE.CORTEX.SUMMARIZE(article_text) AS summary
FROM articles
LIMIT 10;
-- Aggregate across rows (no context window limit)
SELECT AI_SUMMARIZE_AGG(review_text) AS all_reviews_summary
FROM product_reviews
WHERE product_id = 123;
-- Translate to English (auto-detect source)
SELECT SNOWFLAKE.CORTEX.TRANSLATE(
review_text,
'', -- Empty = auto-detect source language
'en' -- Target language
) AS translated
FROM international_reviews;
-- Explicit source language
SELECT AI_TRANSLATE(
description,
'es', -- Source: Spanish
'en' -- Target: English
) AS translated
FROM spanish_products;
Performance : As of September 2025, AI_FILTER includes automatic optimization delivering 2-10x speedup and up to 60% token reduction for suitable queries.
-- Filter with plain English
SELECT * FROM customer_feedback
WHERE AI_FILTER(
feedback_text,
'mentions shipping problems or delivery delays'
);
-- Combine with SQL predicates for maximum optimization
-- Query planner applies standard filters FIRST, then AI on smaller dataset
SELECT * FROM support_tickets
WHERE created_date > '2025-01-01' -- Standard filter applied first
AND AI_FILTER(description, 'customer is angry or frustrated');
Best Practice : Always combine AI_FILTER with traditional SQL predicates (date ranges, categories, etc.) to reduce the dataset before AI processing. This maximizes the automatic optimization benefits.
Throttling : During peak usage, AI function requests may be throttled with retry-able errors. Implement exponential backoff for production applications (see Known Issue #10).
-- Categorize support tickets
SELECT
ticket_id,
AI_CLASSIFY(
description,
['billing', 'technical', 'shipping', 'other']
) AS category
FROM support_tickets;
Cortex AI functions bill based on tokens processed (input + output).
Cost Management at Scale (Community-sourced):
A real-world production case study showed that a single AI_COMPLETE query processing 1.18 billion records cost nearly $5K in credits. Cost drivers to watch:
-- This seemingly simple query can be expensive at scale
SELECT
product_id,
AI_COMPLETE('mistral-large2', 'Summarize: ' || review_text) as summary
FROM product_reviews -- 1 billion rows
WHERE created_date > '2024-01-01';
-- Cost = (input tokens + output tokens) × row count × model rate
-- At scale, this adds up fast
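The cost formula in the comment above can be turned into a quick estimator before running a query at scale. A hedged sketch - the credit rate and dollar price below are placeholders, not real rates (check Snowflake's Cortex credit-consumption table and your contract):

```python
def estimate_cortex_cost(
    rows: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    credits_per_million_tokens: float,  # placeholder - varies by model
    dollars_per_credit: float,          # placeholder - varies by contract
) -> float:
    """(input + output tokens) x row count x model rate, converted to dollars."""
    total_tokens = (avg_input_tokens + avg_output_tokens) * rows
    credits = (total_tokens / 1_000_000) * credits_per_million_tokens
    return credits * dollars_per_credit

# 1 billion rows at ~250 tokens each adds up fast, even at a low hypothetical rate:
print(estimate_cortex_cost(1_000_000_000, 200, 50, 1.0, 3.0))  # 750000.0
```

Running the estimator (or the real query with a `LIMIT`-ed sample) before the full-table run is the cheapest way to catch a five-figure surprise.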
Best Practices :
Source : The Hidden Cost of Snowflake Cortex AI (Community blog with billing evidence)
Critical : Snowflake uses TWO account identifier formats:
| Format | Example | Used For |
|---|---|---|
| Organization-Account | irjoewf-wq46213 | REST API URLs, connection config |
| Account Locator | NZ90655 | JWT claims (iss, sub) |
These are NOT interchangeable!
SELECT CURRENT_ACCOUNT(); -- Returns: NZ90655
# Generate private key (PKCS#8 format required)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out ~/.snowflake/rsa_key.p8 -nocrypt
# Generate public key
openssl rsa -in ~/.snowflake/rsa_key.p8 -pubout -out ~/.snowflake/rsa_key.pub
# Get fingerprint for JWT claims
openssl rsa -in ~/.snowflake/rsa_key.p8 -pubout -outform DER | \
openssl dgst -sha256 -binary | openssl enc -base64
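The fingerprint the openssl pipeline produces can also be computed in Python from the DER-encoded public key bytes (e.g. the output of the `openssl ... -outform DER` command above written to a file). A sketch using only the standard library; the `rsa_key.der` filename is an assumption:

```python
import base64
import hashlib

def public_key_fingerprint(der_bytes: bytes) -> str:
    """Base64-encoded SHA-256 of the DER public key - the value after 'SHA256:'."""
    return base64.b64encode(hashlib.sha256(der_bytes).digest()).decode("ascii")

# Usage sketch with the DER file produced by the openssl command above:
# with open("rsa_key.der", "rb") as f:
#     print("SHA256:" + public_key_fingerprint(f.read()))
```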
-- In Snowflake worksheet (requires ACCOUNTADMIN or SECURITYADMIN)
ALTER USER my_user SET RSA_PUBLIC_KEY='MIIBIjANBgkq...';
iss: ACCOUNT_LOCATOR.USERNAME.SHA256:fingerprint
sub: ACCOUNT_LOCATOR.USERNAME
Example:
iss: NZ90655.JEZWEB.SHA256:jpZO6LvU2SpKd8tE61OGfas5ZXpfHloiJd7XHLPDEEA=
sub: NZ90655.JEZWEB
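Assembling those claims programmatically avoids format mistakes. A minimal sketch of the claim payload only - signing it with the private key (e.g. RS256 via a JWT library) is omitted, and the locator/user values are the examples above:

```python
from datetime import datetime, timedelta, timezone

def snowflake_jwt_claims(account_locator: str, user: str, fingerprint: str) -> dict:
    """Build the iss/sub claims Snowflake key-pair auth expects.

    account_locator: from SELECT CURRENT_ACCOUNT() - NOT the org-account format.
    fingerprint: base64 SHA-256 of the DER public key (see openssl commands above).
    """
    qualified_user = f"{account_locator.upper()}.{user.upper()}"
    now = datetime.now(timezone.utc)
    return {
        "iss": f"{qualified_user}.SHA256:{fingerprint}",
        "sub": qualified_user,
        "iat": int(now.timestamp()),
        "exp": int((now + timedelta(minutes=59)).timestamp()),  # lifetime must be <= 1 hour
    }

claims = snowflake_jwt_claims("NZ90655", "JEZWEB", "jpZO6LvU2SpKd8tE61OGfas5ZXpfHloiJd7XHLPDEEA=")
print(claims["sub"])  # NZ90655.JEZWEB
```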
New in January 2026 : Connector automatically detects and uses SPCS service identifier tokens when running inside Snowpark Container Services.
# No special configuration needed inside SPCS containers
import snowflake.connector
# Auto-detects SPCS_TOKEN environment variable
conn = snowflake.connector.connect()
This enables seamless authentication from containerized Snowpark services without explicit credentials.
Source : Release v4.2.0
# Initialize project
snow init
# Execute SQL
snow sql -q "SELECT 1"
snow sql -f query.sql
# View logs
snow logs
# Development
snow app run # Deploy and run locally
snow app deploy # Upload to stage only
snow app teardown # Remove app
# Versioning
snow app version create V1_0
snow app version list
snow app version drop V1_0
# Publishing
snow app publish --version V1_0 --patch 0
# Release Channels
snow app release-channel list
snow app release-channel add-version --channel ALPHA --version V1_0
snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
snow streamlit deploy --replace
snow streamlit deploy --replace --open
snow stage list
snow stage copy @my_stage/file.txt ./local/
my_native_app/
├── snowflake.yml # Project config
├── manifest.yml # App manifest
├── setup_script.sql # Installation script
├── app/
│ └── streamlit/
│ ├── environment.yml
│ └── streamlit_app.py
└── scripts/
└── setup.sql
definition_version: 2
native_app:
name: my_app
package:
name: my_app_pkg
distribution: external # For marketplace
application:
name: my_app
source_stage: stage/dev
artifacts:
- src: manifest.yml
dest: manifest.yml
- src: setup_script.sql
dest: setup_script.sql
- src: app/streamlit/environment.yml
dest: streamlit/environment.yml
- src: app/streamlit/streamlit_app.py
dest: streamlit/streamlit_app.py
enable_release_channels: true # For ALPHA/BETA channels
manifest_version: 1
artifacts:
setup_script: setup_script.sql
default_streamlit: streamlit/streamlit_app.py
# Note: Do NOT include privileges section - Native Apps can't declare privileges
Native Apps calling external APIs need this setup:
-- 1. Create network rule (in a real database, NOT app package)
CREATE DATABASE IF NOT EXISTS MY_APP_UTILS;
CREATE OR REPLACE NETWORK RULE MY_APP_UTILS.PUBLIC.api_rule
MODE = EGRESS
TYPE = HOST_PORT
VALUE_LIST = ('api.example.com:443');
-- 2. Create integration
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION my_app_integration
ALLOWED_NETWORK_RULES = (MY_APP_UTILS.PUBLIC.api_rule)
ENABLED = TRUE;
-- 3. Grant to app
GRANT USAGE ON INTEGRATION my_app_integration
TO APPLICATION MY_APP;
-- 4. CRITICAL: Attach to Streamlit (must repeat after EVERY deploy!)
ALTER STREAMLIT MY_APP.config_schema.my_streamlit
SET EXTERNAL_ACCESS_INTEGRATIONS = (my_app_integration);
Warning : Step 4 resets on every snow app run. Must re-run after each deploy!
When your Native App needs data from an external database:
-- 1. Create shared_data schema in app package
CREATE SCHEMA IF NOT EXISTS MY_APP_PKG.SHARED_DATA;
-- 2. Create views referencing external database
CREATE OR REPLACE VIEW MY_APP_PKG.SHARED_DATA.MY_VIEW AS
SELECT * FROM EXTERNAL_DB.SCHEMA.TABLE;
-- 3. Grant REFERENCE_USAGE (CRITICAL!)
GRANT REFERENCE_USAGE ON DATABASE EXTERNAL_DB
TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
-- 4. Grant access to share
GRANT USAGE ON SCHEMA MY_APP_PKG.SHARED_DATA
TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_APP_PKG.SHARED_DATA
TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
In setup_script.sql, reference shared_data.view_name (NOT the original database).
# 1. Deploy app
snow app run
# 2. Create version
snow app version create V1_0
# 3. Check security review status
snow app version list
# Wait for review_status = APPROVED
# 4. Set release directive
snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
# 5. Create listing in Snowsight Provider Studio (UI only)
| Status | Meaning | Action |
|---|---|---|
| NOT_REVIEWED | Scan hasn't run | Check DISTRIBUTION is EXTERNAL |
| IN_PROGRESS | Scan running | Wait |
| APPROVED | Passed | Can publish |
| REJECTED | Failed | Fix issues or appeal |
| MANUAL_REVIEW | Human reviewing | Wait (can take days) |
Triggers manual review : External access integrations, Streamlit components, network calls.
| Field | Max Length | Notes |
|---|---|---|
| Title | 72 chars | App name |
| Subtitle | 128 chars | One-liner |
| Description | 10,000 chars | HTML editor |
| Business Needs | 6 max | Select from dropdown |
| Quick Start Examples | 10 max | Title + Description + SQL |
| Data Dictionary | Required | Mandatory for data listings (2025) |
| # | Requirement |
|---|---|
| 1 | Full Snowflake account (not trial) |
| 2 | ACCOUNTADMIN role |
| 3 | Provider Profile approved |
| 4 | Stripe account configured |
| 5 | Provider & Consumer Terms accepted |
| 6 | Contact Marketplace Ops |
Note : Cannot convert free listing to paid. Must create new listing.
from snowflake.snowpark import Session
connection_params = {
"account": "orgname-accountname",
"user": "USERNAME",
"password": "PASSWORD", # Or use private_key_path
"warehouse": "COMPUTE_WH",
"database": "MY_DB",
"schema": "PUBLIC"
}
session = Session.builder.configs(connection_params).create()
# Read table
df = session.table("MY_TABLE")
# Filter and select
result = df.filter(df["STATUS"] == "ACTIVE") \
.select("ID", "NAME", "CREATED_AT") \
.sort("CREATED_AT", ascending=False)
# Execute
result.show()
# Collect to Python
rows = result.collect()
# WRONG - dict() doesn't work on Snowpark Row
config = dict(result[0])
# CORRECT - Access columns explicitly
row = result[0]
config = {
'COLUMN_A': row['COLUMN_A'],
'COLUMN_B': row['COLUMN_B'],
}
New in January 2026 : SnowflakeCursor.stats property exposes granular DML statistics for operations where rowcount is insufficient (e.g., CTAS queries).
# Before v4.2.0 - rowcount returns -1 for CTAS
cursor.execute("CREATE TABLE new_table AS SELECT * FROM source WHERE active = true")
print(cursor.rowcount) # Returns -1 (not helpful!)
# After v4.2.0 - stats property shows actual row counts
cursor.execute("CREATE TABLE new_table AS SELECT * FROM source WHERE active = true")
print(cursor.stats) # Returns {'rows_inserted': 1234, 'duplicates': 0, ...}
Source : Release v4.2.0
from snowflake.snowpark.functions import udf, sproc
# Register UDF
@udf(name="my_udf", replace=True)
def my_udf(x: int) -> int:
return x * 2
# Register Stored Procedure
@sproc(name="my_sproc", replace=True)
def my_sproc(session: Session, table_name: str) -> str:
df = session.table(table_name)
count = df.count()
return f"Row count: {count}"
The REST API is the foundation for programmatic Snowflake access from Cloudflare Workers.
https://{org-account}.snowflakecomputing.com/api/v2/statements
ALL requests must include these headers - missing Accept causes silent failures:
const headers = {
'Authorization': `Bearer ${jwt}`,
'Content-Type': 'application/json',
'Accept': 'application/json', // REQUIRED - "null" error if missing
'User-Agent': 'MyApp/1.0',
};
Even simple queries return async (HTTP 202). Always implement polling:
// Submit returns statementHandle, not results
const submit = await fetch(url, { method: 'POST', headers, body });
const { statementHandle } = await submit.json();
// Poll until complete - bound the loop so a non-200/202 response can't spin forever
for (let attempt = 0; attempt < 45; attempt++) {
  const res = await fetch(`${url}/${statementHandle}`, { headers });
  if (res.status === 200) break;        // Complete - body has the results
  if (res.status !== 202) {
    throw new Error(`Query failed with status ${res.status}`);
  }
  await sleep(2000);                    // 202 = still running
}
| Plan | Limit | Safe Polling |
|---|---|---|
| Free | 50 | 45 attempts @ 2s = 90s max |
| Paid | 1,000 | 100 attempts @ 500ms = 50s max |
Workers fetch() has no default timeout. Always pass an abort signal such as AbortSignal.timeout:
const response = await fetch(url, {
signal: AbortSignal.timeout(30000), // 30 seconds
headers,
});
Cancel queries when timeout occurs to avoid warehouse costs:
POST /api/v2/statements/{statementHandle}/cancel
See templates/snowflake-rest-client.ts for complete implementation.
Symptom : JWT auth fails silently, queries don't appear in Query History.
Cause : Using org-account format in JWT claims instead of account locator.
Fix : Use SELECT CURRENT_ACCOUNT() to get the actual account locator.
Symptom : API calls fail after snow app run.
Cause : External access integration attachment resets on every deploy.
Fix : Re-run ALTER STREAMLIT ... SET EXTERNAL_ACCESS_INTEGRATIONS after each deploy.
Symptom : ALTER APPLICATION PACKAGE ... SET DEFAULT RELEASE DIRECTIVE fails.
Cause : Legacy SQL syntax doesn't work with release channels enabled.
Fix : Use snow CLI: snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
Symptom : Files appear in streamlit/streamlit/ instead of streamlit/.
Cause : Directory mappings in snowflake.yml nest the folder name.
Fix : List individual files explicitly in artifacts, not directories.
Symptom : "A view that is added to the shared content cannot reference objects from other databases"
Cause : Missing GRANT REFERENCE_USAGE ON DATABASE for shared data.
Fix : Always grant REFERENCE_USAGE before snow app run when using external databases.
Symptom : "Unsupported Accept header null is specified" on polling requests.
Cause : Initial request had Accept: application/json but polling request didn't.
Fix : Use consistent headers helper function for ALL requests (submit, poll, cancel).
Symptom : Worker hangs indefinitely waiting for Snowflake response.
Cause : Cloudflare Workers' fetch() has no default timeout.
Fix : Always use AbortSignal.timeout(30000) on all Snowflake requests.
Symptom : "Too many subrequests" error during polling.
Cause : Polling every 1 second × 600 attempts = 600 subrequests exceeds limits.
Fix : Poll every 2-5 seconds, limit to 45 (free) or 100 (paid) attempts.
Symptom : Queries return statementHandle but never complete (code 090001 indefinitely).
Cause : Code 090001 means "still running", not an error. The warehouse IS resuming; it just takes time.
Fix : Auto-resume works. Wait longer or explicitly resume first: POST /api/v2/warehouses/{wh}:resume
Error : Long-running Python applications show memory growth over time
Source : GitHub Issues #2727, #2725
Affects : snowflake-connector-python 4.0.0 - 4.2.0
Why It Happens :

- SessionManager uses a defaultdict, which prevents garbage collection
- SnowflakeRestful.fetch() holds references that leak during query execution

Prevention : Reuse connections rather than creating new ones repeatedly. A fix is in progress via PR #2741 and PR #2726.
# AVOID - creates new connection each iteration
for i in range(1000):
conn = snowflake.connector.connect(...)
cursor = conn.cursor()
cursor.execute("SELECT 1")
cursor.close()
conn.close()
# BETTER - reuse connection
conn = snowflake.connector.connect(...)
cursor = conn.cursor()
for i in range(1000):
cursor.execute("SELECT 1")
cursor.close()
conn.close()
Status : Fix expected in connector v4.3.0 or later
Error : "Request throttled due to high usage. Please retry."
Source : Snowflake Cortex Documentation
Affects : All Cortex AI functions (COMPLETE, FILTER, CLASSIFY, etc.)
Why It Happens : AI/LLM requests may be throttled during high usage periods to manage platform capacity. Throttled requests return errors and require manual retries.
Prevention : Implement retry logic with exponential backoff:
import time
import snowflake.connector
def execute_with_retry(cursor, query, max_retries=3):
for attempt in range(max_retries):
try:
return cursor.execute(query).fetchall()
except snowflake.connector.errors.DatabaseError as e:
if "throttled" in str(e).lower() and attempt < max_retries - 1:
wait_time = 2 ** attempt # Exponential backoff
time.sleep(wait_time)
else:
raise
Status : Documented behavior, no fix planned