Important prerequisite
Installing AI Skills requires a working proxy connection with TUN mode enabled. This directly determines whether the installation completes successfully.
modal-knowledge by josiahsiegel/claude-plugin-marketplace

```shell
npx skills add https://github.com/josiahsiegel/claude-plugin-marketplace --skill modal-knowledge
```

Comprehensive Modal.com platform knowledge covering all features, pricing, and best practices. Activate this skill when users need detailed information about Modal's serverless cloud platform.
Activate this skill when users ask about the Modal platform.
Modal is a serverless cloud platform for running Python code, optimized for AI/ML workloads.
```python
import modal

app = modal.App("app-name")

@app.function()
def basic_function(arg: str) -> str:
    return f"Result: {arg}"

@app.local_entrypoint()
def main():
    result = basic_function.remote("test")
    print(result)
```
| Parameter | Type | Description |
|---|---|---|
| image | Image | Container image configuration |
| gpu | str/list | GPU type(s): "T4", "A100", ["H100", "A100"] |
| cpu | float | CPU cores (0.125 to 64) |
| memory | int | Memory in MB (128 to 262144) |
| timeout | int | Max execution seconds |
| retries | int | Retry attempts on failure |
| secrets | list | Secrets to inject |
| volumes | dict | Volume mount points |
| schedule | Cron/Period | Scheduled execution |
| concurrency_limit | int | Max concurrent executions |
| container_idle_timeout | int | Seconds to keep containers warm |
| include_source | bool | Auto-sync source code |
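As a plain-Python illustration of the documented `cpu` and `memory` ranges in the table above, a small validator can catch out-of-range requests before deployment (this helper is hypothetical, not part of the Modal SDK):

```python
def validate_resources(cpu: float, memory_mb: int) -> None:
    """Check a resource request against the documented ranges:
    cpu 0.125-64 cores, memory 128-262144 MB. Illustrative only."""
    if not 0.125 <= cpu <= 64:
        raise ValueError(f"cpu must be in [0.125, 64], got {cpu}")
    if not 128 <= memory_mb <= 262144:
        raise ValueError(f"memory must be in [128, 262144] MB, got {memory_mb}")

validate_resources(2.0, 4096)  # a typical request: 2 cores, 4 GB
```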
| GPU | Memory | Use case | ~Cost/hr |
|---|---|---|---|
| T4 | 16 GB | Small inference | $0.59 |
| L4 | 24 GB | Medium inference | $0.80 |
| A10G | 24 GB | Inference/fine-tuning | $1.10 |
| L40S | 48 GB | Heavy inference | $1.50 |
| A100-40GB | 40 GB | Training | $2.00 |
| A100-80GB | 80 GB | Large models | $3.00 |
| H100 | 80 GB | Cutting-edge models | $5.00 |
| H200 | 141 GB | Largest models | $5.00 |
| B200 | 180+ GB | Latest generation | $6.25 |
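A back-of-envelope cost estimate can be derived from the table above. The rates below are the approximate hourly figures from the table; actual billing is usage-based and subject to change, and the helper itself is illustrative, not a Modal API:

```python
# Approximate on-demand GPU rates from the table above (USD per hour).
GPU_HOURLY_USD = {
    "T4": 0.59, "L4": 0.80, "A10G": 1.10, "L40S": 1.50,
    "A100-40GB": 2.00, "A100-80GB": 3.00, "H100": 5.00,
    "H200": 5.00, "B200": 6.25,
}

def estimate_cost(gpu: str, seconds: float, count: int = 1) -> float:
    """Rough cost for `count` GPUs of type `gpu` running for `seconds`."""
    return GPU_HOURLY_USD[gpu] / 3600 * seconds * count

# A 30-minute run on 4x A100-80GB:
print(round(estimate_cost("A100-80GB", 30 * 60, count=4), 2))  # → 6.0
```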
```python
# Single GPU
@app.function(gpu="A100")

# Specific memory variant
@app.function(gpu="A100-80GB")

# Multi-GPU
@app.function(gpu="H100:4")

# Fallbacks (tried in order)
@app.function(gpu=["H100", "A100", "any"])

# "any" = L4, A10G, or T4
@app.function(gpu="any")
```
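The fallback semantics above (try each listed GPU in order, with `"any"` matching the small-GPU pool) can be sketched locally. This resolver is a hypothetical stand-in for scheduling Modal performs server-side, not an actual SDK function:

```python
def resolve_gpu(requested, available):
    """Pick the first requested GPU that is available.
    'any' matches the small-GPU pool (L4/A10G/T4), mirroring the
    fallback semantics described above. Illustrative only."""
    ANY_POOL = ("L4", "A10G", "T4")
    wanted = [requested] if isinstance(requested, str) else requested
    for want in wanted:
        if want == "any":
            for gpu in ANY_POOL:
                if gpu in available:
                    return gpu
        elif want in available:
            return want
    return None  # nothing available

print(resolve_gpu(["H100", "A100", "any"], {"A10G", "T4"}))  # → A10G
```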
```python
# Debian slim (recommended)
modal.Image.debian_slim(python_version="3.11")
# From a Dockerfile
modal.Image.from_dockerfile("./Dockerfile")
# From a Docker registry
modal.Image.from_registry("nvidia/cuda:12.1.0-base-ubuntu22.04")

# pip (standard)
image.pip_install("torch", "transformers")
# uv (10-100x faster)
image.uv_pip_install("torch", "transformers")
# System packages
image.apt_install("ffmpeg", "libsm6")
# Shell commands
image.run_commands("apt-get update", "make install")

# Single file
image.add_local_file("./config.json", "/app/config.json")
# Directory
image.add_local_dir("./models", "/app/models")
# Python source
image.add_local_python_source("my_module")
# Environment variables
image.env({"VAR": "value"})

# Run a function at build time (e.g. to bake model weights into the image)
def download_model():
    from huggingface_hub import snapshot_download
    snapshot_download("model-name")

image.run_function(download_model, secrets=[...])
```
```python
# Create or reference a volume
vol = modal.Volume.from_name("my-vol", create_if_missing=True)

# Mount it in a function
@app.function(volumes={"/data": vol})
def func():
    # Read/write under /data
    vol.commit()  # Persist changes
```
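The key point of `vol.commit()` is that writes become durable and visible to other containers only after committing. A minimal local sketch of that staging behavior (`FakeVolume` is a made-up class for illustration, not Modal's Volume implementation):

```python
class FakeVolume:
    """Sketch of commit semantics: writes stay in a local staging area
    until commit() publishes them. Illustrative only."""
    def __init__(self):
        self.persisted = {}   # what other containers would see
        self._pending = {}    # local, uncommitted writes

    def write(self, path, data):
        self._pending[path] = data

    def commit(self):
        self.persisted.update(self._pending)
        self._pending.clear()

vol = FakeVolume()
vol.write("/data/a.txt", "hello")
print("/data/a.txt" in vol.persisted)  # → False (not committed yet)
vol.commit()
print(vol.persisted["/data/a.txt"])    # → hello
```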
```python
# From the dashboard (recommended)
modal.Secret.from_name("secret-name")
# From a dictionary
modal.Secret.from_dict({"KEY": "value"})
# From local environment variables
modal.Secret.from_local_environ(["KEY1", "KEY2"])
# From a .env file
modal.Secret.from_dotenv()

# Usage
@app.function(secrets=[modal.Secret.from_name("api-keys")])
def func():
    import os
    key = os.environ["API_KEY"]
```
```python
# Distributed dict
d = modal.Dict.from_name("cache", create_if_missing=True)
d["key"] = "value"
d.put("key", "value", ttl=3600)

# Distributed queue
q = modal.Queue.from_name("jobs", create_if_missing=True)
q.put("task")
item = q.get()
```
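To make the `ttl` parameter concrete, here is an in-process stand-in that expires keys after a time-to-live. `TTLDict` is a local illustration of the idea only; `modal.Dict` is a distributed object with its own implementation:

```python
import time

class TTLDict:
    """Local stand-in for a dict with per-key TTL, mimicking the
    d.put(key, value, ttl=...) pattern above. Illustrative only."""
    def __init__(self):
        self._data = {}

    def put(self, key, value, ttl=None):
        expires = None if ttl is None else time.monotonic() + ttl
        self._data[key] = (value, expires)

    def get(self, key, default=None):
        value, expires = self._data.get(key, (default, None))
        if expires is not None and time.monotonic() > expires:
            del self._data[key]  # lazily evict the expired entry
            return default
        return value

d = TTLDict()
d.put("key", "value", ttl=3600)
print(d.get("key"))  # → value
```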
```python
# Simple FastAPI endpoint
@app.function()
@modal.fastapi_endpoint()
def hello(name: str = "World"):
    return {"message": f"Hello, {name}!"}

# Full ASGI app (FastAPI)
from fastapi import FastAPI

web_app = FastAPI()

@web_app.post("/predict")
def predict(text: str):
    return {"result": process(text)}

@app.function()
@modal.asgi_app()
def fastapi_app():
    return web_app

# WSGI app (Flask)
from flask import Flask

flask_app = Flask(__name__)

@app.function()
@modal.wsgi_app()
def flask_endpoint():
    return flask_app

# Arbitrary server listening on a port
import subprocess

@app.function()
@modal.web_server(port=8000)
def custom_server():
    subprocess.run(["python", "-m", "http.server", "8000"])

# Custom domain
@modal.asgi_app(custom_domains=["api.example.com"])
```
```python
# Daily at 8 AM UTC
@app.function(schedule=modal.Cron("0 8 * * *"))

# With a timezone
@app.function(schedule=modal.Cron("0 6 * * *", timezone="America/New_York"))

# Fixed intervals
@app.function(schedule=modal.Period(hours=5))
@app.function(schedule=modal.Period(days=1))
```
Note: Scheduled functions only run when deployed with modal deploy, not with modal run.
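The cron strings above use the standard five fields (minute, hour, day-of-month, month, day-of-week). A rough sanity check for the simple forms used in this document can be written as follows; this hypothetical helper handles only `*` and plain numbers, not ranges or steps, and is not Modal's own validator:

```python
def is_valid_cron(expr: str) -> bool:
    """Rough check for five-field cron strings like '0 8 * * *'.
    Supports only '*' and single numbers; illustrative only."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    bounds = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]
    for field, (lo, hi) in zip(fields, bounds):
        if field == "*":
            continue
        if not field.isdigit() or not lo <= int(field) <= hi:
            return False
    return True

print(is_valid_cron("0 8 * * *"))  # → True
```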
```python
# Parallel execution (up to 1000 concurrent)
results = list(func.map(items))

# Unordered (faster)
results = list(func.map(items, order_outputs=False))

# Spread argument tuples
pairs = [(1, 2), (3, 4)]
results = list(add.starmap(pairs))

# Async job (returns immediately)
call = func.spawn(data)
result = call.get()  # Fetch the result later

# Spawn many
calls = [func.spawn(item) for item in items]
results = [call.get() for call in calls]
```
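The map/spawn semantics above parallel Python's own `concurrent.futures`: `func.map` behaves like an ordered executor map, while `spawn` plus `get` behaves like submitting futures and collecting results later. A local analogue (threads stand in for Modal containers; this is an illustration, not how Modal executes remotely):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(x):
    return x * x

items = [1, 2, 3, 4]
with ThreadPoolExecutor() as pool:
    # Ordered, like func.map(items)
    ordered = list(pool.map(square, items))

    # Submit-then-collect, like func.spawn(...) / call.get()
    futures = [pool.submit(square, x) for x in items]
    # as_completed yields in completion order, like order_outputs=False
    unordered = sorted(f.result() for f in as_completed(futures))

print(ordered)    # → [1, 4, 9, 16]
print(unordered)  # → [1, 4, 9, 16]
```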
```python
@app.cls(gpu="A100", container_idle_timeout=300)
class Server:
    @modal.enter()
    def load(self):
        self.model = load_model()

    @modal.method()
    def predict(self, text):
        return self.model(text)

    @modal.exit()
    def cleanup(self):
        del self.model

    @modal.concurrent(max_inputs=100, target_inputs=80)
    @modal.method()
    def batched(self, item):
        pass
```
```shell
modal run app.py                # Run a function
modal serve app.py              # Hot-reload dev server
modal shell app.py              # Interactive shell
modal shell app.py --gpu A100   # Shell with a GPU
modal deploy app.py             # Deploy
modal app list                  # List apps
modal app logs app-name         # View logs
modal app stop app-name         # Stop an app

# Volumes
modal volume create name
modal volume list
modal volume put name local remote
modal volume get name remote local

# Secrets
modal secret create name KEY=value
modal secret list

# Environments
modal environment create staging
```
| Plan | Price | Containers | GPU concurrency |
|---|---|---|---|
| Starter | Free ($30 credits) | 100 | 10 |
| Team | $250/month | 1000 | 50 |
| Enterprise | Custom | Unlimited | Custom |
- Use @modal.enter() for model loading
- Use uv_pip_install for faster builds
- Use order_outputs=False when output order doesn't matter
- Tune container_idle_timeout to balance cost and latency
- Test with modal run before modal deploy

```python
@app.cls(gpu="A100", container_idle_timeout=300)
class LLM:
    @modal.enter()
    def load(self):
        from vllm import LLM
        self.llm = LLM(model="...")

    @modal.method()
    def generate(self, prompt):
        return self.llm.generate([prompt])
```
```python
@app.function(volumes={"/data": vol})
def process(file):
    # Process the file
    vol.commit()

# In parallel
results = list(process.map(files))
```

```python
@app.function(
    schedule=modal.Cron("0 6 * * *"),
    secrets=[modal.Secret.from_name("db")]
)
def daily_etl():
    extract()
    transform()
    load()
```
| Task | Code |
|---|---|
| Create app | app = modal.App("name") |
| Basic function | @app.function() |
| With GPU | @app.function(gpu="A100") |
| With image | @app.function(image=img) |
| Web endpoint | @modal.asgi_app() |
| Scheduled | schedule=modal.Cron("...") |
| Mount volume | volumes={"/path": vol} |
| Use secret | secrets=[modal.Secret.from_name("x")] |
| Parallel map | func.map(items) |
| Async spawn | func.spawn(arg) |
| Class pattern | @app.cls() with @modal.enter() |
Weekly installs: 64
Repository: josiahsiegel/claude-plugin-marketplace
GitHub stars: 21
First seen: Jan 24, 2026
Security audits: Gen Agent Trust Hub: Pass, Socket: Pass, Snyk: Fail
Installed on: claude-code (51), gemini-cli (50), opencode (50), codex (47), cursor (46), github-copilot (42)