modal by davila7/claude-code-templates
npx skills add https://github.com/davila7/claude-code-templates --skill modal
Modal is a serverless platform for running Python code in the cloud with minimal configuration. Execute functions on powerful GPUs, scale automatically to thousands of containers, and pay only for compute used.
Modal is particularly suited for AI/ML workloads, high-performance batch processing, scheduled jobs, GPU inference, and serverless APIs. Sign up for free at https://modal.com and receive $30/month in credits.
Use Modal for:
- AI/ML workloads and GPU inference
- High-performance batch processing
- Scheduled and periodic jobs
- Serverless APIs and webhooks
Modal requires authentication via API token.
# Install Modal
uv pip install modal
# Authenticate (opens browser for login)
modal token new
This creates a token stored in ~/.modal.toml. The token authenticates all Modal operations.
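To check non-interactively whether a token is already in place, a small sketch using only the standard library (the helper name is ours; the `~/.modal.toml` location comes from the text above):

```python
from pathlib import Path

def modal_token_configured(path: str = "~/.modal.toml") -> bool:
    """Return True if a Modal token file exists at the given location."""
    return Path(path).expanduser().is_file()

print(modal_token_configured())
```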
import modal
app = modal.App("test-app")
@app.function()
def hello():
    print("Modal is working!")
Run with: modal run script.py
Modal provides serverless Python execution through Functions that run in containers. Define compute requirements, dependencies, and scaling behavior declaratively.
Specify dependencies and environment for functions using Modal Images.
import modal
# Basic image with Python packages
image = (
    modal.Image.debian_slim(python_version="3.12")
    .uv_pip_install("torch", "transformers", "numpy")
)
app = modal.App("ml-app", image=image)
Common patterns:
.uv_pip_install("pandas", "scikit-learn") - add Python packages
.apt_install("ffmpeg", "git") - add system packages
modal.Image.from_registry("nvidia/cuda:12.1.0-base") - start from a registry image
.add_local_python_source("my_module") - include local Python source
See references/images.md for comprehensive image building documentation.
Define functions that run in the cloud with the @app.function() decorator.
@app.function()
def process_data(file_path: str):
    import pandas as pd
    df = pd.read_csv(file_path)
    return df.describe()
Call functions:
# From local entrypoint
@app.local_entrypoint()
def main():
    result = process_data.remote("data.csv")
    print(result)
Run with: modal run script.py
See references/functions.md for function patterns, deployment, and parameter handling.
Attach GPUs to functions for accelerated computation.
@app.function(gpu="H100")
def train_model():
    import torch
    assert torch.cuda.is_available()
    # GPU-accelerated code here
Available GPU types:
T4, L4 - Cost-effective inference
A10, A100, A100-80GB - Standard training/inference
L40S - Excellent cost/performance balance (48GB)
H100, H200 - High-performance training
B200 - Flagship performance (most powerful)
Request multiple GPUs:
@app.function(gpu="H100:8")  # 8x H100 GPUs
def train_large_model():
    pass
See references/gpu.md for GPU selection guidance, CUDA setup, and multi-GPU configuration.
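Since `gpu=` is just a string of the form `TYPE` or `TYPE:COUNT`, a hypothetical helper can build and validate it against the types listed above (`gpu_spec` is not part of Modal's API):

```python
VALID_GPUS = {"T4", "L4", "A10", "A100", "A100-80GB", "L40S", "H100", "H200", "B200"}

def gpu_spec(gpu_type: str, count: int = 1) -> str:
    """Build Modal's gpu= string, e.g. "H100" or "H100:8"."""
    if gpu_type not in VALID_GPUS:
        raise ValueError(f"unknown GPU type: {gpu_type}")
    return gpu_type if count == 1 else f"{gpu_type}:{count}"

print(gpu_spec("H100", 8))  # H100:8
```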
Request CPU cores, memory, and disk for functions.
@app.function(
    cpu=8.0,              # 8 physical cores
    memory=32768,         # 32 GiB RAM
    ephemeral_disk=10240  # 10 GiB disk
)
def memory_intensive_task():
    pass
Default allocation: 0.125 CPU cores, 128 MiB memory. Billing is based on the reservation or actual usage, whichever is higher.
See references/resources.md for resource limits and billing details.
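The units above are worth spelling out: `memory=` is in MiB, and billing takes the higher of reservation and usage. A quick sketch of that arithmetic (helper names are ours):

```python
MIB_PER_GIB = 1024

def mib(gib: float) -> int:
    """Convert GiB to the MiB units expected by memory= and ephemeral_disk=."""
    return int(gib * MIB_PER_GIB)

def billed(reserved: float, used: float) -> float:
    """Billing is based on the higher of reservation and actual usage."""
    return max(reserved, used)

print(mib(32))           # 32768, matching memory=32768 above
print(billed(8.0, 2.5))  # 8.0: the reservation dominates
```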
Modal autoscales functions from zero to thousands of containers based on demand.
Process inputs in parallel:
@app.function()
def analyze_sample(sample_id: int):
    # Process single sample
    return result

@app.local_entrypoint()
def main():
    sample_ids = range(1000)
    # Automatically parallelized across containers
    results = list(analyze_sample.map(sample_ids))
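`.map()` behaves like an ordered parallel map over inputs. A local stand-in using only the standard library conveys the semantics (the squaring workload is invented for illustration; on Modal each call would run in its own container):

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_sample(sample_id: int) -> int:
    # Stand-in for real per-sample work
    return sample_id * sample_id

# Fan out over inputs; results come back in input order, as with Function.map()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(analyze_sample, range(100)))

print(results[:4])  # [0, 1, 4, 9]
```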
Configure autoscaling:
@app.function(
    max_containers=100,   # Upper limit
    min_containers=2,     # Keep warm
    buffer_containers=5   # Idle buffer for bursts
)
def inference():
    pass
See references/scaling.md for autoscaling configuration, concurrency, and scaling limits.
Use Volumes for persistent storage across function invocations.
volume = modal.Volume.from_name("my-data", create_if_missing=True)

@app.function(volumes={"/data": volume})
def save_results(data):
    with open("/data/results.txt", "w") as f:
        f.write(data)
    volume.commit()  # Persist changes
Volumes persist data between runs, store model weights, cache datasets, and share data between functions.
See references/volumes.md for volume management, commits, and caching patterns.
Store API keys and credentials securely using Modal Secrets.
@app.function(secrets=[modal.Secret.from_name("huggingface")])
def download_model():
    import os
    token = os.environ["HF_TOKEN"]
    # Use token for authentication
Create secrets in the Modal dashboard or via the CLI:
modal secret create my-secret KEY=value API_TOKEN=xyz
See references/secrets.md for secret management and authentication patterns.
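Inside the function, secrets surface as plain environment variables. A defensive access pattern, assuming nothing beyond the standard library (`require_env` and the simulated injection are illustrative, not part of Modal):

```python
import os

def require_env(name: str) -> str:
    """Fail fast with a clear message if an expected credential is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"{name} is not set; attach the Secret that defines it")
    return value

os.environ["API_TOKEN"] = "xyz"  # simulate Modal's injection for this demo
print(require_env("API_TOKEN"))  # xyz
```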
Serve HTTP endpoints, APIs, and webhooks with @modal.web_endpoint().
@app.function()
@modal.web_endpoint(method="POST")
def predict(data: dict):
    # Process the request; `model` is assumed to be loaded elsewhere
    result = model.predict(data["input"])
    return {"prediction": result}
Deploy with:
modal deploy script.py
Modal provides an HTTPS URL for the endpoint.
See references/web-endpoints.md for FastAPI integration, streaming, authentication, and WebSocket support.
Run functions on a schedule with cron expressions.
@app.function(schedule=modal.Cron("0 2 * * *"))  # Daily at 2 AM
def daily_backup():
    # Backup data
    pass

@app.function(schedule=modal.Period(hours=4))  # Every 4 hours
def refresh_cache():
    # Update cache
    pass
Scheduled functions run automatically without manual invocation.
See references/scheduled-jobs.md for cron syntax, timezone configuration, and monitoring.
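Cron expressions use five fields: minute, hour, day-of-month, month, day-of-week. A small parser sketch for sanity-checking an expression (`describe_cron` is our helper, not part of Modal):

```python
def describe_cron(expr: str) -> dict:
    """Split a 5-field cron expression into named fields."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day-of-month month day-of-week")
    names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    return dict(zip(names, fields))

print(describe_cron("0 2 * * *"))  # minute "0", hour "2": daily at 02:00
```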
import modal

# Define dependencies
image = modal.Image.debian_slim().uv_pip_install("torch", "transformers")
app = modal.App("llm-inference", image=image)

# Download model weights ahead of serving (invoke once to warm the cache)
@app.function()
def download_model():
    from transformers import AutoModel
    AutoModel.from_pretrained("bert-base-uncased")

# Serve model
@app.cls(gpu="L40S")
class Model:
    @modal.enter()
    def load_model(self):
        from transformers import pipeline
        self.pipe = pipeline("text-classification", device="cuda")

    @modal.method()
    def predict(self, text: str):
        return self.pipe(text)

@app.local_entrypoint()
def main():
    model = Model()
    result = model.predict.remote("Modal is great!")
    print(result)
@app.function(cpu=2.0, memory=4096)
def process_file(file_path: str):
    import pandas as pd
    df = pd.read_csv(file_path)
    # Process data
    return df.shape[0]

@app.local_entrypoint()
def main():
    files = ["file1.csv", "file2.csv", ...]  # 1000s of files
    # Automatically parallelized across containers
    for count in process_file.map(files):
        print(f"Processed {count} rows")
@app.function(
    gpu="A100:2",  # 2x A100 GPUs
    timeout=3600   # 1 hour timeout
)
def train_model(config: dict):
    import torch
    # Multi-GPU training code
    model = create_model(config)
    train(model)
    return metrics
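`timeout=` values like the 3600 above are in seconds; a tiny hypothetical helper avoids the magic number:

```python
from datetime import timedelta

def timeout_seconds(**kwargs) -> int:
    """Express a timeout readably, e.g. timeout_seconds(hours=1) -> 3600."""
    return int(timedelta(**kwargs).total_seconds())

print(timeout_seconds(hours=1))    # 3600
print(timeout_seconds(minutes=90)) # 5400
```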
Detailed documentation for specific features:
references/getting-started.md - Authentication, setup, basic concepts
references/images.md - Image building, dependencies, Dockerfiles
references/functions.md - Function patterns, deployment, parameters
references/gpu.md - GPU types, CUDA, multi-GPU configuration
references/resources.md - CPU, memory, disk management
references/scaling.md - Autoscaling, parallel execution, concurrency
references/volumes.md - Persistent storage, data management
references/secrets.md - Environment variables, authentication
references/web-endpoints.md - APIs, webhooks, endpoints
references/scheduled-jobs.md - Cron jobs, periodic tasks
references/examples.md - Common patterns for scientific computing

Best practices:
Pin dependencies in .uv_pip_install() for reproducible builds
Set max_containers and min_containers based on workload
Use .map() for parallel processing instead of sequential loops

Troubleshooting:
"Module not found" errors: add the package to the image with .uv_pip_install("package-name")
GPU not detected: request a GPU with @app.function(gpu="A100") and check torch.cuda.is_available()
Function timeout: raise the limit with @app.function(timeout=3600)
Volume changes not persisting: call volume.commit() after writing files

For additional help, see the Modal documentation at https://modal.com/docs or join the Modal Slack community.
Weekly Installs: 137
Repository: https://github.com/davila7/claude-code-templates
GitHub Stars: 22.6K
First Seen: Jan 21, 2026
Security Audits: Gen Agent Trust Hub: Fail; Socket: Pass; Snyk: Warn
Installed on: claude-code (115), opencode (107), cursor (99), gemini-cli (98), antigravity (92), codex (90)