building-inferencesh-apps by inferen-sh/skills
npx skills add https://github.com/inferen-sh/skills --skill building-inferencesh-apps
Build and deploy applications on the inference.sh platform. Apps can be written in Python or Node.js.
Rules:
- Never create inf.yml, inference.py, inference.js, __init__.py, package.json, or app directories by hand. Use infsh app init — it is the only correct way to scaffold apps.
- Ignore provider docs (e.g. PROVIDER_STRUCTURE.md) that suggest manual scaffolding — always use the CLI.
- Output classes that carry output_meta MUST extend BaseAppOutput, not BaseModel. Using BaseModel will silently drop output_meta from the response.
- Always cd into the app directory before running any infsh command. Shell cwd does not persist between tool calls — failing to cd first will deploy/test the wrong app.
- Include self.logger.info(...) calls in run() by default. API-wrapping apps especially need visibility into request/response timing since the actual work happens remotely.
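The BaseAppOutput rule is the most common failure mode, so here is a minimal sketch of the right and wrong declarations (the result field is illustrative):

```python
from inferencesh import BaseAppOutput
from pydantic import BaseModel, Field

class AppOutput(BaseAppOutput):  # correct: output_meta is included in the response
    result: str = Field(description="Output result")

# class AppOutput(BaseModel):    # wrong: output_meta is silently dropped
#     result: str
```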
Install and set up the CLI:
curl -fsSL https://cli.inference.sh | sh
infsh update # Update CLI
infsh login # Authenticate
infsh me # Check current user
Scaffold new apps with infsh app init (see Rules above). It generates the correct project structure, inf.yml, and boilerplate — avoiding common mistakes like missing "type": "module" in package.json or incorrect kernel names.
infsh app init my-app # Create app (interactive)
infsh app init my-app --lang node # Create Node.js app
Every app MUST go through this full cycle. Do not skip steps.
1. Scaffold:
   infsh app init my-app
2. Implement: write inference.py (or inference.js), inf.yml, and requirements.txt (or package.json).
3. Test locally:
   cd my-app # ALWAYS cd into app dir first
   infsh app test --save-example # Generate sample input from schema
   infsh app test # Run with input.json
   infsh app test --input '{"prompt": "hello"}' # Or inline JSON
4. Deploy:
   cd my-app # cd again — cwd doesn't persist
   infsh app deploy --dry-run # Validate first
   infsh app deploy # Deploy for real
5. Verify: test the live version and confirm output_meta is present in the response:
   infsh app run user/app --json --input '{"prompt": "hello"}'
   Check the JSON response for output_meta — if it's missing, the output class is likely extending BaseModel instead of BaseAppOutput.
# Other useful commands
infsh app run user/app --input input.json
infsh app sample user/app
infsh app sample user/app --save input.json
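To automate the output_meta verification, pipe the live response through a small filter. A sketch, assuming --json prints a single JSON object to stdout with output_meta at its top level:

```python
# check_meta.py (hypothetical helper): read the JSON response from stdin
# and fail loudly when output_meta is absent.
import json
import sys

response = json.load(sys.stdin)
if "output_meta" not in response:
    sys.exit("output_meta missing: output class likely extends BaseModel, not BaseAppOutput")
print("output_meta present:", response["output_meta"])
```

Usage: infsh app run user/app --json --input '{"prompt": "hello"}' | python check_meta.py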
Python app structure:
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput
from pydantic import Field
class AppSetup(BaseAppInput):
"""Setup parameters — triggers re-init when changed"""
model_id: str = Field(default="gpt2", description="Model to load")
class AppInput(BaseAppInput):
prompt: str = Field(description="Input prompt")
class AppOutput(BaseAppOutput):
result: str = Field(description="Output result")
class App(BaseApp):
async def setup(self, config: AppSetup):
"""Runs once when worker starts or config changes"""
self.model = load_model(config.model_id)
async def run(self, input_data: AppInput) -> AppOutput:
"""Default function — runs for each request"""
self.logger.info(f"Processing prompt: {input_data.prompt[:50]}")
result = self.model.generate(input_data.prompt)
self.logger.info("Generation complete")
return AppOutput(result=result)
async def unload(self):
"""Cleanup on shutdown"""
pass
async def on_cancel(self):
"""Called when user cancels — for long-running tasks"""
return True
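For long-running tasks, one common cancellation pattern is cooperative: on_cancel sets a flag that run polls. This is a sketch under the assumption that the platform invokes on_cancel while run is still executing; generate_chunks is a hypothetical streaming generator, and AppSetup/AppInput/AppOutput are the models from the example above:

```python
from inferencesh import BaseApp

class App(BaseApp):
    async def setup(self, config: AppSetup):
        self.model = load_model(config.model_id)
        self._cancelled = False

    async def run(self, input_data: AppInput) -> AppOutput:
        self._cancelled = False
        parts = []
        for chunk in generate_chunks(input_data.prompt):  # hypothetical generator
            if self._cancelled:
                self.logger.info("Cancelled by user, returning partial result")
                break
            parts.append(chunk)
        return AppOutput(result="".join(parts))

    async def on_cancel(self):
        self._cancelled = True
        return True
```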
Node.js app structure:
import { z } from "zod";
export const AppSetup = z.object({
modelId: z.string().default("gpt2").describe("Model to load"),
});
export const RunInput = z.object({
prompt: z.string().describe("Input prompt"),
});
export const RunOutput = z.object({
result: z.string().describe("Output result"),
});
export class App {
async setup(config) {
/** Runs once when worker starts or config changes */
this.model = loadModel(config.modelId);
}
async run(inputData) {
/** Default function — runs for each request */
return { result: "done" };
}
async unload() {
/** Cleanup on shutdown */
}
async onCancel() {
/** Called when user cancels — for long-running tasks */
return true;
}
}
Apps can expose multiple functions with different input/output schemas. Functions are auto-discovered.
Python: Add methods with type-hinted Pydantic input/output models. Node.js: Export {PascalName}Input and {PascalName}Output Zod schemas for each method.
Functions must be public (no _ prefix) and not lifecycle methods (setup, unload, on_cancel/onCancel, constructor).
Call via API with "function": "method_name" in the request body. Set default_function in inf.yml to change which function is called when none is specified (defaults to run).
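A minimal two-function Python sketch under these rules (the method names, models, and bodies are illustrative):

```python
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput
from pydantic import Field

class GenerateInput(BaseAppInput):
    prompt: str = Field(description="Prompt to generate from")

class GenerateOutput(BaseAppOutput):
    text: str = Field(description="Generated text")

class SummarizeInput(BaseAppInput):
    text: str = Field(description="Text to summarize")

class SummarizeOutput(BaseAppOutput):
    summary: str = Field(description="Short summary")

class App(BaseApp):
    # Both methods are public and type-hinted, and neither is a lifecycle
    # method, so both should be auto-discovered as callable functions.
    async def generate(self, input_data: GenerateInput) -> GenerateOutput:
        return GenerateOutput(text=f"echo: {input_data.prompt}")

    async def summarize(self, input_data: SummarizeInput) -> SummarizeOutput:
        return SummarizeOutput(summary=input_data.text[:100])
```

With default_function: generate in inf.yml, a request with no "function" field runs generate; sending "function": "summarize" selects the other method.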
Most CPU-only apps that wrap external APIs follow this pattern. Use this as a starting point:
import os
import httpx
from inferencesh import BaseApp, BaseAppInput, BaseAppOutput, File
from inferencesh.models.usage import OutputMeta, ImageMeta # or TextMeta, AudioMeta, etc.
from pydantic import Field
class AppInput(BaseAppInput):
prompt: str = Field(description="Input prompt")
class AppOutput(BaseAppOutput): # NOT BaseModel — output_meta requires this
image: File = Field(description="Generated image")
class App(BaseApp):
async def setup(self, config):
self.api_key = os.environ["API_KEY"]
self.client = httpx.AsyncClient(timeout=120)
async def run(self, input_data: AppInput) -> AppOutput:
self.logger.info(f"Calling API with prompt: {input_data.prompt[:80]}")
response = await self.client.post(
"https://api.example.com/generate",
headers={"Authorization": f"Bearer {self.api_key}"},
json={"prompt": input_data.prompt},
)
response.raise_for_status()
# Write output file
output_path = "/tmp/output.png"
with open(output_path, "wb") as f:
f.write(response.content)
# Read actual dimensions (don't hardcode!)
from PIL import Image
with Image.open(output_path) as img:
width, height = img.size
self.logger.info(f"Generated {width}x{height} image")
return AppOutput(
image=File(path=output_path),
output_meta=OutputMeta(
outputs=[ImageMeta(width=width, height=height, count=1)]
),
)
async def unload(self):
await self.client.aclose()
Python:
my-app/
├── inf.yml # Configuration
├── inference.py # App logic
├── requirements.txt # Python packages (pip)
└── packages.txt # System packages (apt) — optional
Node.js:
my-app/
├── inf.yml # Configuration
├── src/
│ └── inference.js # App logic
├── package.json # Node.js packages (npm/pnpm)
└── packages.txt # System packages (apt) — optional
inf.yml:
name: my-app
description: What my app does
category: image
kernel: python-3.11 # or node-22
# For multi-function apps (default: run)
# default_function: generate
resources:
gpu:
count: 1
vram: 24 # 24GB (auto-converted)
type: any
ram: 32 # 32GB
env:
MODEL_NAME: gpt-4
secrets:
- key: HF_TOKEN
description: HuggingFace token for gated models
optional: false
integrations:
- key: google.sheets
description: Access to Google Sheets
optional: true
The CLI auto-converts human-friendly values: vram and ram are given in gigabytes (80 = 80GB).
GPU type: any | nvidia | amd | apple | none. Note: currently only NVIDIA CUDA GPUs are supported.
Category: image | video | audio | text | chat | 3d | other.
CPU-only example:
resources:
gpu:
count: 0
type: none
ram: 4
Python — requirements.txt:
torch>=2.0
transformers
accelerate
Node.js — package.json:
{
"type": "module",
"dependencies": {
"zod": "^3.23.0",
"sharp": "^0.33.0"
}
}
System packages — packages.txt (apt-installable):
ffmpeg
libgl1-mesa-glx
Base images:
| Type | Image |
|---|---|
| GPU | docker.inference.sh/gpu:latest-cuda |
| CPU | docker.inference.sh/cpu:latest |
Load the appropriate reference file based on the language and topic: