toon-format by aradotso/trending-skills
npx skills add https://github.com/aradotso/trending-skills --skill toon-format
Skill by ara.so — Daily 2026 Skills collection.
TOON is a compact, human-readable encoding of the JSON data model that minimizes tokens for LLM input. It combines YAML-style indentation for nested objects with CSV-style tabular layout for uniform arrays, achieving ~40% token reduction while maintaining or improving LLM comprehension accuracy.
# npm
npm install @toon-format/toon
# pnpm
pnpm add @toon-format/toon
# yarn
yarn add @toon-format/toon
# Install globally
npm install -g @toon-format/toon
# Convert JSON file to TOON
toon encode input.json
toon encode input.json -o output.toon
# Convert TOON back to JSON
toon decode input.toon
toon decode input.toon -o output.json
# Pipe support
cat data.json | toon encode
cat data.toon | toon decode
# Pretty-print JSON output
toon decode input.toon --pretty
# Show token count comparison
toon encode input.json --stats
import { encode, decode } from '@toon-format/toon';
// Basic encoding (JSON → TOON string)
const data = {
  context: {
    task: 'Our favorite hikes together',
    location: 'Boulder',
    season: 'spring_2025',
  },
  friends: ['ana', 'luis', 'sam'],
  hikes: [
    { id: 1, name: 'Blue Lake Trail', distanceKm: 7.5, elevationGain: 320, companion: 'ana', wasSunny: true },
    { id: 2, name: 'Ridge Overlook', distanceKm: 9.2, elevationGain: 540, companion: 'luis', wasSunny: false },
    { id: 3, name: 'Wildflower Loop', distanceKm: 5.1, elevationGain: 180, companion: 'sam', wasSunny: true },
  ],
};
const toon = encode(data);
console.log(toon);
// context:
//   task: Our favorite hikes together
//   location: Boulder
//   season: spring_2025
// friends[3]: ana,luis,sam
// hikes[3]{id,name,distanceKm,elevationGain,companion,wasSunny}:
//   1,Blue Lake Trail,7.5,320,ana,true
//   2,Ridge Overlook,9.2,540,luis,false
//   3,Wildflower Loop,5.1,180,sam,true
import { decode } from '@toon-format/toon';
const toonString = `
context:
  task: Our favorite hikes together
  location: Boulder
friends[2]: ana,luis
hikes[2]{id,name,distanceKm}:
  1,Blue Lake Trail,7.5
  2,Ridge Overlook,9.2
`;
const parsed = decode(toonString);
// Returns the original JavaScript object
console.log(parsed.hikes[0].name); // 'Blue Lake Trail'
import { encode } from '@toon-format/toon';
const toon = encode(data, {
  // Spaces per indentation level (default: 2)
  indent: 2,
  // Delimiter for array values and tabular rows: ',' (default), '\t', or '|'
  delimiter: ',',
});
TOON encodes scalars the same way as YAML — unquoted when unambiguous:
name: Alice
age: 30
active: true
score: 98.6
nothing: null
user:
  name: Alice
  address:
    city: Boulder
    zip: 80301
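The indentation rule can be illustrated with a small hand-rolled sketch. This is not the library's implementation: `encodeNested` is a hypothetical helper that handles only plain objects with scalar values and ignores arrays and quoting.

```typescript
type Scalar = string | number | boolean | null;
interface Nested {
  [key: string]: Scalar | Nested;
}

// Hypothetical helper: emit one "key: value" line per scalar and indent
// the children of nested objects by two spaces per level.
function encodeNested(obj: Nested, depth = 0): string[] {
  const pad = '  '.repeat(depth);
  const lines: string[] = [];
  for (const [key, value] of Object.entries(obj)) {
    if (value !== null && typeof value === 'object') {
      lines.push(`${pad}${key}:`);
      lines.push(...encodeNested(value, depth + 1));
    } else {
      lines.push(`${pad}${key}: ${String(value)}`);
    }
  }
  return lines;
}

const nestedLines = encodeNested({
  user: { name: 'Alice', address: { city: 'Boulder', zip: 80301 } },
});
console.log(nestedLines.join('\n'));
```

For real data, prefer the package's `encode()`, which also covers arrays and quoting.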
Square brackets declare the array length; values are comma-separated:
tags[3]: typescript,llm,serialization
scores[4]: 10,20,30,40
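Because the length is declared up front, a reader can check it against the values that follow. The sketch below shows that check for a single inline array line. `parseInlineArray` is a hypothetical helper, not part of the package; it does not handle quoted values and returns every value as a string.

```typescript
// Hypothetical helper: parse a "key[N]: v1,v2,..." line and verify that the
// declared length N matches the number of comma-separated values.
function parseInlineArray(line: string): { key: string; values: string[] } {
  const match = /^(\w+)\[(\d+)\]:\s*(.*)$/.exec(line.trim());
  if (!match) {
    throw new Error(`Not an inline array line: ${line}`);
  }
  const key = match[1];
  const declared = Number(match[2]);
  const values = match[3] === '' ? [] : match[3].split(',');
  if (values.length !== declared) {
    throw new Error(`${key}: declared ${declared} values but found ${values.length}`);
  }
  return { key, values };
}

const tags = parseInlineArray('tags[3]: typescript,llm,serialization');
console.log(tags.key, tags.values);
```

A line such as `scores[4]: 10,20,30` would make this helper throw, since only three values follow a declared length of four.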
Curly braces declare the field headers; each subsequent indented line is a row:
employees[3]{id,name,department,salary}:
  1,Alice,Engineering,95000
  2,Bob,Marketing,72000
  3,Carol,Engineering,102000
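A rough sketch of how a uniform array of flat objects maps onto this layout follows. `toTabular` is a hypothetical helper written for illustration only: it takes the field order from the first row, performs no quoting, and assumes every row has the same keys.

```typescript
type Cell = string | number | boolean | null;

// Hypothetical helper: render a uniform array of flat objects as a header
// line "key[N]{fields}:" followed by one indented, comma-separated row each.
function toTabular(key: string, rows: Array<Record<string, Cell>>): string {
  if (rows.length === 0) {
    // Assumed form for an empty array.
    return `${key}[0]:`;
  }
  const fields = Object.keys(rows[0]);
  const header = `${key}[${rows.length}]{${fields.join(',')}}:`;
  const body = rows.map((row) => '  ' + fields.map((f) => String(row[f])).join(','));
  return [header, ...body].join('\n');
}

const table = toTabular('employees', [
  { id: 1, name: 'Alice', department: 'Engineering', salary: 95000 },
  { id: 2, name: 'Bob', department: 'Marketing', salary: 72000 },
]);
console.log(table);
```

The field names appear once in the header instead of once per object, which is where most of the savings over JSON come from.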
Values containing commas, colons, or newlines are quoted:
notes[2]: "hello, world","line1\nline2"
messages[1]{from,text}:
  alice,"See you at 3:00, okay?"
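The quoting rule as stated can be sketched in a few lines. `quoteIfNeeded` is a hypothetical helper that implements only the three triggers named above (comma, colon, newline); the package's encoder may quote in additional cases.

```typescript
// Hypothetical helper: quote a string only if it contains a comma, colon,
// or newline, escaping backslashes, double quotes, and newlines inside it.
function quoteIfNeeded(value: string): string {
  if (!/[,:\n]/.test(value)) {
    return value;
  }
  const escaped = value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n');
  return `"${escaped}"`;
}

const quotedNotes = ['hello, world', 'line1\nline2', 'plain'].map(quoteIfNeeded);
console.log(`notes[3]: ${quotedNotes.join(',')}`);
```

Values without any of the trigger characters pass through unchanged, so typical identifiers and names carry no extra tokens.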
company:
  name: Acme Corp
  founded: 1987
  offices[2]: NYC,SF
  teams[2]{name,headcount}:
    Engineering,45
    Marketing,20
import { encode } from '@toon-format/toon';
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
async function queryWithToon(data: unknown, question: string) {
  const toon = encode(data);
  const response = await client.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [
      {
        role: 'system',
        content: [
          'You are a data analyst. The user will provide data in TOON format.',
          'TOON is a compact encoding of JSON: indentation = nesting,',
          'key[N]: v1,v2 = array of N scalars,',
          'key[N]{f1,f2}: rows = array of N objects with fields f1, f2.',
        ].join(' '),
      },
      {
        role: 'user',
        content: `Data:\n\`\`\`\n${toon}\n\`\`\`\n\nQuestion: ${question}`,
      },
    ],
  });
  return response.choices[0].message.content;
}
// Usage
const employees = [
  { id: 1, name: 'Alice', dept: 'Eng', salary: 95000 },
  { id: 2, name: 'Bob', dept: 'Marketing', salary: 72000 },
];
const answer = await queryWithToon(
  { employees },
  'Who has the highest salary?'
);
import { encode } from '@toon-format/toon';
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
async function analyzeWithClaude(data: unknown, prompt: string) {
  const toon = encode(data);
  const message = await client.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 1024,
    system:
      'Data is in TOON format: indented = nested objects, key[N]: vals = scalar array, key[N]{fields}: rows = object array.',
    messages: [
      {
        role: 'user',
        content: `\`\`\`toon\n${toon}\n\`\`\`\n\n${prompt}`,
      },
    ],
  });
  return message.content[0].type === 'text' ? message.content[0].text : null;
}
import { encode } from '@toon-format/toon';
import { encode as gptEncode } from 'gpt-tokenizer';
function compareTokens(data: unknown) {
  const jsonStr = JSON.stringify(data);
  const toonStr = encode(data);
  const jsonTokens = gptEncode(jsonStr).length;
  const toonTokens = gptEncode(toonStr).length;
  const savings = (((jsonTokens - toonTokens) / jsonTokens) * 100).toFixed(1);
  console.log(`JSON: ${jsonTokens} tokens`);
  console.log(`TOON: ${toonTokens} tokens`);
  console.log(`Saved: ${savings}%`);
  return { jsonTokens, toonTokens, savings: parseFloat(savings) };
}
import { encode } from '@toon-format/toon';
// Encode each record separately for independent LLM calls
function encodeRecords<T>(records: T[]): string[] {
  return records.map((r) => encode(r));
}
// Encode all records as one TOON document (most efficient for bulk)
function encodeAll<T>(records: T[], key = 'records'): string {
  return encode({ [key]: records });
}
import { encode } from '@toon-format/toon';
interface SearchResult {
  id: string;
  title: string;
  snippet: string;
  score: number;
  url: string;
}
function buildRagContext(results: SearchResult[]): string {
  // TOON fits well here: uniform objects collapse into a compact table
  return encode({ results });
}
// Output:
// results[5]{id,title,snippet,score,url}:
//   doc1,Introduction to TOON,...,0.95,https://...
//   doc2,TOON vs JSON,...,0.87,https://...
import { encode } from '@toon-format/toon';
import { readFile, writeFile } from 'fs/promises';
// Whole-file conversion: read → parse → encode → write (the file is loaded into memory)
async function convertFile(inputPath: string, outputPath: string) {
  const raw = await readFile(inputPath, 'utf-8');
  const data = JSON.parse(raw);
  const toon = encode(data);
  await writeFile(outputPath, toon, 'utf-8');
  const jsonBytes = Buffer.byteLength(raw);
  const toonBytes = Buffer.byteLength(toon);
  console.log(`Reduced size by ${(((jsonBytes - toonBytes) / jsonBytes) * 100).toFixed(1)}%`);
}
import { encode, decode } from '@toon-format/toon';
interface Employee {
  id: number;
  name: string;
  department: string;
  salary: number;
  active: boolean;
}
interface EmployeeReport {
  generatedAt: string;
  employees: Employee[];
}
// encode() accepts any JSON-serializable value, including typed objects
const report: EmployeeReport = {
  generatedAt: new Date().toISOString(),
  employees: [
    { id: 1, name: 'Alice', department: 'Engineering', salary: 95000, active: true },
    { id: 2, name: 'Bob', department: 'Marketing', salary: 72000, active: true },
  ],
};
const toon = encode(report);
// Decode back with a type assertion (the shape is not validated at runtime)
const recovered = decode(toon) as EmployeeReport;
console.log(recovered.employees[0].name); // 'Alice'
import express from 'express';
import { encode, decode } from '@toon-format/toon';
const app = express();
// Parse incoming TOON bodies
app.use((req, res, next) => {
  if (req.headers['content-type']?.startsWith('text/toon')) {
    let body = '';
    req.on('data', (chunk) => (body += chunk));
    req.on('end', () => {
      try {
        (req as any).toonBody = decode(body);
        next();
      } catch (e) {
        res.status(400).json({ error: 'Invalid TOON body' });
      }
    });
  } else {
    next();
  }
});
// Respond with TOON when client requests it
app.get('/api/employees', (req, res) => {
  const employees = [
    { id: 1, name: 'Alice', dept: 'Eng' },
    { id: 2, name: 'Bob', dept: 'Marketing' },
  ];
  if (req.headers.accept?.includes('text/toon')) {
    res.setHeader('Content-Type', 'text/toon; charset=utf-8');
    res.send(encode({ employees }));
  } else {
    res.json({ employees });
  }
});
| Scenario | Recommendation |
|---|---|
| Uniform arrays of objects | ✅ TOON (biggest savings) |
| Deeply nested / non-uniform | ⚠️ Benchmark both; JSON-compact may win |
| Pure flat tabular data | Consider CSV (smaller) or TOON (structured) |
| Latency-critical (local models) | Benchmark TTFT + tokens/sec |
| Programmatic API calls | Keep JSON; encode to TOON only for LLM input |
| Semi-uniform (~40–60% tabular) | Benchmark; savings diminish |
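One way to apply the first and last rows of the table is to test whether an array is uniform before choosing a format. The sketch below is an assumption about what "uniform" means here (same key set, scalar values only); `isUniformObjectArray` is a hypothetical helper, not part of the package.

```typescript
// Hypothetical helper: true when every element is a plain object with the
// same set of keys and only scalar values, i.e. a candidate for a table.
function isUniformObjectArray(value: unknown): boolean {
  if (!Array.isArray(value) || value.length === 0) {
    return false;
  }
  const isScalar = (v: unknown): boolean =>
    v === null || ['string', 'number', 'boolean'].indexOf(typeof v) !== -1;
  // Returns a canonical key signature, or null if the item cannot be a row.
  const signature = (item: unknown): string | null => {
    if (item === null || typeof item !== 'object' || Array.isArray(item)) {
      return null;
    }
    const entries = Object.entries(item as Record<string, unknown>);
    if (!entries.every(([, v]) => isScalar(v))) {
      return null;
    }
    return JSON.stringify(entries.map(([k]) => k).sort());
  };
  const first = signature(value[0]);
  return first !== null && value.every((item) => signature(item) === first);
}

const uniform = isUniformObjectArray([
  { id: 1, name: 'Alice' },
  { id: 2, name: 'Bob' },
]);
const mixed = isUniformObjectArray([
  { id: 1, name: 'Alice' },
  { id: 2, tags: ['a', 'b'] },
]);
console.log({ uniform, mixed });
```

A check like this only suggests where to benchmark first; the actual token counts still decide.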
If a value contains a comma, wrap it in double quotes in hand-written TOON; encode() quotes such values automatically:
// encode() automatically quotes values containing commas
const data = { tags: ['hello, world', 'foo,bar'] };
encode(data);
// tags[2]: "hello, world","foo,bar"
TOON uses unquoted values for numbers and booleans. Ensure your data uses proper JS types before encoding — don't pass "95000" (string) when you mean 95000 (number):
// ✅ Correct
{ salary: 95000, active: true }
// ❌ Will decode as string "95000" and string "true"
{ salary: '95000', active: 'true' }
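When the source data arrives as strings (for example from CSV or form input), a small normalization pass before encoding avoids this. The sketch below is one possible approach; `coerceScalars` is a hypothetical helper, and the decision to leave values with leading zeros as strings is an assumption made to protect identifiers such as ZIP codes.

```typescript
type Row = Record<string, unknown>;

// Hypothetical helper: convert strings that are exactly "true"/"false" or a
// plain decimal number (no leading zeros) into booleans and numbers.
// All other values pass through unchanged.
function coerceScalars(row: Row): Row {
  const out: Row = {};
  for (const [key, value] of Object.entries(row)) {
    if (typeof value !== 'string') {
      out[key] = value;
    } else if (value === 'true' || value === 'false') {
      out[key] = value === 'true';
    } else if (/^-?(0|[1-9]\d*)(\.\d+)?$/.test(value)) {
      out[key] = Number(value);
    } else {
      out[key] = value;
    }
  }
  return out;
}

const cleaned = coerceScalars({ name: 'Alice', salary: '95000', active: 'true', zip: '02134' });
console.log(cleaned);
```

Apply a pass like this only to fields that are known to be numeric or boolean; blanket coercion can change the meaning of free-text fields.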
Add a brief TOON format explanation to your system prompt:
TOON format rules:
- Indentation = nested object
- key[N]: v1,v2,v3 = array of N scalar values
- key[N]{field1,field2}: followed by N indented rows = array of objects
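These rules can be kept in one constant and reused across calls. The sketch below builds a provider-neutral message list; `ChatMessage` and `buildToonMessages` are hypothetical names introduced for illustration, and the message shape is an assumption to adapt to whichever SDK is in use.

```typescript
// The three rules from the text, kept as one reusable system prompt string.
const TOON_RULES = [
  'TOON format rules:',
  '- Indentation = nested object',
  '- key[N]: v1,v2,v3 = array of N scalar values',
  '- key[N]{field1,field2}: followed by N indented rows = array of objects',
].join('\n');

interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

// Hypothetical helper: pair the rules with a fenced TOON payload and a question.
function buildToonMessages(toon: string, question: string): ChatMessage[] {
  const fence = '`'.repeat(3);
  return [
    { role: 'system', content: TOON_RULES },
    { role: 'user', content: `${fence}toon\n${toon}\n${fence}\n\n${question}` },
  ];
}

const messages = buildToonMessages('scores[3]: 10,20,30', 'What is the highest score?');
console.log(messages[1].content);
```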
# Verify the global bin directory is on your PATH
npm prefix -g # binaries live in <prefix>/bin on macOS/Linux; `npm bin -g` was removed in npm 9
# Alternatively use npx
npx @toon-format/toon encode input.json
Common mistakes in hand-written TOON:
- items{id,name}: → must be items[2]{id,name}: (the [N] length is required)
- Unquoted values with : as the first character
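A missing or wrong length in a tabular header can be caught mechanically before the text reaches a model. The sketch below is a minimal check under stated assumptions: `lintTabular` is a hypothetical helper, it treats the more deeply indented lines directly after a header as its rows, and it does not validate quoting.

```typescript
// Hypothetical linter: flag tabular headers that omit the [N] length and
// blocks whose row count differs from the declared N.
function lintTabular(source: string): string[] {
  const problems: string[] = [];
  const lines = source.split('\n');
  lines.forEach((line, i) => {
    const header = /^(\s*)(\w+)(\[(\d+)\])?\{[^}]*\}:\s*$/.exec(line);
    if (!header) return;
    const [, indent, key, , count] = header;
    if (count === undefined) {
      problems.push(`line ${i + 1}: ${key} is missing its [N] length`);
      return;
    }
    // Count the following lines that are indented deeper than the header.
    let rows = 0;
    for (let j = i + 1; j < lines.length; j++) {
      const firstChar = lines[j].search(/\S/);
      if (firstChar === -1 || firstChar <= indent.length) break;
      rows++;
    }
    if (rows !== Number(count)) {
      problems.push(`line ${i + 1}: ${key} declares ${count} rows but has ${rows}`);
    }
  });
  return problems;
}

const problems = lintTabular(
  ['items{id,name}:', '  1,a', '  2,b', 'users[3]{id,name}:', '  1,Alice', '  2,Bob'].join('\n'),
);
console.log(problems);
```

The package's `decode()` remains the authoritative check; a helper like this is only useful for producing friendlier messages on hand-written input.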