site-architecture by alinaqi/claude-bootstrap
npx skills add https://github.com/alinaqi/claude-bootstrap --skill site-architecture
Load with: base.md + web-content.md
For technical website structure that enables discovery by search engines AND AI crawlers (GPTBot, ClaudeBot, PerplexityBot).
Content is king. Architecture is the kingdom.
Great content buried in poor architecture won't be discovered. This skill covers the technical foundation that makes your content findable by search engines and AI crawlers.
# robots.txt
# Allow all crawlers by default
User-agent: *
Allow: /
Disallow: /api/
Disallow: /admin/
Disallow: /private/
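# Caution: /_next/ serves the JS and CSS that Next.js pages need to render.
# Blocking it can keep Googlebot from rendering pages; enable only if you are sure.
# Disallow: /_next/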
Disallow: /cdn-cgi/
# Sitemap location
Sitemap: https://yoursite.com/sitemap.xml
# Crawl delay (optional - be careful, not all bots respect this)
# Crawl-delay: 1
# robots.txt with AI bot rules
# === SEARCH ENGINES ===
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Allow: /
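# === AI ASSISTANTS (Allow for discovery) ===
# Note: a crawler obeys only the most specific group that names it, so the bots
# listed here skip the Disallow rules in the default group below. Repeat any
# Disallow lines in these groups if they should apply to these bots too.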
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: Claude-Web
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Amazonbot
Allow: /
User-agent: anthropic-ai
Allow: /
User-agent: Google-Extended
Allow: /
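# === BLOCK AI TRAINING (Optional - block training, allow chat) ===
# Uncomment these if you want to be cited but not used for training.
# User-initiated fetchers such as ChatGPT-User stay allowed above.
# User-agent: CCBot
# Disallow: /
# GPTBot is OpenAI's training crawler. If you uncomment this rule,
# remove the GPTBot "Allow" group above so the two do not conflict.
# User-agent: GPTBot
# Disallow: /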
# === BLOCK SCRAPERS ===
User-agent: AhrefsBot
Disallow: /
User-agent: SemrushBot
Disallow: /
User-agent: MJ12bot
Disallow: /
# === DEFAULT ===
User-agent: *
Allow: /
Disallow: /api/
Disallow: /admin/
Disallow: /auth/
Disallow: /private/
Disallow: /*.json$
Disallow: /*?*
Sitemap: https://yoursite.com/sitemap.xml
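The robots.txt above can also be generated from data, which makes it easier to keep the shared `Disallow` paths in every group. The following is a minimal sketch, not part of the skill itself: `RobotsGroup`, `SHARED_DISALLOW`, and `buildRobotsTxt` are hypothetical names, and the path list is an assumption taken from the default group above. It relies on the fact that a crawler follows only the most specific group that names it, so paths that should be off limits to everyone are repeated per group.

```typescript
// Hypothetical helper types and names, for illustration only.
interface RobotsGroup {
  userAgents: string[];
  allow?: string[];
  disallow?: string[];
}

// Paths no crawler should fetch; repeated in every "allow" group on purpose,
// because a bot that matches a named group ignores the "*" group entirely.
const SHARED_DISALLOW = ['/api/', '/admin/', '/auth/', '/private/'];

function buildRobotsTxt(groups: RobotsGroup[], sitemapUrl: string): string {
  const blocks = groups.map((group) => {
    // Several User-agent lines may share one set of rules.
    const lines = group.userAgents.map((ua) => `User-agent: ${ua}`);
    for (const path of group.allow ?? []) lines.push(`Allow: ${path}`);
    for (const path of group.disallow ?? []) lines.push(`Disallow: ${path}`);
    return lines.join('\n');
  });
  return [...blocks, `Sitemap: ${sitemapUrl}`].join('\n\n') + '\n';
}

const robotsTxt = buildRobotsTxt(
  [
    { userAgents: ['GPTBot', 'ClaudeBot', 'PerplexityBot'], allow: ['/'], disallow: SHARED_DISALLOW },
    { userAgents: ['AhrefsBot', 'SemrushBot', 'MJ12bot'], disallow: ['/'] },
    { userAgents: ['*'], allow: ['/'], disallow: SHARED_DISALLOW },
  ],
  'https://yoursite.com/sitemap.xml',
);

console.log(robotsTxt);
```

Grouping several user agents under one rule set keeps the file short and makes it harder to forget a path for one bot.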
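// app/robots.ts
import type { MetadataRoute } from 'next';

// A crawler follows only the most specific group that names it, so the shared
// disallow list is repeated for the AI bots instead of relying on '*'.
// '/_next/' is left crawlable because it serves the JS and CSS pages need to render.
const disallow = ['/api/', '/admin/', '/private/'];

export default function robots(): MetadataRoute.Robots {
  const baseUrl = process.env.NEXT_PUBLIC_URL || 'https://yoursite.com';
  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        disallow,
      },
      {
        // One group covering the AI crawlers
        userAgent: ['GPTBot', 'ClaudeBot', 'PerplexityBot'],
        allow: '/',
        disallow,
      },
    ],
    sitemap: `${baseUrl}/sitemap.xml`,
  };
}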
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url>
<loc>https://yoursite.com/</loc>
<lastmod>2025-01-15</lastmod>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://yoursite.com/pricing</loc>
<lastmod>2025-01-10</lastmod>
<changefreq>monthly</changefreq>
<priority>0.9</priority>
</url>
<url>
<loc>https://yoursite.com/blog/article-slug</loc>
<lastmod>2025-01-12</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
<image:image>
<image:loc>https://yoursite.com/images/article-image.jpg</image:loc>
</image:image>
</url>
</urlset>
// app/sitemap.ts
import { MetadataRoute } from 'next';
export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
const baseUrl = process.env.NEXT_PUBLIC_URL || 'https://yoursite.com';
// Static pages
const staticPages = [
{ url: '/', priority: 1.0, changeFrequency: 'weekly' as const },
{ url: '/pricing', priority: 0.9, changeFrequency: 'monthly' as const },
{ url: '/about', priority: 0.8, changeFrequency: 'monthly' as const },
{ url: '/contact', priority: 0.7, changeFrequency: 'yearly' as const },
];
// Dynamic pages (e.g., blog posts)
const posts = await getBlogPosts(); // Your data fetching function
const blogPages = posts.map((post) => ({
url: `/blog/${post.slug}`,
lastModified: new Date(post.updatedAt),
changeFrequency: 'monthly' as const,
priority: 0.8,
}));
return [
...staticPages.map((page) => ({
url: `${baseUrl}${page.url}`,
lastModified: new Date(),
changeFrequency: page.changeFrequency,
priority: page.priority,
})),
...blogPages.map((page) => ({
url: `${baseUrl}${page.url}`,
lastModified: page.lastModified,
changeFrequency: page.changeFrequency,
priority: page.priority,
})),
];
}
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://yoursite.com/sitemap-pages.xml</loc>
<lastmod>2025-01-15</lastmod>
</sitemap>
<sitemap>
<loc>https://yoursite.com/sitemap-blog.xml</loc>
<lastmod>2025-01-14</lastmod>
</sitemap>
<sitemap>
<loc>https://yoursite.com/sitemap-products.xml</loc>
<lastmod>2025-01-13</lastmod>
</sitemap>
</sitemapindex>
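A sitemap index like the one above becomes necessary once a single file would exceed the sitemap protocol's limit of 50,000 URLs. The sketch below shows one way to split a URL list and emit the index. The file naming scheme (`sitemap-1.xml`, `sitemap-2.xml`, ...) and all function names are assumptions made for illustration, not something the skill prescribes.

```typescript
const MAX_URLS_PER_SITEMAP = 50_000; // per-file limit in the sitemaps.org protocol

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

function chunk<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) chunks.push(items.slice(i, i + size));
  return chunks;
}

// Returns the index XML; each chunk of URLs would be written to its own
// sitemap-N.xml file (hypothetical naming) by the caller.
function buildSitemapIndex(
  baseUrl: string,
  urls: string[],
  lastmod: string,
  perFile: number = MAX_URLS_PER_SITEMAP,
): string {
  const entries = chunk(urls, perFile).map((_, i) =>
    [
      '  <sitemap>',
      `    <loc>${escapeXml(`${baseUrl}/sitemap-${i + 1}.xml`)}</loc>`,
      `    <lastmod>${lastmod}</lastmod>`,
      '  </sitemap>',
    ].join('\n'),
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
  ].join('\n');
}

// Demo with a tiny per-file limit so the split is visible.
const demoUrls = Array.from({ length: 5 }, (_, i) => `https://yoursite.com/blog/post-${i + 1}`);
const indexXml = buildSitemapIndex('https://yoursite.com', demoUrls, '2025-01-15', 2);
console.log(indexXml);
```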
<head>
<!-- Basic -->
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Page Title | Brand Name</title>
<meta name="description" content="Compelling 150-160 character description with keywords and CTA.">
<!-- Canonical (prevent duplicate content) -->
<link rel="canonical" href="https://yoursite.com/current-page">
<!-- Language: set lang on the root element (<html lang="en">), which wraps <head>; it does not belong inside it -->
<meta name="language" content="English">
<!-- Robots -->
<meta name="robots" content="index, follow">
<meta name="googlebot" content="index, follow">
<!-- Author -->
<meta name="author" content="Author Name">
<!-- Favicon -->
<link rel="icon" href="/favicon.ico" sizes="any">
<link rel="icon" href="/icon.svg" type="image/svg+xml">
<link rel="apple-touch-icon" href="/apple-touch-icon.png">
<link rel="manifest" href="/manifest.webmanifest">
</head>
<!-- Open Graph / Facebook -->
<meta property="og:type" content="website">
<meta property="og:url" content="https://yoursite.com/page">
<meta property="og:title" content="Page Title - Brand">
<meta property="og:description" content="Description for social sharing (can be longer).">
<meta property="og:image" content="https://yoursite.com/og-image.jpg">
<meta property="og:image:width" content="1200">
<meta property="og:image:height" content="630">
<meta property="og:site_name" content="Brand Name">
<meta property="og:locale" content="en_US">
<!-- Article-specific (for blog posts) -->
<meta property="og:type" content="article">
<meta property="article:published_time" content="2025-01-15T08:00:00Z">
<meta property="article:modified_time" content="2025-01-20T10:00:00Z">
<meta property="article:author" content="https://yoursite.com/team/author">
<meta property="article:section" content="Technology">
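<!-- One article:tag element per tag -->
<meta property="article:tag" content="AI">
<meta property="article:tag" content="SEO">
<meta property="article:tag" content="Content">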
<!-- Twitter -->
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:site" content="@yourbrand">
<meta name="twitter:creator" content="@authorhandle">
<meta name="twitter:title" content="Page Title">
<meta name="twitter:description" content="Description for Twitter (max 200 chars).">
<meta name="twitter:image" content="https://yoursite.com/twitter-image.jpg">
// app/layout.tsx
import { Metadata } from 'next';
export const metadata: Metadata = {
metadataBase: new URL('https://yoursite.com'),
title: {
default: 'Brand Name',
template: '%s | Brand Name',
},
description: 'Your default site description.',
keywords: ['keyword1', 'keyword2', 'keyword3'],
authors: [{ name: 'Brand Name', url: 'https://yoursite.com' }],
creator: 'Brand Name',
publisher: 'Brand Name',
robots: {
index: true,
follow: true,
googleBot: {
index: true,
follow: true,
'max-video-preview': -1,
'max-image-preview': 'large',
'max-snippet': -1,
},
},
openGraph: {
type: 'website',
locale: 'en_US',
url: 'https://yoursite.com',
siteName: 'Brand Name',
title: 'Brand Name',
description: 'Your site description.',
images: [
{
url: '/og-image.jpg',
width: 1200,
height: 630,
alt: 'Brand Name',
},
],
},
twitter: {
card: 'summary_large_image',
site: '@yourbrand',
creator: '@yourbrand',
},
verification: {
google: 'google-verification-code',
yandex: 'yandex-verification-code',
},
};
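// app/blog/[slug]/page.tsx
import type { Metadata } from 'next';

// getPost is your own data-fetching function.
// In Next.js 15+ `params` is a Promise: type it as Promise<{ slug: string }>
// and await it before reading `slug`.
export async function generateMetadata(
  { params }: { params: { slug: string } },
): Promise<Metadata> {
  const post = await getPost(params.slug);
  return {
    title: post.title,
    description: post.excerpt,
    openGraph: {
      title: post.title,
      description: post.excerpt,
      type: 'article',
      publishedTime: post.publishedAt,
      modifiedTime: post.updatedAt,
      authors: [post.author.name],
      images: [post.coverImage],
    },
  };
}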
✅ GOOD URLs:
/blog/ai-seo-best-practices
/products/pro-plan
/pricing
/about/team
❌ BAD URLs:
/blog?id=123
/p/12345
/index.php?page=about
/Products/Pro_Plan (inconsistent casing)
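The contrast between the good and bad URLs above can be turned into two small checks: one that derives a clean slug from a title, and one that flags the problems visible in the bad examples. This is a rough sketch; `slugify` and `urlPathProblems` are hypothetical helpers, and the set of checks is deliberately limited to what the examples show.

```typescript
// Turn a human-readable title into a lowercase, hyphenated slug.
function slugify(title: string): string {
  return title
    .normalize('NFKD')                 // split accented letters into base letter + mark
    .replace(/[\u0300-\u036f]/g, '')   // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')       // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, '');          // trim leading and trailing hyphens
}

// List the problems a URL path shares with the "bad" examples.
function urlPathProblems(path: string): string[] {
  const problems: string[] = [];
  const [pathname = ''] = path.split('?');
  if (path.includes('?')) problems.push('query parameters used for content');
  if (pathname !== pathname.toLowerCase()) problems.push('uppercase characters');
  if (pathname.includes('_')) problems.push('underscores instead of hyphens');
  return problems;
}

console.log(slugify('AI SEO: Best Practices!'));
console.log(urlPathProblems('/Products/Pro_Plan'));
console.log(urlPathProblems('/blog/ai-seo-best-practices'));
```

A check like this fits naturally in a content pipeline or a CI step that scans newly added routes.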
| Rule | Example |
|---|---|
| Lowercase only | /blog/my-post not /Blog/My-Post |
| Hyphens not underscores | /my-page not /my_page |
| No trailing slashes | /about not /about/ |
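| Descriptive slugs | /pricing not /p |
| No query params for content | /blog/post-title not /blog?id=123 |
| Max 3-4 levels deep | /blog/category/post |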
// next.config.js
module.exports = {
async redirects() {
return [
// Redirect old URLs to new
{
source: '/old-page',
destination: '/new-page',
permanent: true, // 301 redirect
},
// Redirect with wildcard
{
source: '/blog/old/:slug',
destination: '/articles/:slug',
permanent: true,
},
// Trailing slash redirect
{
source: '/:path+/',
destination: '/:path+',
permanent: true,
},
];
},
};
<!-- Always include canonical, even for primary URL -->
<link rel="canonical" href="https://yoursite.com/current-page">
✅ USE CANONICAL:
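- Every page (even if only one version exists)
- Paginated content (give each page its own self-referencing canonical; Google no longer uses rel=prev/next and advises against pointing every page at page 1)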
- URL parameters that don't change content (?utm_source=...)
- HTTP vs HTTPS (canonical to HTTPS)
- www vs non-www (pick one, canonical to it)
Example: /products?sort=price should canonical to /products
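The normalization rules above can be sketched as a single function built on the standard `URL` class. Treat it as an illustration under stated assumptions: `canonicalUrl` is a hypothetical helper, the parameter lists are examples rather than a complete inventory, and the canonical host is assumed to be the non-www `yoursite.com`.

```typescript
// Tracking parameters that never change page content (illustrative, not exhaustive).
const TRACKING_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'utm_term', 'utm_content', 'gclid', 'fbclid'];
// Parameters assumed to only reorder the same content.
const PRESENTATION_PARAMS = ['sort', 'order'];

function canonicalUrl(rawUrl: string, canonicalHost: string = 'yoursite.com'): string {
  const url = new URL(rawUrl);
  url.protocol = 'https:';        // HTTP and HTTPS collapse onto HTTPS
  url.hostname = canonicalHost;   // www and non-www collapse onto one host
  for (const name of [...TRACKING_PARAMS, ...PRESENTATION_PARAMS]) {
    url.searchParams.delete(name);
  }
  url.hash = '';
  // Drop a trailing slash everywhere except the root path.
  if (url.pathname.length > 1 && url.pathname.endsWith('/')) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}

console.log(canonicalUrl('http://www.yoursite.com/products/?sort=price&utm_source=newsletter'));
console.log(canonicalUrl('https://yoursite.com/blog?page=2&utm_source=newsletter'));
```

Parameters that do change content, such as `page`, are left in place so each paginated page keeps its own canonical.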
// Automatic in metadata
export const metadata: Metadata = {
alternates: {
canonical: '/current-page',
},
};
// next.config.js
const securityHeaders = [
{
key: 'X-DNS-Prefetch-Control',
value: 'on',
},
{
key: 'Strict-Transport-Security',
value: 'max-age=63072000; includeSubDomains; preload',
},
{
key: 'X-Frame-Options',
value: 'SAMEORIGIN',
},
{
key: 'X-Content-Type-Options',
value: 'nosniff',
},
{
key: 'Referrer-Policy',
value: 'strict-origin-when-cross-origin',
},
{
key: 'Permissions-Policy',
value: 'camera=(), microphone=(), geolocation=()',
},
];
module.exports = {
async headers() {
return [
{
source: '/:path*',
headers: securityHeaders,
},
];
},
};
| Metric | Good | Needs Improvement | Poor |
|---|---|---|---|
| LCP (Largest Contentful Paint) | ≤2.5s | ≤4.0s | >4.0s |
| INP (Interaction to Next Paint) | ≤200ms | ≤500ms | >500ms |
| CLS (Cumulative Layout Shift) | ≤0.1 | ≤0.25 | >0.25 |
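The thresholds in the table can be encoded directly, which is handy when rating field data in a dashboard or a CI check. A minimal sketch follows; `rateMetric` and the type names are hypothetical, and the values are copied from the table (LCP and INP in milliseconds, CLS unitless).

```typescript
type Metric = 'LCP' | 'INP' | 'CLS';
type Rating = 'good' | 'needs-improvement' | 'poor';

// [upper bound for "good", upper bound for "needs improvement"]
const THRESHOLDS: Record<Metric, [number, number]> = {
  LCP: [2500, 4000], // milliseconds
  INP: [200, 500],   // milliseconds
  CLS: [0.1, 0.25],  // unitless layout-shift score
};

function rateMetric(metric: Metric, value: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[metric];
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs-improvement';
  return 'poor';
}

console.log(rateMetric('LCP', 2300));
console.log(rateMetric('INP', 350));
console.log(rateMetric('CLS', 0.3));
```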
## LCP (Loading)
- [ ] Optimize largest image (WebP, proper sizing)
- [ ] Preload critical assets
- [ ] Use CDN for static assets
- [ ] Enable compression (gzip/brotli)
- [ ] Minimize render-blocking resources
## INP (Interactivity)
- [ ] Minimize JavaScript execution time
- [ ] Break up long tasks
- [ ] Use web workers for heavy computation
- [ ] Optimize event handlers
- [ ] Lazy load non-critical JS
## CLS (Visual Stability)
- [ ] Set dimensions on images/videos
- [ ] Reserve space for dynamic content
- [ ] Avoid inserting content above existing
- [ ] Use transform for animations
- [ ] Preload fonts
// Image optimization
import Image from 'next/image';
<Image
src="/hero.jpg"
alt="Hero image"
width={1200}
height={630}
priority // Preload for LCP
placeholder="blur"
blurDataURL={blurDataUrl}
/>
// Font optimization
import { Inter } from 'next/font/google';
const inter = Inter({
subsets: ['latin'],
display: 'swap', // Prevent FOIT
});
// Dynamic imports
import dynamic from 'next/dynamic';
const HeavyComponent = dynamic(() => import('./HeavyComponent'), {
loading: () => <Skeleton />,
ssr: false, // Client-only if needed
});
## Link Architecture
Homepage
├── /pricing (1 click)
├── /features (1 click)
├── /blog (1 click)
│ ├── /blog/category-1 (2 clicks)
│ │ └── /blog/category-1/post (3 clicks)
│ └── /blog/category-2 (2 clicks)
└── /about (1 click)
Rule: Every page within 3 clicks of homepage
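The three-click rule can be checked mechanically with a breadth-first search over the internal link graph. The sketch below assumes the graph is already available as a map from each page to the pages it links to (how that map is crawled or built is out of scope). `clickDepths`, `auditDepth`, and the sample graph are hypothetical.

```typescript
// Each key is a page; the value lists the internal pages it links to.
type LinkGraph = Record<string, string[]>;

// Breadth-first search from the homepage gives the minimum clicks to each page.
function clickDepths(graph: LinkGraph, home: string = '/'): Map<string, number> {
  const depths = new Map<string, number>([[home, 0]]);
  const queue: string[] = [home];
  while (queue.length > 0) {
    const page = queue.shift() as string;
    const depth = depths.get(page) as number;
    for (const target of graph[page] ?? []) {
      if (!depths.has(target)) {
        depths.set(target, depth + 1);
        queue.push(target);
      }
    }
  }
  return depths;
}

function auditDepth(graph: LinkGraph, maxClicks: number = 3): { tooDeep: string[]; orphans: string[] } {
  const depths = clickDepths(graph);
  const tooDeep = Array.from(depths)
    .filter(([, depth]) => depth > maxClicks)
    .map(([page]) => page);
  // Pages that exist in the graph but cannot be reached from the homepage.
  const orphans = Object.keys(graph).filter((page) => !depths.has(page));
  return { tooDeep, orphans };
}

const site: LinkGraph = {
  '/': ['/pricing', '/features', '/blog', '/about'],
  '/blog': ['/blog/category-1', '/blog/category-2'],
  '/blog/category-1': ['/blog/category-1/post'],
  '/blog/category-1/post': ['/blog/category-1/post/appendix'],
  '/legacy-landing': ['/pricing'],
};

console.log(auditDepth(site));
```

Here the appendix page sits four clicks deep and `/legacy-landing` has no inbound path from the homepage, so both are reported.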
✅ DO:
- Use descriptive anchor text
- Link contextually within content
- Create hub pages for topics
- Link to related content at end of posts
- Use breadcrumbs for navigation
❌ AVOID:
- "Click here" as anchor text
- Orphan pages (no internal links)
- Too many links per page (>100)
- Broken internal links
- Redirect chains
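Redirect chains from the list above are easy to detect when the redirect rules are kept as data. The following sketch follows each rule to its final destination and rewrites every source to point there directly. It handles exact-match paths only (no wildcard patterns), and `resolveRedirect` and `flattenRedirects` are hypothetical helpers.

```typescript
// Follow source -> destination hops and return every URL visited, in order.
// Throws if the rules loop back on themselves.
function resolveRedirect(redirects: Map<string, string>, start: string): string[] {
  const path = [start];
  let next = redirects.get(start);
  while (next !== undefined) {
    if (path.includes(next)) throw new Error(`Redirect loop at ${next}`);
    path.push(next);
    next = redirects.get(next);
  }
  return path;
}

// Rewrite each rule so it points straight at its final destination.
function flattenRedirects(redirects: Map<string, string>): Map<string, string> {
  const flat = new Map<string, string>();
  redirects.forEach((_destination, source) => {
    const path = resolveRedirect(redirects, source);
    flat.set(source, path[path.length - 1] as string);
  });
  return flat;
}

const redirectRules = new Map<string, string>([
  ['/old-page', '/new-page'],
  ['/new-page', '/newest-page'],
]);

const flatRules = flattenRedirects(redirectRules);
console.log(Array.from(flatRules));
```

After flattening, `/old-page` reaches `/newest-page` in one hop instead of two.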
// components/Breadcrumbs.tsx
import Link from 'next/link';
interface BreadcrumbItem {
name: string;
href: string;
}
export function Breadcrumbs({ items }: { items: BreadcrumbItem[] }) {
const jsonLd = {
'@context': 'https://schema.org',
'@type': 'BreadcrumbList',
itemListElement: items.map((item, index) => ({
'@type': 'ListItem',
position: index + 1,
name: item.name,
item: `https://yoursite.com${item.href}`,
})),
};
return (
<>
<script
type="application/ld+json"
dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd).replace(/</g, '\\u003c') }}
/>
<nav aria-label="Breadcrumb">
<ol className="flex gap-2">
{items.map((item, index) => (
<li key={item.href}>
{index > 0 && <span>/</span>}
<Link href={item.href}>{item.name}</Link>
</li>
))}
</ol>
</nav>
</>
);
}
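| Bot | User Agent | Purpose |
|---|---|---|
| GPTBot | GPTBot | OpenAI crawler (content may be used for model training) |
| ChatGPT-User | ChatGPT-User | Fetches made on behalf of ChatGPT users |
| ClaudeBot | ClaudeBot | Anthropic crawler (content may be used for model training) |
| Claude-Web | Claude-Web | Claude web features |
| PerplexityBot | PerplexityBot | Perplexity search |
| Google-Extended | Google-Extended | Gemini (formerly Bard) training |
| Amazonbot | Amazonbot | Alexa/Amazon AI |
| CCBot | CCBot | Common Crawl (AI training) |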
# robots.txt
# Allow GPTBot (OpenAI's crawler)
User-agent: GPTBot
Allow: /
# Block CCBot (used for training datasets)
User-agent: CCBot
Disallow: /
# Block Google AI training, allow search
User-agent: Google-Extended
Disallow: /
<!-- Allow indexing and large image previews (this tag alone does not opt out of AI training) -->
<meta name="robots" content="index, follow, max-image-preview:large">
<!-- Opt out of AI training (a proposed convention, not a ratified standard; crawlers may ignore it) -->
<meta name="ai-training" content="disallow">
<!-- Option 1: In <head> with JSON-LD (recommended) -->
<head>
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Your Company"
}
</script>
</head>
<!-- Option 2: Before closing </body> -->
<body>
<!-- Page content -->
<script type="application/ld+json">
{ "@context": "https://schema.org", ... }
</script>
</body>
<head>
<!-- Organization (site-wide) -->
<script type="application/ld+json">
{ "@context": "https://schema.org", "@type": "Organization", ... }
</script>
<!-- BreadcrumbList (navigation) -->
<script type="application/ld+json">
{ "@context": "https://schema.org", "@type": "BreadcrumbList", ... }
</script>
<!-- Article (page-specific) -->
<script type="application/ld+json">
{ "@context": "https://schema.org", "@type": "Article", ... }
</script>
<!-- FAQPage (if FAQ section exists) -->
<script type="application/ld+json">
{ "@context": "https://schema.org", "@type": "FAQPage", ... }
</script>
</head>
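When several schema blocks are rendered into one page as above, it helps to serialize them through a single function that also escapes `<`. Without that, a string such as `</script>` inside the data would close the tag early. This is a small sketch; `toJsonLdScript` and the sample objects are hypothetical, and the fields shown are only a subset of what each schema type supports.

```typescript
type JsonLd = {
  '@context': 'https://schema.org';
  '@type': string;
  [key: string]: unknown;
};

function toJsonLdScript(data: JsonLd): string {
  // Escaping "<" keeps text such as "</script>" inside the data from ending the tag.
  const json = JSON.stringify(data).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}

const organization: JsonLd = {
  '@context': 'https://schema.org',
  '@type': 'Organization',
  name: 'Your Company',
  url: 'https://yoursite.com',
};

const article: JsonLd = {
  '@context': 'https://schema.org',
  '@type': 'Article',
  headline: 'Why </script> needs escaping',
};

const headMarkup = [organization, article].map(toJsonLdScript).join('\n');
console.log(headMarkup);
```

JSON parsers read `\u003c` back as `<`, so consumers of the structured data see the original text.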
project/
├── public/
│ ├── robots.txt # Or generate dynamically
│ ├── sitemap.xml # Or generate dynamically
│ ├── favicon.ico
│ ├── icon.svg
│ ├── apple-touch-icon.png
│ ├── og-image.jpg # Default OG image (1200x630)
│ └── manifest.webmanifest
├── app/
│ ├── layout.tsx # Global metadata
│ ├── robots.ts # Dynamic robots.txt
│ ├── sitemap.ts # Dynamic sitemap
│ └── [page]/
│ └── page.tsx # Page-specific metadata
├── components/
│ ├── SchemaMarkup.tsx
│ ├── Breadcrumbs.tsx
│ └── MetaTags.tsx
└── lib/
├── schema.ts # Schema generators
└── seo.ts # SEO utilities
# Verify ownership methods
1. HTML file upload (google*.html to public/)
2. Meta tag (add to <head>)
3. DNS TXT record
4. Google Analytics (if already installed)
1. Google Search Console
- Sitemaps → Add new sitemap → yoursite.com/sitemap.xml
2. Bing Webmaster Tools
- Sitemaps → Submit sitemap
3. Yandex Webmaster (if relevant)
- Indexing → Sitemap files
## Technical SEO Checklist
### robots.txt
- [ ] Allow search engines
- [ ] Allow AI bots (GPTBot, ClaudeBot, PerplexityBot)
- [ ] Block admin/private areas
- [ ] Include sitemap reference
- [ ] Check with the robots.txt report in Google Search Console
### Sitemap
- [ ] Include all indexable pages
- [ ] Exclude noindex pages
- [ ] Include lastmod dates
- [ ] Submit to Search Console
- [ ] Auto-update on content changes
### Meta Tags
- [ ] Unique title per page (50-60 chars)
- [ ] Unique description per page (150-160 chars)
- [ ] Canonical URL on every page
- [ ] Open Graph tags
- [ ] Twitter Card tags
### URL Structure
- [ ] Lowercase, hyphenated
- [ ] Descriptive slugs
- [ ] No query params for content
- [ ] 301 redirects for moved content
- [ ] No broken links
### Performance
- [ ] LCP ≤ 2.5s
- [ ] INP ≤ 200ms
- [ ] CLS ≤ 0.1
- [ ] HTTPS enabled
- [ ] Security headers configured
### Structured Data
- [ ] Organization schema (homepage)
- [ ] BreadcrumbList (all pages)
- [ ] Article schema (blog posts)
- [ ] FAQ schema (FAQ sections)
- [ ] Validate with Rich Results Test
public/
├── robots.txt ✓ Required
├── sitemap.xml ✓ Required
├── favicon.ico ✓ Required
├── og-image.jpg ✓ Required (1200x630)
└── manifest.webmanifest ○ Recommended
| Tag | Length |
|---|---|
| Title | 50-60 characters |
| Description | 150-160 characters |
| OG Title | 60-90 characters |
| OG Description | 200 characters |
| Twitter Description | 200 characters |
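The title and description ranges in the table lend themselves to an automated check at build time. Below is a minimal sketch that covers only those two tags; `checkLength` and `LIMITS` are hypothetical names, and the limits are the ones listed in the table.

```typescript
interface LengthRange {
  min: number;
  max: number;
}

const LIMITS: Record<'title' | 'description', LengthRange> = {
  title: { min: 50, max: 60 },
  description: { min: 150, max: 160 },
};

// Returns a warning message, or null when the text is within range.
function checkLength(tag: 'title' | 'description', text: string): string | null {
  const { min, max } = LIMITS[tag];
  const length = Array.from(text).length; // count code points, not UTF-16 units
  if (length < min) return `${tag} is ${length} chars; aim for at least ${min}`;
  if (length > max) return `${tag} is ${length} chars; aim for at most ${max}`;
  return null;
}

console.log(checkLength('title', 'Pricing'));
console.log(checkLength('description', 'x'.repeat(200)));
console.log(checkLength('title', 'x'.repeat(55)));
```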
| Image | Dimensions |
|---|---|
| OG Image | 1200 x 630 |
| Twitter Image | 1200 x 628 |
| Favicon | 32 x 32 |
| Apple Touch Icon | 180 x 180 |