npx skills add https://github.com/chroma-core/agent-skills --skill chroma
Ask first:
* Deployment target: Local Chroma or Chroma Cloud?
* Search type (Cloud only): Dense only, or hybrid search?
* Embedding model: Which provider/model?
  * @chroma-core/default-embed (TypeScript) or built-in (Python)
  * OpenAI: text-embedding-3-large is most popular, requires @chroma-core/openai
  * Ask the user whether they have a preference or an existing provider
* Data structure: What are they indexing?
Proceed with sensible defaults:
* getOrCreateCollection() / get_or_create_collection()
* CloudClient vs Client
* The Schema and Search APIs are used only with Cloud
get_or_create_collection() accepts either an embedding_function or a schema, but not both. Use Schema when you need multiple indexes (hybrid search) or sparse embeddings; use embedding_function for simple dense-only search.
To get started with Chroma Cloud, use the CLI to log in, create a database, and write your credentials to a .env file:
chroma login
chroma db create <my_database_name>
chroma db connect <my_database_name> --env-file
This writes a .env file with CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE to the current directory. The code examples below read from these environment variables.
TypeScript (Chroma Cloud):
import { CloudClient } from 'chromadb';
import { DefaultEmbeddingFunction } from '@chroma-core/default-embed';

const client = new CloudClient({
  apiKey: process.env.CHROMA_API_KEY,
  tenant: process.env.CHROMA_TENANT,
  database: process.env.CHROMA_DATABASE,
});

const embeddingFunction = new DefaultEmbeddingFunction();

const collection = await client.getOrCreateCollection({
  name: 'my_collection',
  embeddingFunction,
});

// Add documents
await collection.add({
  ids: ['doc1', 'doc2'],
  documents: ['First document text', 'Second document text'],
});

// Query
const results = await collection.query({
  queryTexts: ['search query'],
  nResults: 5,
});
Python (Chroma Cloud):
import os
import chromadb

client = chromadb.CloudClient(
    api_key=os.environ["CHROMA_API_KEY"],
    tenant=os.environ["CHROMA_TENANT"],
    database=os.environ["CHROMA_DATABASE"],
)

collection = client.get_or_create_collection(name="my_collection")

# Add documents
collection.add(
    ids=["doc1", "doc2"],
    documents=["First document text", "Second document text"],
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5,
)
Chroma is a database. A Chroma database contains collections, and a collection contains documents.
Unlike tables in a relational database, collections are created and destroyed at the application level. Each Chroma database can hold millions of collections, so there may be one collection per user, team, or organization. Where a relational database would partition a table by some key, Chroma uses the collection itself as the partition.
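A minimal sketch of the one-collection-per-user idea, assuming the application derives each collection name from a scope and an identifier. The helper `collection_name_for` is hypothetical and not part of Chroma, and the allowed character set is a conservative assumption; check Chroma's own naming rules before relying on it.

```python
import re


def collection_name_for(scope: str, identifier: str) -> str:
    """Build a collection name such as 'user-42' from a scope and an ID.

    Hypothetical helper: lowercases the input and replaces every run of
    characters outside [a-z0-9._-] with a single '-'.
    """
    raw = f"{scope}-{identifier}".lower()
    cleaned = re.sub(r"[^a-z0-9._-]+", "-", raw).strip("-._")
    if len(cleaned) < 3:
        raise ValueError(f"collection name too short: {cleaned!r}")
    return cleaned


# Assumed usage with a client like the ones in the examples above:
# collection = client.get_or_create_collection(
#     name=collection_name_for("user", "42")
# )
```

Deriving the name in one place keeps partitioning decisions out of the rest of the application: callers pass a user, team, or organization ID and always get the same collection back.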
Collections don't have rows; they have documents. A document is the text data to be searched. When data is created or updated, the client creates an embedding of it. This happens on the client side, using the embedding function(s) provided to the client: the client uses its configuration to call the configured embedding model provider through the embedding function. That call can happen in process, but it overwhelmingly goes to a third-party service over HTTP.
Data can be further partitioned or filtered with document metadata. Each document has a key/value metadata object. Keys are strings, and values can be strings, ints, or booleans. A variety of operators can be applied to metadata.
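As an illustration of metadata filtering, the sketch below builds a `where` filter from simple equality conditions. The helper `where_all` is hypothetical; it assumes Chroma's `$eq` and `$and` filter operators and that `$and` needs at least two clauses, which is why a single condition is returned on its own.

```python
from typing import Union

MetadataValue = Union[str, int, bool]


def where_all(**equals: MetadataValue) -> dict:
    """Combine key=value conditions into one `where` filter dict."""
    clauses = [{key: {"$eq": value}} for key, value in equals.items()]
    if not clauses:
        raise ValueError("at least one condition is required")
    if len(clauses) == 1:
        # A single condition does not need an $and wrapper.
        return clauses[0]
    return {"$and": clauses}


where = where_all(team="search", archived=False)

# Assumed usage against a collection from the examples above:
# results = collection.query(
#     query_texts=["search query"],
#     n_results=5,
#     where=where,
# )
```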
At query time, the query text is embedded using the collection's embedding function and sent to Chroma with the rest of the query parameters. Chroma first applies query parameters such as metadata filters to narrow the candidate set, then searches for the nearest neighbors by computing a distance between the query vector and the vectors indexed in the queried collection.
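The filter-then-search order can be pictured with a small brute-force sketch. This is a conceptual illustration only, not how Chroma is implemented: the record layout, the exact-match filter, and the choice of cosine distance are all assumptions made for the example, and a real vector index avoids scanning every record.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def query(records, query_vector, n_results, where=None):
    """Filter records by exact metadata match, then return the nearest ids."""
    where = where or {}
    # Step 1: narrow the candidate set with the metadata filter.
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    # Step 2: rank the remaining candidates by distance to the query vector.
    candidates.sort(key=lambda r: cosine_distance(query_vector, r["embedding"]))
    return [r["id"] for r in candidates[:n_results]]


records = [
    {"id": "doc1", "embedding": [1.0, 0.0], "metadata": {"team": "search"}},
    {"id": "doc2", "embedding": [0.0, 1.0], "metadata": {"team": "search"}},
    {"id": "doc3", "embedding": [0.9, 0.1], "metadata": {"team": "billing"}},
]

filtered = query(records, [1.0, 0.1], n_results=2, where={"team": "search"})
unfiltered = query(records, [1.0, 0.1], n_results=2)
```

With the filter, `doc3` is excluded before any distance is computed, even though it is the closest vector overall.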
Calling get_or_create_collection() (getOrCreateCollection() in TypeScript) on the Chroma client makes collections easy to work with and avoids boilerplate for checking whether a collection already exists.
Chroma can run locally as a process or be used in the cloud through Chroma Cloud.
Everything that can be done locally can be done in the cloud, but not everything that can be done in the cloud can be done locally.
The biggest difference in developer experience is the Schema() and Search() APIs, which are available only on Chroma Cloud.
Otherwise, the only thing that changes is which client is imported from the Chroma package; the interface is the same.
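One way to keep the rest of the code identical across deployments is to isolate the client choice in a single function. The sketch below is an assumption-laden example: `client_kind` and `make_client` are hypothetical helpers, the decision rule (all three Chroma Cloud environment variables present) is a convention chosen here, and `chromadb.PersistentClient` with a local path is used as the local client.

```python
import os


def client_kind(env) -> str:
    """Return 'cloud' when all Chroma Cloud variables are set, else 'local'."""
    required = ("CHROMA_API_KEY", "CHROMA_TENANT", "CHROMA_DATABASE")
    return "cloud" if all(env.get(name) for name in required) else "local"


def make_client(env=os.environ, local_path="./chroma_data"):
    """Create a Chroma client for the detected deployment target."""
    # Imported lazily so client_kind() can be used without chromadb installed.
    import chromadb

    if client_kind(env) == "cloud":
        return chromadb.CloudClient(
            api_key=env["CHROMA_API_KEY"],
            tenant=env["CHROMA_TENANT"],
            database=env["CHROMA_DATABASE"],
        )
    # Assumed local setup: on-disk storage under local_path.
    return chromadb.PersistentClient(path=local_path)


# Assumed usage; collection calls are the same for both targets:
# client = make_client()
# collection = client.get_or_create_collection(name="my_collection")
```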
If you're using Cloud, you probably want the Schema() and Search() APIs.
If the user wants to use Cloud, also ask which type of search they want: dense embeddings only, or hybrid. For hybrid search, you probably want SPLADE as the sparse embedding strategy.
The default embedding function is available, but it is often not the best option. The recommended option is Chroma Cloud Qwen. In TypeScript, run npm install @chroma-core/chroma-cloud-qwen; in Python it is included but requires pip install httpx.
In TypeScript, each embedding function lives in its own package, so install the one that matches what the user asks for.
Note that Chroma has server-side embedding support for SPLADE and Qwen (via @chroma-core/chroma-cloud-qwen in TypeScript); all other embedding functions run externally.
If you need more detail about Chroma than this skill covers, fetch Chroma's llms.txt for comprehensive documentation: https://docs.trychroma.com/llms.txt