Self-Documenting Code Guide: Best Practices for Semantic Functions, Pragmatic Functions, and Model Design

self-documenting-code by theswerd/aicode

108 weekly installs

6 GitHub Stars


Install command

npx skills add https://github.com/theswerd/aicode --skill self-documenting-code

Software Engineering · Coding Standards · Programming Fundamentals



Code should be self documenting

How you split logic into functions and shape the data they pass around determines how well a codebase holds up over time.

Semantic Functions

Semantic functions are the building blocks of any codebase. A good semantic function should be as minimal as possible, so that its correctness is easy to establish. It should take in all the inputs required to complete its goal and return all necessary outputs directly. Semantic functions can wrap other semantic functions to describe desired flows and usage; if a complex, well-defined flow is used throughout the codebase, codify it in a semantic function.

Side effects are generally undesirable in semantic functions unless they are the explicit goal, because a semantic function should be safe to reuse on the strength of what it says it does, without anyone having to understand its internals. If logic is complicated and its role in a larger flow is unclear, a good pattern is to break that flow into a series of self-describing semantic functions that take in what they need, return the data necessary for the next step, and do nothing else. Examples of good semantic functions range from quadratic_formula() to retry_with_exponential_backoff_and_run_y_in_between<Y: func, X: Func>(x: X, y: Y). Even if these functions are never used again, future humans and agents going over the code will appreciate the indexing of information.
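As a rough illustration of the two examples named above, here is a minimal Python sketch. The parameter names (`max_attempts`, `base_delay_seconds`, `sleep`) and the error handling are assumptions made for this sketch and do not come from the original text; the comments are there only to flag those assumptions.

```python
import math
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def quadratic_formula(a: float, b: float, c: float) -> tuple[float, float]:
    # Assumption: only real roots are supported, returned smaller root first.
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        raise ValueError("no real roots")
    root = math.sqrt(discriminant)
    low, high = sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))
    return low, high


def retry_with_exponential_backoff_and_run_y_in_between(
    x: Callable[[], T],
    y: Callable[[], None],
    max_attempts: int = 3,             # hypothetical parameter
    base_delay_seconds: float = 0.1,   # hypothetical parameter
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return x()
        except Exception:
            # The last failure is re-raised instead of being retried.
            if attempt == max_attempts - 1:
                raise
            y()
            sleep(base_delay_seconds * 2 ** attempt)
    raise AssertionError("unreachable")
```

Both functions receive everything they need as arguments and return their result directly. The only side effects of the retry helper are the ones its name announces: calling `y` and waiting between attempts.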

Semantic functions should not need any comments around them; the code itself should be a self-describing definition of what it does. Ideally they are also extremely easy to unit test, because a good semantic function is a well-defined one.

Pragmatic Functions

Pragmatic functions should be used as wrappers around a series of semantic functions and unique logic. They are the complex processes of your codebase. When building production systems it is natural for logic to get messy, and pragmatic functions are where that mess gets organized. They should generally not be used in more than a few places; if one is, consider breaking out its explicit logic and moving it into semantic functions. Examples include provision_new_workspace_for_github_repo(repo, user) and handle_user_signup_webhook(). Testing pragmatic functions falls into the realm of integration testing, and is often done within the context of testing whole-app functionality.

Pragmatic functions are expected to change completely over time, both in their internals and in what they do. To help with that, it's good to have doc comments above them. Avoid restating the function name or obvious traits; instead note unexpected behavior, such as "fails early on balance less than 10", or correct misconceptions the function name might invite. As a reader of doc comments, take them with a grain of salt: coders working inside the function may have forgotten to update them, so fact-check them when you think they might be incorrect.
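A minimal Python sketch of that shape follows: a pragmatic function that strings together small semantic functions and carries a doc comment for the one surprising behavior. The `SignupEvent` model, the helper functions, and the welcome-message logic are all hypothetical; the original text names only `handle_user_signup_webhook()` and the "fails early on balance less than 10" note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupEvent:  # hypothetical webhook payload
    email: str
    balance: int


# Semantic functions: small, named for exactly what they do.
def normalize_email(raw_email: str) -> str:
    return raw_email.strip().lower()


def is_valid_email(email: str) -> bool:
    # Deliberately simplistic check, for illustration only.
    local, _, domain = email.partition("@")
    return bool(local) and "." in domain


def build_welcome_message(email: str) -> str:
    return f"Welcome, {email}!"


def handle_user_signup_webhook(event: SignupEvent) -> str:
    """Turn a signup webhook payload into the welcome message to send.

    Unexpected: fails early on balance less than 10, before the email
    is validated.
    """
    if event.balance < 10:
        raise ValueError("balance less than 10")
    email = normalize_email(event.email)
    if not is_valid_email(email):
        raise ValueError(f"invalid email: {email!r}")
    return build_welcome_message(email)
```

The doc comment does not restate the function name. It records only the early-failure rule a caller would not guess from the signature.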

Models

The shape of your data should make wrong states impossible. If a model allows a combination of fields that should never exist together in practice, the model isn't doing its job. Every optional field is a question the rest of the codebase has to answer every time it touches that data, and every loosely typed field is an invitation for callers to pass something that looks right but isn't. When models enforce correctness, bugs surface at the point of construction rather than deep inside some unrelated flow where the assumptions finally collapse. A model's name should be precise enough that you can look at any field and know whether it belongs — if the name doesn't tell you, the model is trying to be too many things. When two concepts are often needed together but are independent, compose them rather than merging them — e.g. UserAndWorkspace { user: User, workspace: Workspace } keeps both models intact instead of flattening workspace fields into the user. Good names like UnverifiedEmail, PendingInvite, and BillingAddress tell you exactly what fields belong. If you see a phone_number field on BillingAddress, you know something went wrong.
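One way to picture precisely named models and composition is the Python sketch below. `UnverifiedEmail` and `UserAndWorkspace` come from the text; `VerifiedEmail`, `verify_email`, and every field name are assumptions added for illustration. Python checks annotations only through an external type checker, so the sketch validates in `__post_init__` to make bad values fail at construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnverifiedEmail:
    address: str

    def __post_init__(self) -> None:
        # Reject malformed input at the point of construction.
        if "@" not in self.address:
            raise ValueError(f"not an email address: {self.address!r}")


@dataclass(frozen=True)
class VerifiedEmail:  # hypothetical counterpart to UnverifiedEmail
    address: str
    verified_at_unix_seconds: int


def verify_email(email: UnverifiedEmail, now_unix_seconds: int) -> VerifiedEmail:
    return VerifiedEmail(email.address, now_unix_seconds)


@dataclass(frozen=True)
class User:
    name: str
    email: VerifiedEmail  # no "is_verified" flag to forget to check


@dataclass(frozen=True)
class Workspace:
    slug: str


@dataclass(frozen=True)
class UserAndWorkspace:
    # Composition: both models stay intact instead of being flattened together.
    user: User
    workspace: Workspace
```

Because `User` holds a `VerifiedEmail` rather than a string plus an optional flag, code that receives a `User` has no verification question left to answer.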

Values with identical shapes can represent completely different domain concepts: { id: "123" } might be a DocumentReference in one place and a MessagePointer in another, and if your functions just accept { id: String }, the code will accept either one without complaint. Brand types solve this by wrapping a primitive in a distinct type so the compiler treats them as separate: DocumentId(UUID) instead of a bare UUID. With branding in place, accidentally swapping two IDs becomes a compile-time type error instead of a silent bug that surfaces three layers deep.
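A hedged Python sketch of branding follows. `DocumentId` wrapping a UUID is taken from the text; `MessageId` and `document_url` are hypothetical. In Python the separation is enforced by a static type checker rather than a compiler, so this sketch also adds a runtime `isinstance` guard, which is an assumption of the example and not something the original prescribes.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class DocumentId:
    value: UUID


@dataclass(frozen=True)
class MessageId:  # hypothetical second brand with an identical shape
    value: UUID


def document_url(document_id: DocumentId) -> str:
    # A type checker would reject a MessageId here; the guard covers
    # unchecked callers at runtime.
    if not isinstance(document_id, DocumentId):
        raise TypeError(f"expected DocumentId, got {type(document_id).__name__}")
    return f"/documents/{document_id.value}"
```

Two IDs built from the same UUID still compare unequal when their brands differ, so a swapped argument cannot slip through on shape alone.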

Where Things Break

Breaks commonly happen when a semantic function morphs into a pragmatic function for convenience, and other places in the codebase that rely on it end up doing things they didn't intend. To avoid this, be explicit when creating such a function: name it not by what it does but by where it's used. The name should make it clear to other programmers that the function's behavior is not tightly defined and that its internals should not be relied on to perform an exact task. This also makes regressions that originate in it easier to debug.

Models break the same way but slower. They start focused, then someone adds "just one more" optional field because it's easier than creating a new model, and then someone else does the same, and eventually the model is a loose bag of half-related data where every consumer has to guess which fields are actually set and why. The name stops describing what the data is, the fields stop cohering around a single concept, and every new feature that touches the model has to navigate states it was never designed to represent. When a model's fields no longer cohere around its name, that's the signal to split it into the distinct things it's been coupling together.
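The split described above might look like the following Python sketch. `PendingInvite` is a name from the text; the bloated `Invite` model, `AcceptedInvite`, the field names, and both helper functions are hypothetical and exist only to show the before and after.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Before (hypothetical): a model that grew "just one more" optional field at a time.
@dataclass
class Invite:
    email: str
    sent_at: Optional[int] = None
    accepted_at: Optional[int] = None
    accepted_by_user_id: Optional[str] = None


# After: one model per state, with every field required.
@dataclass(frozen=True)
class PendingInvite:
    email: str
    sent_at: int


@dataclass(frozen=True)
class AcceptedInvite:
    email: str
    sent_at: int
    accepted_at: int
    accepted_by_user_id: str


def accept_invite(invite: PendingInvite, user_id: str, now: int) -> AcceptedInvite:
    return AcceptedInvite(invite.email, invite.sent_at, now, user_id)


def describe_invite(invite: Union[PendingInvite, AcceptedInvite]) -> str:
    # Consumers branch on the type instead of guessing which fields are set.
    if isinstance(invite, AcceptedInvite):
        return f"{invite.email} accepted by {invite.accepted_by_user_id}"
    return f"{invite.email} is still pending"
```

The loose `Invite` happily represents an invite that was accepted by a user but has no acceptance time. The split models leave no way to construct that state.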

Repository

theswerd/aicode

First Seen

8 days ago

Security Audits

Gen Agent Trust Hub: Pass · Socket: Pass · Snyk: Pass

Installed on

kimi-cli: 94, gemini-cli: 94, amp: 94, cline: 94, github-copilot: 94, codex: 94
