feat: add Crowdin bootstrap workflow to support initial upload of target translations

春河晴
2026-03-13 04:54:47 +09:00
parent 67b559200d
commit 89bd4ba13a
4 changed files with 268 additions and 98 deletions

@@ -2,42 +2,87 @@
## Scope
This document summarizes the current GitHub-side localization workflow in this repository.
It focuses on how GitHub Actions, branch conventions, pull requests, and validation gates work together around Crowdin.
This document defines the repository-side localization workflow for Option B.
It focuses on GitHub Actions, branch conventions, pull requests, and validation gates around Crowdin.
This is intentionally a repository-specific operational report, not a generic Crowdin or GitHub Actions tutorial.
This is intentionally repository-specific operational guidance, not a generic Crowdin or GitHub Actions tutorial.
## Current Branch Model
External prerequisite:
- Source branches covered by the localization workflow:
- the Crowdin project and GitHub Actions secrets must already be configured
- if Crowdin's native GitHub integration exists on the Crowdin side, it must not be used as a second write-back path for this repository
## Repository Policy
- `zh-CN` is the only source language of truth in the repository
- target-locale files committed in the repository are synchronization artifacts and reviewable outputs, not the normal long-term editing surface
- existing committed target translations must be bootstrapped into Crowdin once before steady-state sync is trusted
- after bootstrap, Crowdin is the normal editing surface for target translations
- GitHub Actions is the only allowed GitHub-side synchronization mechanism between this repository and Crowdin
- translations return through `l10n_*` pull requests, not direct pushes into `main` or `r-dev`
## Branch Model
- source branches covered by the localization workflow:
- `main`
- `r-dev`
- Crowdin return branches:
- `l10n_main`
- `l10n_r-dev`
- Merge strategy:
- merge strategy:
- translations do not go directly into `main` or `r-dev`
- Crowdin updates return through pull requests and can be reviewed before merge
- translations are reviewed through pull requests before merge
## Source of Truth
## Source of Truth and Trigger Surface
- Source locale for JSON translations: `locales/zh-CN/*.json`
- Source locale for prompt templates: `prompts/zh-CN/**/*.prompt`
- Current prompt template extension in the repository: `.prompt` only
- source locale for JSON translations: `locales/zh-CN/*.json`
- source locale for prompt templates: `prompts/zh-CN/**/*.prompt`
- current prompt template extension in the repository: `.prompt`
GitHub Actions currently treats `zh-CN` assets as the source-language side of the workflow.
Translated target assets are not used to trigger source uploads back to Crowdin.
Normal push-triggered source uploads remain strictly source-driven.
Translated target assets are not part of the steady-state upload trigger set.
## Workflows Involved
### 1. `crowdin-sync.yml`
### 1. `crowdin-bootstrap.yml`
Role:
- provides a manual bootstrap path for existing committed target translations
- seeds Crowdin from the repository's current target-locale state
- keeps this exceptional upload path separate from normal source-driven sync
Triggers:
- manual dispatch only
Inputs:
- `base_branch`: `main` or `r-dev`
- `confirm_bootstrap`: explicit confirmation string
Behavior:
- checks out the selected repository branch
- uploads sources and committed target translations to Crowdin
- does not download translations
- does not create or update `l10n_*` pull requests
Guardrail:
- this workflow is intentionally one-time or exceptional
- maintainers must not treat it as a continuous GitHub-to-Crowdin target-translation sync path
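The confirmation gate described above can be sketched in Python. This is a minimal illustration, not the workflow's actual implementation: the function name `validate_bootstrap_inputs` and the constant names are hypothetical, while the allowed branches and the confirmation string are the ones given in this document.

```python
# Hypothetical sketch of the bootstrap input gate; the real workflow may
# implement this check differently (for example as a shell step).
ALLOWED_BASE_BRANCHES = {"main", "r-dev"}
# Confirmation string as listed in the maintainer operations of this document.
CONFIRM_STRING = "yes-bootstrap-current-target-translations"


def validate_bootstrap_inputs(base_branch: str, confirm_bootstrap: str) -> None:
    """Raise ValueError unless both inputs permit a bootstrap upload."""
    if base_branch not in ALLOWED_BASE_BRANCHES:
        raise ValueError(f"unsupported base_branch: {base_branch!r}")
    if confirm_bootstrap != CONFIRM_STRING:
        raise ValueError("confirm_bootstrap does not match the required string")
```

Requiring an exact string rather than a boolean makes an accidental run less likely, which fits the one-time or exceptional nature of this path.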
### 2. `crowdin-sync.yml`
Role:
- uploads source-language assets to Crowdin
- downloads currently available translations from Crowdin when the workflow runs
- creates or updates localization pull requests back to the matching base branch
Current triggers:
Triggers:
- manual dispatch
- scheduled sync every 6 hours: `17 */6 * * *` UTC
- push to `main` or `r-dev` when one of these paths changes:
@@ -45,13 +90,15 @@ Current triggers:
- `locales/zh-CN/*.json`
- `prompts/zh-CN/**/*.prompt`
Current branch behavior:
Branch behavior:
- push-triggered runs sync the current Git branch and use a matching localization branch name:
- `main -> l10n_main -> PR into main`
- `r-dev -> l10n_r-dev -> PR into r-dev`
- scheduled runs explicitly cover both `main` and `r-dev`
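The branch mapping above can be expressed as a small Python sketch. The helper names `localization_branch` and `branches_for_event` are hypothetical and only illustrate the convention; treating a manual dispatch like a push (sync the current branch only) is an assumption, not something this document states.

```python
# Hypothetical helpers illustrating the branch convention described above.
COVERED_BRANCHES = ("main", "r-dev")


def localization_branch(base_branch: str) -> str:
    """Return the Crowdin return branch for a covered base branch."""
    if base_branch not in COVERED_BRANCHES:
        raise ValueError(f"branch not covered by localization sync: {base_branch!r}")
    return f"l10n_{base_branch}"


def branches_for_event(event_name: str, current_branch: str) -> list[str]:
    """Scheduled runs cover every branch; other runs sync the current branch.

    Assumption: manual dispatch behaves like a push and uses the selected ref.
    """
    if event_name == "schedule":
        return list(COVERED_BRANCHES)
    return [current_branch]
```

For example, `localization_branch("r-dev")` yields `l10n_r-dev`, whose pull request targets `r-dev`.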
Current permissions and credentials:
Permissions and credentials:
- `contents: write`
- `pull-requests: write`
- `GITHUB_TOKEN`
@@ -59,16 +106,19 @@ Current permissions and credentials:
- `CROWDIN_PERSONAL_TOKEN`
Important boundary:
- the workflow keeps the PR-based return flow intact
- it does not directly push translated content into `main` or `r-dev`
### 2. `i18n-validate.yml`
- the steady-state workflow keeps the PR-based return flow intact
- normal runs do not upload direct GitHub edits to target-locale files back into Crowdin
### 3. `i18n-validate.yml`
Role:
- runs repository-side localization validation
- blocks structurally invalid or policy-breaking localization changes
Current triggers:
Triggers:
- pull requests that touch:
- `locales/**/*.json`
- `prompts/**/*.prompt`
@@ -78,7 +128,8 @@ Current triggers:
- `src/prompt/prompt_manager.py`
- pushes to `main` or `r-dev` for the same path set
Current validation scope:
Validation scope:
- JSON locale key alignment against `zh-CN`
- placeholder consistency
- plural structure consistency
@@ -87,133 +138,154 @@ Current validation scope:
- rejection of non-`zh-CN` entries that directly preserve Chinese source text
Prompt behavior note:
- missing target prompt files currently produce warnings, not hard failures
- runtime still falls back to `zh-CN` prompt templates when localized prompt files are absent
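Two of the checks in the validation scope, key alignment and source-text leakage, can be illustrated with a short Python sketch. It is not the code of `scripts/i18n_validate.py`: the function names are hypothetical, locale files are assumed to be flat key/value JSON objects, and "Chinese characters" is approximated by the basic CJK Unified Ideographs block.

```python
import re

# Assumption: the basic CJK Unified Ideographs block is a good enough
# approximation of "Chinese characters" for illustration purposes.
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def missing_and_extra_keys(source: dict, target: dict) -> tuple[set, set]:
    """Return (keys missing from target, keys present only in target)."""
    return set(source) - set(target), set(target) - set(source)


def leaked_source_entries(source: dict, target: dict) -> list[str]:
    """Return target keys whose value is the unchanged zh-CN source text."""
    return sorted(
        key
        for key, value in target.items()
        if isinstance(value, str)
        and value == source.get(key)
        and CJK_PATTERN.search(value)
    )
```

A validator built along these lines would report a non-empty result from either function as a failure for a non-`zh-CN` locale file.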
### 3. `precheck.yml`
### 4. `precheck.yml`
Role:
- checks whether a pull request conflicts with its real target branch
- preserves the existing conflict-label behavior
Current triggers:
- all pull requests
Behavior:
Current behavior:
- checks out the PR head commit
- fetches the actual PR base branch from `github.event.pull_request.base.ref`
- performs a merge simulation against that real base branch
- marks the PR as conflicted only if the merge simulation produces unmerged files
This means:
- feature branches into `main` are checked against `main`
- feature branches into `r-dev` are checked against `r-dev`
- `l10n_main` PRs are checked against `main`
- `l10n_r-dev` PRs are checked against `r-dev`
### 4. `ruff-pr.yml`
### 5. `ruff-pr.yml`
Role:
- runs Ruff lint and format checks for pull requests that are relevant to Python code quality
Current triggers:
- pull requests that touch:
- `*.py`
- `**/*.py`
- `pyproject.toml`
- `ruff.toml`
- `.ruff.toml`
- `setup.cfg`
- `tox.ini`
- `.pre-commit-config.yaml`
Effect:
- translation-only localization pull requests do not run Ruff by default
- Python or Ruff-related pull requests still run the existing Ruff checks
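The effect of the path filter can be approximated in Python. This sketch does not reproduce GitHub's glob matching exactly; the names `is_ruff_relevant` and `pr_runs_ruff` are hypothetical, and the config file names are the ones listed above, assumed to live at the repository root.

```python
from pathlib import PurePosixPath

# Root-level configuration files taken from the trigger list above.
RUFF_CONFIG_FILES = {
    "pyproject.toml",
    "ruff.toml",
    ".ruff.toml",
    "setup.cfg",
    "tox.ini",
    ".pre-commit-config.yaml",
}


def is_ruff_relevant(path: str) -> bool:
    """Approximate the `*.py`, `**/*.py`, and config-file path filters."""
    if PurePosixPath(path).suffix == ".py":
        return True
    return path in RUFF_CONFIG_FILES


def pr_runs_ruff(changed_paths: list[str]) -> bool:
    """A pull request runs Ruff if any changed path is relevant."""
    return any(is_ruff_relevant(path) for path in changed_paths)
```

Under this approximation a pull request that only touches `locales/` or `prompts/` assets returns `False`, matching the translation-only behavior described above.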
## End-to-End GitHub Flow
### A. Source-language update on `r-dev` or `main`
### A. One-time bootstrap of existing target translations
1. A maintainer chooses `main` or `r-dev` as the branch whose committed target translations should seed Crowdin.
2. The maintainer manually runs `crowdin-bootstrap.yml` with explicit confirmation.
3. The workflow uploads the selected branch's current sources and committed target translations to Crowdin.
4. No `l10n_*` pull request is created by this bootstrap workflow.
5. After bootstrap, target-language maintenance should move to Crowdin as the normal editing surface.
### B. Normal source-language update on `main` or `r-dev`
1. A source-language change is pushed to `main` or `r-dev`.
2. `crowdin-sync.yml` uploads source assets to Crowdin.
2. `crowdin-sync.yml` uploads `zh-CN` source assets to Crowdin.
3. The same workflow may also download any translations currently available in Crowdin when that workflow run executes.
4. A localization PR is opened or updated:
4. A localization pull request is opened or updated:
- `l10n_main -> main`
- `l10n_r-dev -> r-dev`
### B. Localization PR created by Crowdin branch
### C. Translation return flow
1. A localization PR is opened from `l10n_${branch}` into its matching base branch.
2. `precheck.yml` validates conflicts against the real PR base branch.
3. `i18n-validate.yml` validates localization structure and repository-specific locale-content policy.
4. `ruff-pr.yml` does not run if the PR only changes translation assets.
5. Maintainers review and merge the localization PR in the normal PR flow.
1. Translators work in Crowdin.
2. Approving a translation in Crowdin does not immediately change `main` or `r-dev`.
3. Repository write-back happens when `crowdin-sync.yml` runs.
4. GitHub updates or creates `l10n_${branch}` pull requests.
5. Maintainers review and merge the localization pull request in the normal PR flow.
### C. Scheduled sync
### D. Scheduled sync
1. Every 6 hours, GitHub Actions runs a scheduled localization sync.
2. The workflow explicitly processes both `main` and `r-dev`.
3. If Crowdin currently has downloadable translation updates, GitHub updates or creates the corresponding `l10n_` PRs.
3. If Crowdin currently has downloadable translation updates, GitHub updates or creates the corresponding `l10n_` pull requests.
## How the Current Setup Avoids Sync Loops
## How the Setup Avoids Sync Loops
- Crowdin source uploads are triggered only from source-language assets:
- source uploads are triggered only from:
- `crowdin.yml`
- `locales/zh-CN/*.json`
- `prompts/zh-CN/**/*.prompt`
- translated target files do not trigger another source upload cycle
- translated target files do not trigger another steady-state upload cycle
- the bootstrap path is manual and confirmation-gated
- translations return through `l10n_` branches and PRs instead of direct pushes to base branches
- translation-only PRs do not trigger Ruff, which reduces unnecessary CI noise without weakening Python quality gates
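The loop-avoidance rule rests on the push trigger matching only source-side paths. A minimal Python sketch of that decision follows; the function name `triggers_source_upload` is hypothetical, and the matching is a hand-written approximation of the three path patterns listed above rather than GitHub's own filter engine.

```python
from pathlib import PurePosixPath


def triggers_source_upload(path: str) -> bool:
    """Return True if a changed path belongs to the steady-state trigger set."""
    if path == "crowdin.yml":
        return True
    p = PurePosixPath(path)
    parts = p.parts
    # locales/zh-CN/*.json: files directly inside the zh-CN locale directory.
    if len(parts) == 3 and parts[:2] == ("locales", "zh-CN") and p.suffix == ".json":
        return True
    # prompts/zh-CN/**/*.prompt: prompt templates at any depth under zh-CN.
    if len(parts) >= 3 and parts[:2] == ("prompts", "zh-CN") and p.suffix == ".prompt":
        return True
    return False
```

Because a merged `l10n_*` pull request only changes target-locale paths such as `locales/en-US/startup.json`, this predicate is false for every file it touches and no new upload cycle starts.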
## Current Locale Content Policy on GitHub
## GitHub-Usable Maintainer Operations
This section describes repository-specific validation policy layered on top of the baseline Crowdin PR workflow.
It is not a default Crowdin rule.
### Trigger the bootstrap path
The repository now enforces a stricter GitHub-side policy for committed target locale files:
GitHub UI:
- `zh-CN` remains the source language
- non-`zh-CN` locale files must not carry over Chinese source text unchanged
- English locale files must not retain Chinese characters
- Actions -> `Crowdin Bootstrap Target Translations`
- choose `main` or `r-dev`
- set `confirm_bootstrap` to `yes-bootstrap-current-target-translations`
This policy is enforced by `scripts/i18n_validate.py` and therefore applies to localization pull requests on GitHub before merge.
GitHub CLI:
## What Was Verified Against Repository Reality
```bash
gh workflow run crowdin-bootstrap.yml \
--ref r-dev \
-f base_branch=r-dev \
-f confirm_bootstrap=yes-bootstrap-current-target-translations
```
- The repository currently uses `.prompt` as the only i18n-relevant prompt template extension.
- `i18n-validate.yml` already watches `prompts/**/*.prompt`, so no broader prompt-file trigger was needed.
- The only committed non-`zh-CN` locale directory currently present in the repository is `locales/en-US`.
- Chinese text previously found in `locales/en-US/startup.json` has been removed.
Use this only when seeding Crowdin from already-committed target translations, or in another exceptional recovery scenario.
## Practical Maintainer Expectations
### Trigger a normal manual sync
- A translation-only PR should normally trigger:
- precheck
- i18n validation
- A translation-only PR should normally not trigger:
- Ruff PR checks
- A Python code PR should normally trigger:
- precheck
- Ruff PR checks
- i18n validation if it touches i18n-related code or locale/prompt assets
GitHub UI:
## Open Discussion Topics
- Actions -> `Crowdin Sync`
- run the workflow on `main` or `r-dev`
These are not current defects, but they may still be useful discussion topics for the team:
GitHub CLI:
- whether scheduled source uploads should remain enabled together with scheduled translation downloads
- whether GitHub-side workflow linting should be added explicitly in the future
- whether additional locale-specific content rules should be introduced once more target locales are committed in the repository
```bash
gh workflow run crowdin-sync.yml --ref r-dev
```
### Inspect workflow runs
```bash
gh run list --workflow crowdin-sync.yml --limit 5
gh run list --workflow crowdin-bootstrap.yml --limit 5
```
### Inspect resulting localization pull requests
```bash
gh pr list --head l10n_main
gh pr list --head l10n_r-dev
```
### Verify that GitHub Actions is the repository write-back path
- confirm there is a successful `crowdin-sync.yml` run corresponding to the latest `l10n_*` PR update
- confirm translated content returned through `l10n_main` or `l10n_r-dev`, not a direct push into `main` or `r-dev`
- do not rely on a separate Crowdin native GitHub integration PR or branch flow for this repository
## Guardrails That Remain Intact
- localization PRs are still checked against their real base branch
- repository-side localization validation still runs where expected
- translation-only PRs still avoid unnecessary Ruff noise by default
- Python-impacting PRs still run Python quality gates where appropriate
- the steady-state `zh-CN` source-trigger model remains unchanged
## Bottom Line
The current GitHub-side localization workflow is now centered on a stable model:
The GitHub-side localization workflow now supports the intended Option B model:
- `zh-CN` is the source language
- `main` and `r-dev` are the source branches covered by Crowdin sync
- Crowdin returns through `l10n_${branch}` pull requests
- PR conflict checks now use the real base branch
- translation-only PRs no longer run Ruff by default
- GitHub-side i18n validation now also protects against source-language leakage in committed target locales
This keeps the existing Crowdin branch strategy intact while making pull request validation more accurate and less noisy.
- `zh-CN` remains the only repository source language
- existing committed target translations can be bootstrapped into Crowdin once through a manual workflow
- steady-state sync remains source-driven and GitHub Actions-only
- translated content still returns through `l10n_${branch}` pull requests
- existing PR validation and reduced-noise translation PR behavior remain intact

@@ -2,6 +2,15 @@
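MaiBot now uses a `JSON + Crowdin + Babel` internationalization scheme and does not depend on gettext's `.po/.mo` runtime.
## Repository Translation Policy
- `zh-CN` is the only source language in the repository and the only GitHub-side source of truth.
- Non-`zh-CN` target-language files are synchronization artifacts and reviewable outputs, not the normal long-term editing surface.
- Target-language files already committed to the repository must first be synced into Crowdin through a one-time bootstrap, so that Crowdin does not treat these historical translations as "untranslated" and overwrite them with source text on export.
- After bootstrap, routine target-language maintenance should be done in Crowdin rather than through continued direct edits on GitHub.
- GitHub Actions is the only allowed synchronization mechanism between the repository and Crowdin; do not use Crowdin's native GitHub integration as a second write-back path.
- Translations still return through `l10n_*` pull requests and are not written directly back to `main` or `r-dev`.
## Directory Structure
Translation files live in `locales/<locale>/*.json`; the current default language is `zh-CN`.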
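@@ -77,12 +86,14 @@ Prompt loading rules:
- all lowercase
- do not use the Chinese source text directly as a key
## Steps for Adding Translations
## Day-to-Day Translation Workflow
1. First add the source text in `locales/zh-CN/*.json`
2. Add the same key in `locales/en-US/*.json`
3. Replace hard-coded strings in code with `t()` or `tn()`
4. Run `python scripts/i18n_validate.py` to validate the structure
1. First add or modify source content in `locales/zh-CN/*.json` or `prompts/zh-CN/**/*.prompt`.
2. Replace hard-coded strings in code with `t()` / `tn()` / `load_prompt()`.
3. Run `python scripts/i18n_validate.py` to validate the structure.
4. Push the source changes to `main` or `r-dev`, or manually trigger [`crowdin-sync.yml`](../.github/workflows/crowdin-sync.yml).
5. Target-language translation is done in Crowdin.
6. GitHub Actions downloads the translations available in Crowdin at that time and returns them to the repository through `l10n_main` / `l10n_r-dev` pull requests.
For non-`zh-CN` target locales: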
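@@ -90,6 +101,15 @@ Prompt loading rules:
- Do not manually copy Chinese source text into target-language files and commit it.
- English locale files should not retain Chinese characters; such leftovers are blocked by the validation script.
### When Target-Language Files May Be Edited Directly
Only the following scenarios justify editing non-`zh-CN` files directly on GitHub:
- Historical target-language files that already exist in the repository need a one-time bootstrap into Crowdin.
- An urgent fix is needed, and you have confirmed that the same change will subsequently be applied in Crowdin so that the next sync does not overwrite it.
Apart from these exceptions, do not treat target-language files as a routine editing entry point.
## Validation Script
Run: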
@@ -129,17 +149,30 @@ python scripts/i18n_extract_candidates.py
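The [`crowdin.yml`](../crowdin.yml) in the project root uses `locales/zh-CN/*.json` as the source.
It now also uploads `prompts/zh-CN/**/*.prompt` to Crowdin as single-file prompt template sources.
[`crowdin-sync.yml`](../.github/workflows/crowdin-sync.yml) in GitHub Actions uploads sources when the workflow runs and downloads the translations available in Crowdin at that time.
[`crowdin-sync.yml`](../.github/workflows/crowdin-sync.yml) in GitHub Actions is the day-to-day steady-state sync entry point:
- On pushes to `main` / `r-dev`, only `zh-CN` source assets and `crowdin.yml` trigger a normal upload.
- When the workflow runs, it uploads sources and downloads the translations available in Crowdin at that time.
- Downloaded results return through `l10n_*` pull requests rather than being written directly to `main` / `r-dev`.
[`crowdin-bootstrap.yml`](../.github/workflows/crowdin-bootstrap.yml) is the bootstrap entry point for one-time or exceptional use:
- It can only be triggered manually.
- It uploads the target-language files currently committed in the repository to Crowdin in order to preserve historical translations.
- It does not act as a day-to-day workflow that continuously uploads target-language changes made on GitHub.
Common commands: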
```bash
crowdin upload sources
crowdin upload translations
crowdin download translations
python scripts/i18n_validate.py
gh workflow run crowdin-sync.yml --ref r-dev
gh workflow run crowdin-bootstrap.yml --ref r-dev -f base_branch=r-dev -f confirm_bootstrap=yes-bootstrap-current-target-translations
gh run list --workflow crowdin-sync.yml --limit 5
gh pr list --head l10n_r-dev
```
For more complete GitHub-side operating instructions, see [`docs/github-actions-crowdin-workflow-report.md`](./github-actions-crowdin-workflow-report.md).
## Current Migration Scope
This batch already covers: