feat: update the Crowdin workflows to support the dashboard WebUI locale files and add the related validation logic

春河晴
2026-03-13 20:09:21 +09:00
parent 5f5ff4ce8e
commit 5da82c4e24
8 changed files with 283 additions and 15 deletions

View File

@@ -12,6 +12,7 @@ on:
      - "crowdin.yml"
      - "locales/zh-CN/*.json"
      - "prompts/zh-CN/**/*.prompt"
      - "dashboard/src/i18n/locales/zh.json"

permissions:
  contents: write

View File

@@ -5,6 +5,8 @@ on:
    paths:
      - "locales/**/*.json"
      - "prompts/**/*.prompt"
      - "dashboard/src/i18n/index.ts"
      - "dashboard/src/i18n/locales/*.json"
      - "scripts/i18n_validate.py"
      - "src/common/i18n/**/*.py"
      - "src/common/prompt_i18n.py"
@@ -16,6 +18,8 @@ on:
    paths:
      - "locales/**/*.json"
      - "prompts/**/*.prompt"
      - "dashboard/src/i18n/index.ts"
      - "dashboard/src/i18n/locales/*.json"
      - "scripts/i18n_validate.py"
      - "src/common/i18n/**/*.py"
      - "src/common/prompt_i18n.py"

View File

@@ -7,9 +7,20 @@ preserve_hierarchy: true
export_languages:
  - en-US
  - ja
  - ko

files:
  - source: /locales/zh-CN/*.json
    translation: /locales/%locale%/%original_file_name%
    excluded_target_languages:
      - ko
  - source: /prompts/zh-CN/**/*.prompt
    translation: /prompts/%locale%/**/%original_file_name%
    excluded_target_languages:
      - ko
  - source: /dashboard/src/i18n/locales/zh.json
    translation: /dashboard/src/i18n/locales/%two_letters_code%.json
    languages_mapping:
      two_letters_code:
        en-US: en
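As a rough illustration of the dashboard file group above, the sketch below resolves the `%two_letters_code%` pattern into concrete target paths. The lookup table is an assumption: only the `en-US: en` entry is spelled out in `crowdin.yml`, and `ja` / `ko` are assumed to resolve to themselves. `dashboard_target_path` is a hypothetical helper for illustration, not part of the repository or of the Crowdin CLI.

```python
from pathlib import PurePosixPath

# Assumption: Crowdin language id -> short code used in the dashboard filenames.
# Only "en-US" -> "en" is explicit in crowdin.yml; "ja" and "ko" are assumed
# to map to themselves.
TWO_LETTERS_CODE = {"en-US": "en", "ja": "ja", "ko": "ko"}

# Translation pattern copied from the dashboard file group in crowdin.yml.
TRANSLATION_PATTERN = "/dashboard/src/i18n/locales/%two_letters_code%.json"


def dashboard_target_path(language: str) -> PurePosixPath:
    """Hypothetical helper: expand the pattern for one export language."""
    code = TWO_LETTERS_CODE[language]
    return PurePosixPath(TRANSLATION_PATTERN.replace("%two_letters_code%", code))


for language in ("en-US", "ja", "ko"):
    print(language, "->", dashboard_target_path(language))
```

Under these assumptions the three export languages land in `en.json`, `ja.json`, and `ko.json` next to the `zh.json` source file.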

View File

@@ -41,6 +41,7 @@ Only source-language assets should trigger source uploads to Crowdin.
Current source-language assets:
- `locales/zh-CN/*.json`
- `prompts/zh-CN/**/*.prompt`
- `dashboard/src/i18n/locales/zh.json`
- `crowdin.yml`
### Prompt template extension
@@ -162,6 +163,9 @@ This workflow must remain the repository-side structural validation layer for lo
#### Required validation role
It should validate localization assets and i18n-relevant code changes.
That includes both:
- backend locale JSON under `locales/`
- dashboard locale JSON under `dashboard/src/i18n/locales/`
#### Prompt trigger scope
It must cover the actual prompt template extension used in the repository.

View File

@@ -37,7 +37,9 @@ External prerequisite:
- source locale for JSON translations: `locales/zh-CN/*.json`
- source locale for prompt templates: `prompts/zh-CN/**/*.prompt`
- source locale for dashboard WebUI translations: `dashboard/src/i18n/locales/zh.json`
- current prompt template extension in the repository: `.prompt`
- dashboard WebUI keeps short runtime locale filenames (`zh`, `en`, `ja`, `ko`) in Git, but `dashboard/src/i18n/locales/zh.json` is still the repository-side `zh-CN` source asset for that file group
Normal push-triggered source uploads remain strictly source-driven.
Translated target assets are not part of the steady-state upload trigger set.
@@ -94,6 +96,7 @@ Triggers:
- `crowdin.yml`
- `locales/zh-CN/*.json`
- `prompts/zh-CN/**/*.prompt`
- `dashboard/src/i18n/locales/zh.json`
Branch behavior:
@@ -127,6 +130,8 @@ Triggers:
- pull requests that touch:
- `locales/**/*.json`
- `prompts/**/*.prompt`
- `dashboard/src/i18n/index.ts`
- `dashboard/src/i18n/locales/*.json`
- `scripts/i18n_validate.py`
- `src/common/i18n/**/*.py`
- `src/common/prompt_i18n.py`
@@ -136,7 +141,9 @@ Triggers:
Validation scope:
- JSON locale key alignment against `zh-CN`
- dashboard nested JSON locale key alignment against `dashboard/src/i18n/locales/zh.json`
- placeholder consistency
- dashboard i18next interpolation placeholder consistency
- plural structure consistency
- prompt placeholder consistency
- English locale protection against Chinese source-language leakage
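The last item in the validation scope can be pictured with a minimal sketch. It assumes flat `key -> string` catalogs and reuses the Han character ranges from `scripts/i18n_validate.py`; `find_han_leaks` and its `latin_only` flag are hypothetical names for illustration and do not mirror the script's real function signatures.

```python
import re

# Han character ranges as used by the validation script.
HAN = re.compile(r"[\u3400-\u4DBF\u4E00-\u9FFF\uF900-\uFAFF]")


def find_han_leaks(source: dict[str, str], target: dict[str, str], latin_only: bool) -> list[str]:
    """Hypothetical helper: report keys whose target text leaks the Chinese source."""
    problems: list[str] = []
    for key in sorted(source.keys() & target.keys()):
        source_text, target_text = source[key], target[key]
        # Any target locale: an unchanged copy of a Chinese source string.
        if target_text == source_text and HAN.search(source_text):
            problems.append(f"{key}: untranslated source text")
        # Latin-only locales (for example English): no Han characters at all.
        if latin_only and HAN.search(target_text):
            problems.append(f"{key}: Han characters remain")
    return problems


source = {"greeting": "你好,世界"}
print(find_han_leaks(source, {"greeting": "Hello 同意"}, latin_only=True))
print(find_han_leaks(source, {"greeting": "你好,世界"}, latin_only=False))
```

The two checks are kept independent because a locale such as Japanese legitimately contains Han characters, so only the "identical to source" rule can apply there.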
@@ -192,7 +199,7 @@ Effect:
### B. Normal source-language update on `main` or `r-dev`
1. A source-language change is pushed to `main` or `r-dev`.
-2. `crowdin-sync.yml` uploads `zh-CN` source assets to Crowdin.
+2. `crowdin-sync.yml` uploads source-language assets to Crowdin, including the dashboard WebUI source file `dashboard/src/i18n/locales/zh.json`.
3. The same workflow may also download any translations currently available in Crowdin when that workflow run executes.
4. A localization pull request is opened or updated:
- `l10n_main -> main`
@@ -218,6 +225,7 @@ Effect:
- `crowdin.yml`
- `locales/zh-CN/*.json`
- `prompts/zh-CN/**/*.prompt`
- `dashboard/src/i18n/locales/zh.json`
- translated target files do not trigger another steady-state upload cycle
- the bootstrap path is manual and confirmation-gated
- translations return through `l10n_` branches and PRs instead of direct pushes to base branches

View File

@@ -37,6 +37,22 @@ prompts/
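- Do not bulk-copy the `zh-CN` source text into `prompts/en-US/` and commit it as-is.
- Target-language prompt files should be generated by Crowdin downloads; while a target file does not yet exist locally, the runtime automatically falls back to `zh-CN`.
Dashboard WebUI locale files live in `dashboard/src/i18n/locales/*.json`:
```text
dashboard/src/i18n/locales/
  zh.json
  en.json
  ja.json
  ko.json
```
Notes:
- `dashboard/src/i18n/locales/zh.json` is the dashboard's source of truth in Git.
- The dashboard runtime keeps using the short locale codes `zh` / `en` / `ja` / `ko`, but on the Crowdin side `zh.json` is still synced as the repository's `zh-CN` source asset.
- `en.json` / `ja.json` / `ko.json` are sync outputs and reviewable artifacts, not a surface for routine long-term manual editing.
## Using it in code
Import everything from [`src/common/i18n/__init__.py`](../src/common/i18n/__init__.py):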
@@ -88,7 +104,7 @@ Prompt loading rules:
## Day-to-day translation workflow
-1. First add or modify source content in `locales/zh-CN/*.json` or `prompts/zh-CN/**/*.prompt`.
+1. First add or modify source content in `locales/zh-CN/*.json`, `prompts/zh-CN/**/*.prompt`, or `dashboard/src/i18n/locales/zh.json`.
2. Replace hard-coded strings in code with `t()` / `tn()` / `load_prompt()`.
3. Run `python scripts/i18n_validate.py` to validate the structure.
4. Push the source changes to `main` or `r-dev`, or manually trigger [`crowdin-sync.yml`](../.github/workflows/crowdin-sync.yml).
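@@ -100,6 +116,7 @@ Prompt loading rules:
- The following two rules are repository-specific validation policies, not default Crowdin behavior.
- Do not manually copy Chinese source text straight into target-language files and commit it.
- English locale files must not retain Chinese characters; such leftovers are blocked by the validation script.
- `dashboard/src/i18n/locales/en.json` / `ja.json` / `ko.json` follow the same rules.
### When target-language files may be edited directly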
@@ -124,7 +141,9 @@ python scripts/i18n_validate.py
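- whether duplicate keys exist
- whether empty-string keys exist
- whether each language's key set is aligned with `zh-CN`
- whether the key set of the dashboard nested JSON is aligned with `dashboard/src/i18n/locales/zh.json`
- whether placeholder sets are consistent
- whether dashboard i18next `{{placeholder}}` placeholder sets are consistent
- whether plural structures are consistent
- whether non-`zh-CN` locales directly retain source text that contains Chinese characters
- when a prompt template already exists, its placeholder set must be aligned with `prompts/zh-CN/`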
@@ -146,12 +165,20 @@ python scripts/i18n_extract_candidates.py
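## Crowdin
-The [`crowdin.yml`](../crowdin.yml) in the project root uses `locales/zh-CN/*.json` as the source.
-It now also uploads `prompts/zh-CN/**/*.prompt` to Crowdin as single-file prompt template sources.
+The [`crowdin.yml`](../crowdin.yml) in the project root now uploads three kinds of source:
- `locales/zh-CN/*.json`
- `prompts/zh-CN/**/*.prompt`
- `dashboard/src/i18n/locales/zh.json`
For the dashboard:
- Crowdin download results are written back to `dashboard/src/i18n/locales/en.json`, `ja.json`, and `ko.json`.
- The English file name stays `en.json`, but it is still mapped to `en-US` in the Crowdin configuration.
The [`crowdin-sync.yml`](../.github/workflows/crowdin-sync.yml) workflow in GitHub Actions is the day-to-day steady-state sync entry point:
-- On pushes to `main` / `r-dev`, only `zh-CN` source assets and `crowdin.yml` trigger a normal upload.
+- On pushes to `main` / `r-dev`, only source assets (including the dashboard's `zh.json` source file) and `crowdin.yml` trigger a normal upload.
- When the workflow runs, it uploads sources and downloads whatever translations are available in Crowdin at that time.
- Download results flow back through `l10n_*` pull requests rather than being written directly to `main` / `r-dev`.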

View File

@@ -19,6 +19,11 @@ def write_locale_file(locales_root: Path, locale: str, file_name: str, payload:
    (locale_dir / file_name).write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")


def write_dashboard_locale_file(locales_root: Path, locale: str, payload: dict[str, object]) -> None:
    locales_root.mkdir(parents=True, exist_ok=True)
    (locales_root / f"{locale}.json").write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")


def test_validate_json_locales_rejects_han_characters_in_english_locale(tmp_path: Path) -> None:
    locales_root = tmp_path / "locales"
    write_locale_file(locales_root, "zh-CN", "core.json", {"consent.prompt": '输入"同意"继续'})
@@ -68,3 +73,38 @@ def test_validate_json_locales_avoids_false_positive_when_plural_categories_do_n
    assert any("tasks.cancelled" in error and "plural category 不一致" in error for error in errors)
    assert not any("tasks.cancelled" in error and "直接保留了包含中文字符的 source 文案" in error for error in errors)


def test_validate_dashboard_json_locales_rejects_han_characters_in_english_locale(tmp_path: Path) -> None:
    locales_root = tmp_path / "dashboard-locales"
    write_dashboard_locale_file(locales_root, "zh", {"common": {"greeting": "你好,世界"}})
    write_dashboard_locale_file(locales_root, "en", {"common": {"greeting": "Hello 同意"}})

    errors = I18N_VALIDATE.validate_dashboard_json_locales(locales_root)

    assert any("dashboard:en" in error and "common.greeting" in error and "仍包含中文字符" in error for error in errors)


def test_validate_dashboard_json_locales_rejects_untranslated_han_source_in_other_target_locales(
    tmp_path: Path,
) -> None:
    locales_root = tmp_path / "dashboard-locales"
    write_dashboard_locale_file(locales_root, "zh", {"common": {"greeting": "你好,世界"}})
    write_dashboard_locale_file(locales_root, "ja", {"common": {"greeting": "你好,世界"}})

    errors = I18N_VALIDATE.validate_dashboard_json_locales(locales_root)

    assert any(
        "dashboard:ja" in error and "common.greeting" in error and "直接保留了包含中文字符的 source 文案" in error
        for error in errors
    )


def test_validate_dashboard_json_locales_rejects_i18next_placeholder_drift(tmp_path: Path) -> None:
    locales_root = tmp_path / "dashboard-locales"
    write_dashboard_locale_file(locales_root, "zh", {"status": {"checkingDesc": "等待服务恢复... ({{current}}/{{max}})"}})
    write_dashboard_locale_file(locales_root, "ko", {"status": {"checkingDesc": "서비스 복구 대기 중... ({{current}}/{{limit}})"}})

    errors = I18N_VALIDATE.validate_dashboard_json_locales(locales_root)

    assert any("dashboard:ko" in error and "status.checkingDesc" in error and "占位符集合与 source 不一致" in error for error in errors)

View File

@@ -1,7 +1,9 @@
from __future__ import annotations
from pathlib import Path
from typing import Callable
import json
import re
import sys
@@ -9,12 +11,19 @@ PROJECT_ROOT = Path(__file__).resolve().parents[1]
if str(PROJECT_ROOT) not in sys.path:
    sys.path.insert(0, str(PROJECT_ROOT))

from src.common.i18n.exceptions import (  # noqa: E402
    DuplicateTranslationKeyError,
    InvalidTranslationFileError,
    LocaleNotFoundError,
)
from src.common.i18n.loaders import (  # noqa: E402
    DEFAULT_LOCALE,
    PLURAL_CATEGORIES,
    TranslationValue,
    discover_locales,
    get_locales_root,
    load_locale_catalog,
    validate_translation_value,
)
from src.common.i18n.loaders import extract_placeholders  # noqa: E402
from src.common.prompt_i18n import ( # noqa: E402
@@ -25,12 +34,22 @@ from src.common.prompt_i18n import ( # noqa: E402
)

HAN_CHARACTER_PATTERN = re.compile(r"[\u3400-\u4DBF\u4E00-\u9FFF\uF900-\uFAFF]")
I18NEXT_PLACEHOLDER_PATTERN = re.compile(r"\{\{\s*([^\s,}]+)(?:\s*,[^}]*)?\s*\}\}")
DASHBOARD_DEFAULT_LOCALE = "zh"


def contains_han_characters(text: str) -> bool:
    return HAN_CHARACTER_PATTERN.search(text) is not None


def extract_i18next_placeholders(template: str) -> set[str]:
    placeholders: set[str] = set()
    for match in I18NEXT_PLACEHOLDER_PATTERN.finditer(template):
        placeholder_name = match.group(1)
        placeholders.add(placeholder_name.split(".", maxsplit=1)[0].split("[", maxsplit=1)[0])
    return placeholders


def iter_translation_strings(value: TranslationValue) -> list[str]:
    if isinstance(value, str):
        return [value]
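To make the behavior of the new placeholder extractor easier to follow, here is a standalone sketch that copies the pattern and function from this hunk and runs them on a few sample strings. The sample inputs (nested access, a format suffix, and an indexed name) are illustrative assumptions about what dashboard strings might contain; only the first one appears in the commit's tests.

```python
import re

# Pattern and function copied from scripts/i18n_validate.py in this commit.
I18NEXT_PLACEHOLDER_PATTERN = re.compile(r"\{\{\s*([^\s,}]+)(?:\s*,[^}]*)?\s*\}\}")


def extract_i18next_placeholders(template: str) -> set[str]:
    placeholders: set[str] = set()
    for match in I18NEXT_PLACEHOLDER_PATTERN.finditer(template):
        placeholder_name = match.group(1)
        # Keep only the root variable: "user.name" -> "user", "items[0]" -> "items".
        placeholders.add(placeholder_name.split(".", maxsplit=1)[0].split("[", maxsplit=1)[0])
    return placeholders


# Sample string taken from the tests in this commit.
print(sorted(extract_i18next_placeholders("等待服务恢复... ({{current}}/{{max}})")))
# ['current', 'max']

# Illustrative inputs (assumed, not from the repository).
print(sorted(extract_i18next_placeholders("{{ user.name }} {{count, number}} {{items[0]}}")))
# ['count', 'items', 'user']
```

Because the comparison works on sets of root names, a translation may reorder placeholders or change a format suffix without being flagged, while renaming `{{max}}` to `{{limit}}` is reported as drift.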
@@ -60,7 +79,9 @@ def validate_locale_content(
    target_value: TranslationValue,
    locale: str,
    errors: list[str],
    locale_label: str | None = None,
) -> None:
    resolved_locale_label = locale_label or locale
    target_texts = iter_translation_strings(target_value)

    if any(
@@ -68,11 +89,11 @@ def validate_locale_content(
         for source_text, target_text in iter_shared_translation_strings(source_value, target_value)
     ):
         errors.append(
-            f"[{locale}] key '{key}' 直接保留了包含中文字符的 source 文案(仓库级校验策略),请提供目标语言翻译"
+            f"[{resolved_locale_label}] key '{key}' 直接保留了包含中文字符的 source 文案(仓库级校验策略),请提供目标语言翻译"
         )

     if locale_requires_latin_only_validation(locale) and any(contains_han_characters(text) for text in target_texts):
-        errors.append(f"[{locale}] key '{key}' 仍包含中文字符,请移除源语言残留后再提交")
+        errors.append(f"[{resolved_locale_label}] key '{key}' 仍包含中文字符,请移除源语言残留后再提交")


 def validate_translation_pair(
@@ -81,32 +102,183 @@ def validate_translation_pair(
     target_value: TranslationValue,
     locale: str,
     errors: list[str],
+    placeholder_extractor: Callable[[str], set[str]] = extract_placeholders,
+    locale_label: str | None = None,
 ) -> None:
+    resolved_locale_label = locale_label or locale
     if isinstance(source_value, str):
         if not isinstance(target_value, str):
-            errors.append(f"[{locale}] key '{key}' 与 source 的类型不一致source=string, target=plural")
+            errors.append(
+                f"[{resolved_locale_label}] key '{key}' 与 source 的类型不一致source=string, target=plural"
+            )
             return
-        if extract_placeholders(source_value) != extract_placeholders(target_value):
-            errors.append(f"[{locale}] key '{key}' 的占位符集合与 source 不一致")
+        if placeholder_extractor(source_value) != placeholder_extractor(target_value):
+            errors.append(f"[{resolved_locale_label}] key '{key}' 的占位符集合与 source 不一致")
         return

     if not isinstance(target_value, dict):
-        errors.append(f"[{locale}] key '{key}' 与 source 的类型不一致source=plural, target=string")
+        errors.append(f"[{resolved_locale_label}] key '{key}' 与 source 的类型不一致source=plural, target=string")
         return

     source_categories = set(source_value.keys())
     target_categories = set(target_value.keys())
     if source_categories != target_categories:
         errors.append(
-            f"[{locale}] key '{key}' 的 plural category 不一致:"
+            f"[{resolved_locale_label}] key '{key}' 的 plural category 不一致:"
             f"source={sorted(source_categories)}, target={sorted(target_categories)}"
         )

     for category in sorted(source_categories & target_categories):
-        source_placeholders = extract_placeholders(source_value[category])
-        target_placeholders = extract_placeholders(target_value[category])
+        source_placeholders = placeholder_extractor(source_value[category])
+        target_placeholders = placeholder_extractor(target_value[category])
         if source_placeholders != target_placeholders:
-            errors.append(f"[{locale}] key '{key}' 的 plural category '{category}' 占位符集合与 source 不一致")
+            errors.append(
+                f"[{resolved_locale_label}] key '{key}' 的 plural category '{category}' 占位符集合与 source 不一致"
+            )
+
+
+def get_dashboard_locales_root(locales_root: Path | None = None) -> Path:
+    if locales_root is not None:
+        return locales_root.resolve()
+    return (PROJECT_ROOT / "dashboard" / "src" / "i18n" / "locales").resolve()
+
+
+def discover_dashboard_locales(locales_root: Path | None = None) -> list[str]:
+    root = get_dashboard_locales_root(locales_root)
+    if not root.exists():
+        return []
+
+    locale_names = [path.stem for path in root.glob("*.json") if path.is_file()]
+    return sorted(locale_names)
+
+
+def is_plural_translation_node(value: object) -> bool:
+    if not isinstance(value, dict) or not value:
+        return False
+    return all(
+        isinstance(category, str) and category in PLURAL_CATEGORIES and isinstance(category_value, str)
+        for category, category_value in value.items()
+    )
+
+
+def flatten_dashboard_translation_mapping(
+    value: dict[str, object],
+    file_path: Path,
+    translations: dict[str, TranslationValue],
+    parent_keys: list[str] | None = None,
+) -> None:
+    current_parent_keys = parent_keys or []
+    if not value:
+        if current_parent_keys:
+            raise InvalidTranslationFileError(
+                f"{file_path} 中的 key '{'.'.join(current_parent_keys)}' 不能为空对象"
+            )
+        raise InvalidTranslationFileError(f"{file_path} 顶层不能为空对象")
+
+    for raw_key, raw_value in value.items():
+        if not isinstance(raw_key, str):
+            raise InvalidTranslationFileError(f"{file_path} 中存在非字符串 key")
+
+        normalized_key = raw_key.strip()
+        if not normalized_key:
+            raise InvalidTranslationFileError(f"{file_path} 中存在空字符串 key")
+
+        current_key_parts = [*current_parent_keys, normalized_key]
+        current_key = ".".join(current_key_parts)
+        if isinstance(raw_value, str):
+            if current_key in translations:
+                raise DuplicateTranslationKeyError(f"{file_path} 中存在重复 key: '{current_key}'")
+            translations[current_key] = raw_value
+            continue
+
+        if is_plural_translation_node(raw_value):
+            if current_key in translations:
+                raise DuplicateTranslationKeyError(f"{file_path} 中存在重复 key: '{current_key}'")
+            translations[current_key] = validate_translation_value(current_key, raw_value, file_path)
+            continue
+
+        if isinstance(raw_value, dict):
+            flatten_dashboard_translation_mapping(raw_value, file_path, translations, current_key_parts)
+            continue
+
+        raise InvalidTranslationFileError(f"{file_path} 中的 key '{current_key}' 必须是字符串或对象")
+
+
+def load_dashboard_translation_file(file_path: Path) -> dict[str, TranslationValue]:
+    try:
+        raw_payload = json.loads(file_path.read_text(encoding="utf-8"))
+    except json.JSONDecodeError as exc:
+        raise InvalidTranslationFileError(f"{file_path} 不是合法 JSON: {exc}") from exc
+
+    if not isinstance(raw_payload, dict):
+        raise InvalidTranslationFileError(f"{file_path} 顶层必须是 JSON object")
+
+    translations: dict[str, TranslationValue] = {}
+    flatten_dashboard_translation_mapping(raw_payload, file_path, translations)
+    return translations
+
+
+def load_dashboard_locale_catalog(
+    locale: str,
+    locales_root: Path | None = None,
+) -> dict[str, TranslationValue]:
+    locale_file = get_dashboard_locales_root(locales_root) / f"{locale}.json"
+    if not locale_file.exists():
+        raise LocaleNotFoundError(f"未找到 locale 文件: {locale_file}")
+    return load_dashboard_translation_file(locale_file)
+
+
+def validate_dashboard_json_locales(locales_root: Path | None = None) -> list[str]:
+    resolved_locales_root = get_dashboard_locales_root(locales_root)
+    locales = discover_dashboard_locales(resolved_locales_root)
+    errors: list[str] = []
+
+    if DASHBOARD_DEFAULT_LOCALE not in locales:
+        errors.append(f"[dashboard] 缺少默认 locale 文件: {DASHBOARD_DEFAULT_LOCALE}.json")
+        return errors
+
+    catalogs: dict[str, dict[str, TranslationValue]] = {}
+    for locale in locales:
+        try:
+            catalogs[locale] = load_dashboard_locale_catalog(locale, resolved_locales_root)
+        except Exception as exc:
+            errors.append(f"[dashboard:{locale}] 加载失败: {exc}")
+
+    source_catalog = catalogs.get(DASHBOARD_DEFAULT_LOCALE)
+    if source_catalog is None:
+        return errors
+
+    source_keys = set(source_catalog.keys())
+    for locale, catalog in catalogs.items():
+        if locale == DASHBOARD_DEFAULT_LOCALE:
+            continue
+
+        locale_label = f"dashboard:{locale}"
+        locale_keys = set(catalog.keys())
+
+        for key in sorted(source_keys - locale_keys):
+            errors.append(f"[{locale_label}] 缺少 key: {key}")
+
+        for key in sorted(locale_keys - source_keys):
+            errors.append(f"[{locale_label}] 存在多余 key: {key}")
+
+        for key in sorted(source_keys & locale_keys):
+            source_value = source_catalog[key]
+            target_value = catalog[key]
+            validate_translation_pair(
+                key,
+                source_value,
+                target_value,
+                locale,
+                errors,
+                placeholder_extractor=extract_i18next_placeholders,
+                locale_label=locale_label,
+            )
+            if isinstance(source_value, str) == isinstance(target_value, str):
+                validate_locale_content(key, source_value, target_value, locale, errors, locale_label=locale_label)
+
+    return errors


 def validate_json_locales(locales_root: Path | None = None) -> list[str]:
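The flattening step that the dashboard validator relies on can be sketched in isolation. This is a simplified illustration rather than the script's implementation: the set of plural category names is an assumption (the real `PLURAL_CATEGORIES` constant lives in `src/common/i18n/loaders.py` and is not shown in this diff), `is_plural_node` and `flatten` are hypothetical names, and the empty-object, duplicate-key, and blank-key checks are left out.

```python
# Assumption: CLDR-style plural category names; the repository's actual
# PLURAL_CATEGORIES constant is not visible in this diff.
PLURAL_CATEGORIES = frozenset({"zero", "one", "two", "few", "many", "other"})


def is_plural_node(value: object) -> bool:
    """A non-empty dict whose keys are all plural categories with string values."""
    return (
        isinstance(value, dict)
        and bool(value)
        and all(k in PLURAL_CATEGORIES and isinstance(v, str) for k, v in value.items())
    )


def flatten(node: dict, parents: tuple[str, ...] = ()) -> dict[str, object]:
    """Turn nested namespaces into dotted keys; plural nodes stay as leaf values."""
    flat: dict[str, object] = {}
    for name, value in node.items():
        path = (*parents, name)
        if isinstance(value, str) or is_plural_node(value):
            flat[".".join(path)] = value
        elif isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            raise TypeError(f"{'.'.join(path)} must be a string or an object")
    return flat


catalog = flatten({
    "common": {"greeting": "Hello"},
    "tasks": {"cancelled": {"one": "{{count}} task", "other": "{{count}} tasks"}},
})
print(sorted(catalog))  # ['common.greeting', 'tasks.cancelled']
```

One consequence of this design is that a namespace whose children happen to be named only after plural categories (for example `one` and `other`) is treated as a plural leaf instead of being descended into, so key names inside dashboard locale files need to avoid that shape unless a plural is intended.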
@@ -219,6 +391,7 @@ def _print_warnings(warnings: list[str]) -> None:
def main() -> int:
    errors = validate_json_locales()
    errors.extend(validate_dashboard_json_locales())
    prompt_errors, prompt_warnings = validate_prompt_templates()
    errors.extend(prompt_errors)