ms-swift v4 will refactor the overall framework, which will introduce a number of breaking changes. The refactoring items that will cause breaking changes are:
- Directory structure refactoring and dependency optimization: The swift.llm directory will be split into swift.template, swift.dataset, swift.model, and swift.pipelines.
- Decoupling of model_type from template: model_type will be aligned with the transformers model_type, and extra model_types such as qwen3_thinking, qwen3_no_thinking, and qwen3_moe_thinking will be removed.
- Megatron training loop rewrite with Ray support: ms-swift will depend on megatron-core and drop the megatron-lm dependency.
- Adoption of new features from transformers v5.0.
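For downstream code that needs to run across the directory split, one possible approach is to detect which layout is installed before importing anything. The following is only a sketch under stated assumptions: the module names come from the list above, while the helpers `module_available` and `detect_layout` are hypothetical names and not part of ms-swift. Whether swift.llm remains importable in v4 is not stated here, so the sketch checks for the v4 modules first.

```python
import importlib.util

# Module names taken from the refactoring list above.
V4_MODULES = ("swift.template", "swift.dataset", "swift.model", "swift.pipelines")
V3_MODULE = "swift.llm"


def module_available(name: str) -> bool:
    """Return True if `name` can be located (hypothetical helper).

    Note: find_spec imports the parent package of a dotted name,
    so calling this may execute swift/__init__.py.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # The parent package is missing or the name is malformed.
        return False


def detect_layout(is_available=module_available) -> str:
    """Guess the installed ms-swift layout (hypothetical helper)."""
    if all(is_available(m) for m in V4_MODULES):
        return "v4"
    if is_available(V3_MODULE):
        return "v3"
    return "not installed"


if __name__ == "__main__":
    print(detect_layout())
```

The availability check is passed in as a parameter so the layout logic can be exercised without ms-swift installed.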
If you have further suggestions for refactoring or new features, please leave a comment below. This document will be updated continuously. Thank you!