Unveiling Transformers with LEGO: a synthetic reasoning task
| Jan 23, 2024
link
Publish Date
Number
reflection
abstract
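Uses LEGO, a controllable synthetic dataset, to probe how Transformers work during training. It shows that pre-training matters even when the pre-training task is unrelated, and that chain-of-reasoning learning may pick up certain shortcuts.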
Status
Type
evaluation
Author
  • Valine