When can transformers reason with abstract symbols?
Jan 23, 2024
abstract
In theory, after extensive fine-tuning, a transformer can perform template matching (the template stays the same and only the variables differ), which is what is called first-order generalization. On symbolic tasks, however, it cannot. For (i) regression tasks, we prove that transformers generalize when trained, but require astonishingly large quantities of training data. For (ii) next-token-prediction tasks with symbolic labels, we show an “inverse scaling law”: transformers fail to generalize as their embedding dimension increases.
Status
In progress
Type
evaluation
Author
  • Valine
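
To make the idea of template matching in the abstract concrete, here is a minimal sketch of how such a task could be generated. It is an illustration, not the paper's actual setup: the "same vs. different" templates, the alphabets, and the names `fill_template`, `make_dataset`, `TEMPLATES`, `TRAIN_ALPHABET`, and `TEST_ALPHABET` are all hypothetical. Each example fills a fixed template with concrete symbols, and the test set uses symbols never seen in training, so a model can only succeed by matching the template rather than memorizing symbols.

```python
import random


def fill_template(template, alphabet, rng):
    """Replace each distinct wildcard in `template` with a distinct symbol."""
    wildcards = sorted(set(template))
    symbols = rng.sample(alphabet, len(wildcards))  # distinct symbols, no repeats
    mapping = dict(zip(wildcards, symbols))
    return [mapping[w] for w in template]


def make_dataset(templates, alphabet, n, seed=0):
    """Draw n (sequence, label) pairs; the label identifies the template used."""
    rng = random.Random(seed)
    items = list(templates.items())
    data = []
    for _ in range(n):
        label, template = rng.choice(items)
        data.append((fill_template(template, alphabet, rng), label))
    return data


# Hypothetical task: label 1 if both symbols match ("AA"), 0 otherwise ("AB").
TEMPLATES = {1: "AA", 0: "AB"}

# Assumption: train and test alphabets are disjoint, so every test symbol is unseen.
TRAIN_ALPHABET = list("abcdefgh")
TEST_ALPHABET = list("stuvwxyz")

train = make_dataset(TEMPLATES, TRAIN_ALPHABET, n=5, seed=0)
test = make_dataset(TEMPLATES, TEST_ALPHABET, n=5, seed=1)

for sequence, label in train:
    print("train", "".join(sequence), label)
for sequence, label in test:
    print("test ", "".join(sequence), label)
```

Keeping the alphabets disjoint is the design choice that matters here: accuracy on `test` then measures generalization to new variables under the same template, which is the kind of generalization the abstract refers to.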
Catalog