Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4 | xiaojuan’s blog

Welcome to xiaojuan’s blog, where I share articles about study and life.

Evaluating the Logical Reasoning Ability of ChatGPT and GPT-4

| Jan 23, 2024

Abstract

This report analyses multiple logical reasoning datasets, including popular benchmarks such as LogiQA and ReClor and newly released datasets such as AR-LSAT. GPT-4 and ChatGPT perform reasonably well on traditional benchmarks but poorly on out-of-distribution (OOD) data.

Status

Not started

Type

evaluation

Author

Valine
