Research
Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability
arXiv:2603.10384v1 (announce type: new)
Abstract: Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We introduce TRACED, a framework that assesses reasoning quality through theoretically grounded geometric kinematics.
This content is a summary of the original ArXiv AI article. Please refer to the original site for the full text.