Research
TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?
arXiv:2603.00285v1 (announce type: new). Abstract: Evaluating AI agents in finance faces two key challenges: static benchmarks require costly expert annotation yet miss the dynamic decision-making central to real-world trading, while LLM-based judges introduce uncontrolled variance on domain-specific tasks. We introduce TraderBench, a benchmark that addresses both issues.
This content is a summary of the original ArXiv AI article. Please see the original site for the full text.