Existing Embodied Question Answering (EQA) benchmarks primarily
focus on household environments, often overlooking
safety-critical aspects and reasoning processes pertinent to
industrial settings. This omission limits how well agent
readiness for real-world industrial applications can be evaluated.
To bridge this gap, we introduce IndustryEQA,
the first benchmark dedicated to evaluating embodied
agent capabilities within safety-critical warehouse scenarios.
Built upon the NVIDIA Isaac Sim platform, IndustryEQA provides
high-fidelity episodic memory videos featuring diverse
industrial assets, dynamic human agents, and carefully designed
hazardous situations inspired by real-world safety guidelines.
The benchmark includes rich annotations covering six categories:
equipment safety, human safety, object recognition, attribute
recognition, temporal understanding, and spatial understanding.
It additionally supports evaluation of the reasoning behind
answers in these categories.
We propose a comprehensive evaluation framework and use it to
assess the general perception and reasoning abilities of various
baseline models in industrial environments. IndustryEQA aims
to steer EQA research towards developing more robust,
safety-aware, and practically applicable embodied agents for
complex industrial environments.
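
To make the annotation structure concrete, the following is a minimal Python sketch of how a question-answer record tagged with one of the six categories might be represented and tallied. The class and field names (`QAItem`, `video_id`, `reasoning`) and the helper `count_by_category` are hypothetical and chosen for illustration only; they are not the benchmark's actual schema or API.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional


class Category(Enum):
    """The six annotation categories named in the abstract."""
    EQUIPMENT_SAFETY = "equipment safety"
    HUMAN_SAFETY = "human safety"
    OBJECT_RECOGNITION = "object recognition"
    ATTRIBUTE_RECOGNITION = "attribute recognition"
    TEMPORAL_UNDERSTANDING = "temporal understanding"
    SPATIAL_UNDERSTANDING = "spatial understanding"


@dataclass(frozen=True)
class QAItem:
    """One annotated question (hypothetical layout, not the real schema)."""
    video_id: str                    # assumed identifier of the episodic memory video
    question: str
    answer: str
    category: Category
    reasoning: Optional[str] = None  # assumed optional reference reasoning text


def count_by_category(items: Iterable[QAItem]) -> Dict[Category, int]:
    """Return how many questions fall into each category (zero if none)."""
    counts = Counter(item.category for item in items)
    return {category: counts.get(category, 0) for category in Category}


if __name__ == "__main__":
    # Illustrative items only; these are not taken from the benchmark.
    demo = [
        QAItem("demo_video", "Is the worker wearing a helmet?", "No",
               Category.HUMAN_SAFETY, reasoning="Head protection is absent."),
        QAItem("demo_video", "What color is the forklift?", "Yellow",
               Category.ATTRIBUTE_RECOGNITION),
    ]
    for category, n in count_by_category(demo).items():
        print(f"{category.value}: {n}")
```

Keeping the category as an enumeration rather than a free-form string is one way to let per-category scores be aggregated without spelling mismatches; other representations would work equally well.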