FAWAM: Force-Aware World Action Models for Closed-Loop Contact-Rich Manipulation

Haotian He*,1, Zeyu Yan*,2, Qipeng Liu2, Ning Guo2, Wenzhao Lian2,†

* Equal Contribution; † Corresponding author

1 School of Mathematical Sciences, Peking University
2 School of Artificial Intelligence, Shanghai Jiao Tong University

Abstract

Force signals provide critical interaction cues for contact-rich robotic manipulation. However, existing methods mostly use force as an additional observation modality, without fully exploiting its role in modeling future interaction dynamics or guiding execution-time feedback correction. In this paper, we propose FAWAM, a force-aware world action model that incorporates force information at three levels: perception, prediction, and closed-loop execution. FAWAM first encodes historical 6-axis force/torque signals to modulate action generation, then jointly predicts future actions and end-effector wrenches to explicitly model contact evolution. It further introduces a residual correction module that uses the predicted wrench trajectory as an execution-time reference to refine actions online based on real-time force feedback. Real-world experiments across multiple contact-rich tasks show that FAWAM improves the average success rate by 36.25% over vision-only baselines and 21.25% over existing force-aware baselines, demonstrating the effectiveness of our force-aware framework for robust contact-rich manipulation.

Method Overview

Force-Envisioned Action Model overview

FAWAM consists of a Force-Envisioned Action Model running at 1 Hz and a Force-Guided Residual Corrector running at 10 Hz. The Force-Envisioned Action Model encodes recent force/torque histories into compact contact features and injects them into action generation through AdaLN-Zero-style modulation. It also jointly predicts future action and force trajectories, encouraging the model to learn the coupling between robot motions and their contact consequences. The Force-Guided Residual Corrector learns from human interventions and corrects the base action online with a residual action and an intervention gate. It uses the predicted force trajectory as a reference for adapting to unexpected contact changes within an action chunk: by comparing this reference with real-time force feedback, it provides timely residual corrections and improves execution-time robustness.
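To make the AdaLN-Zero-style injection concrete, the following is a minimal sketch in plain Python, not the FAWAM implementation. It assumes the modulation follows the common AdaLN-Zero pattern, in which a zero-initialized projection of the conditioning vector produces a shift, a scale, and a gate. The class name `AdaLNZeroBlock`, the feature sizes, and the `sublayer` callable are all hypothetical.

```python
import math

def layer_norm(x, eps=1e-6):
    """Normalize a feature vector to zero mean and unit variance (no learned affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def silu(c):
    """SiLU activation applied elementwise to the conditioning vector."""
    return [v / (1.0 + math.exp(-v)) for v in c]

class AdaLNZeroBlock:
    """Hypothetical block: x + gate(c) * sublayer(LN(x) * (1 + scale(c)) + shift(c))."""

    def __init__(self, dim, cond_dim):
        self.dim = dim
        # Zero-initialized projection: shift, scale, and gate all start at 0,
        # so the block is an identity map until training moves these weights.
        self.weight = [[0.0] * cond_dim for _ in range(3 * dim)]
        self.bias = [0.0] * (3 * dim)

    def modulation(self, cond):
        """Project the conditioning vector into (shift, scale, gate)."""
        c = silu(cond)
        out = [sum(w * ci for w, ci in zip(row, c)) + b
               for row, b in zip(self.weight, self.bias)]
        d = self.dim
        return out[:d], out[d:2 * d], out[2 * d:]

    def __call__(self, x, cond, sublayer):
        shift, scale, gate = self.modulation(cond)
        h = [n * (1.0 + s) + b for n, s, b in zip(layer_norm(x), scale, shift)]
        y = sublayer(h)
        return [xi + g * yi for xi, g, yi in zip(x, gate, y)]

# Usage: a 6-dimensional contact feature (assumed size) modulating a 4-dimensional token.
block = AdaLNZeroBlock(dim=4, cond_dim=6)
token = [0.5, -1.0, 2.0, 0.0]
contact_feature = [0.1, 0.0, -3.0, 0.0, 0.2, 0.0]
output = block(token, contact_feature, lambda h: [2.0 * v for v in h])
```

Because the projection starts at zero, the gate is zero at initialization and the block returns its input unchanged. In this style of conditioning, the force features therefore only begin to influence action generation as training updates the projection.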

Experimental Results

Real-World Rollout Comparison

We evaluate FAWAM on four contact-rich manipulation tasks: erasing a whiteboard, peeling a cucumber, pivoting a box, and wiping a vase. Each task requires precise force regulation and adaptation to varying contact conditions. GE-Act is a vision-only baseline that generates actions from visual observations, while force-conditioned GE-Act extends GE-Act by adding force as an extra observation input. The videos below show representative rollout comparisons between these two baselines and FAWAM across the four tasks. All videos on this page are played at 2x speed.

[Video grid: one rollout per task (Erase Board, Peel Cucumber, Pivot Box, Wipe Vase) for each method (GE-Act, Force-Conditioned GE-Act, FAWAM).]

Perturbation Rollout Comparison

To evaluate the effectiveness of FAWAM's online residual correction, we introduce execution-time perturbations across all four tasks. The videos below compare FAWAM with and without the Force-Guided Residual Corrector. With online correction, FAWAM uses discrepancies between predicted and measured wrenches to adapt its actions and recover stable contact. Without online correction, the policy often fails to recover from these perturbations. All videos on this page are played at 2x speed.
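The correction step described above can be sketched as follows. This is an illustration under stated assumptions, not the learned module: in FAWAM the corrector is trained from human interventions, whereas `toy_corrector` here is a hand-written placeholder with a proportional residual and a hard threshold gate. The function names, the gain and threshold values, and the choice of a 6-dimensional action matching the wrench dimension are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class CorrectorOutput:
    residual: List[float]  # residual action proposed by the corrector
    gate: float            # intervention gate in [0, 1]; 0 leaves the base action untouched

def wrench_error(predicted: Sequence[float], measured: Sequence[float]) -> List[float]:
    """Discrepancy between the measured wrench and the predicted reference."""
    return [m - p for p, m in zip(predicted, measured)]

def toy_corrector(error: Sequence[float], gain: float = 0.002,
                  threshold: float = 2.0) -> CorrectorOutput:
    """Placeholder for the learned corrector (assumption, not the paper's model)."""
    norm = sum(e * e for e in error) ** 0.5
    gate = 1.0 if norm > threshold else 0.0      # intervene only on large discrepancies
    residual = [-gain * e for e in error]        # push against the wrench error
    return CorrectorOutput(residual, gate)

def corrected_action(base_action: Sequence[float],
                     predicted_wrench: Sequence[float],
                     measured_wrench: Sequence[float],
                     corrector: Callable[[Sequence[float]], CorrectorOutput] = toy_corrector
                     ) -> List[float]:
    """Apply the gated residual: a = a_base + gate * residual."""
    out = corrector(wrench_error(predicted_wrench, measured_wrench))
    return [a + out.gate * r for a, r in zip(base_action, out.residual)]

def run_chunk(base_actions, predicted_wrenches, read_wrench):
    """Step through one action chunk, correcting each step with fresh force feedback."""
    return [corrected_action(a, w_ref, read_wrench())
            for a, w_ref in zip(base_actions, predicted_wrenches)]

# Usage: the measured normal force is twice the predicted reference (values are made up).
base = [0.0] * 6
reference = [0.0, 0.0, -5.0, 0.0, 0.0, 0.0]
measured = [0.0, 0.0, -10.0, 0.0, 0.0, 0.0]
action = corrected_action(base, reference, measured)
```

When the measured wrench matches the reference, the gate stays closed and the base action passes through unchanged. The residual is applied only when the discrepancy is large, which mirrors the role of the intervention gate described earlier.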

[Video grid: one perturbed rollout per task (Erase Board, Peel Cucumber, Pivot Box, Wipe Vase), without correction and with correction.]

Quantitative Results

Main real-world evaluation results. Avg. SR denotes average success rate.

Citation

@misc{he2026fawam,
  title         = {FAWAM: Force-Aware World Action Models for Closed-Loop Contact-Rich Manipulation},
  author        = {Haotian He and Zeyu Yan and Qipeng Liu and Ning Guo and Wenzhao Lian},
  year          = {2026},
  eprint        = {2606.08555},
  archivePrefix = {arXiv},
  primaryClass  = {cs.RO},
  url           = {https://arxiv.org/abs/2606.08555}
}