Js.Y
Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation