Paper Reading / Vision and Language Navigation (VLN) — post list, page 2

Note (repeated at the top of every post): these summaries were not written from a deep reading of each paper, so please treat them as reference only. They were made with shallow knowledge, covering only each model's core ideas, so a lot is missing. If you do use them, please leave a comment.

PAPER Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks — Vision-Language Navigation (VLN) is a task where agents learn to navigate following natural language instructions. The key to this task is to perceive both the visual scene and natural language sequentially. Conventional approaches exploit the vision and… (arxiv.org)
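
The core recipe here is to attach self-supervised auxiliary heads to the navigator's hidden state and optimize their losses jointly with the main action loss. A minimal sketch of such a joint objective; the head names, tasks, and weights below are illustrative assumptions, not the paper's exact task set:

```python
import torch.nn as nn
import torch.nn.functional as F

class AuxHeads(nn.Module):
    """Self-supervised auxiliary heads sharing the navigator's hidden state.
    Head names and loss weights are assumptions for illustration."""
    def __init__(self, hidden=512, vocab=1000):
        super().__init__()
        self.progress = nn.Linear(hidden, 1)    # how far along the path am I?
        self.angle = nn.Linear(hidden, 1)       # what heading change comes next?
        self.retell = nn.Linear(hidden, vocab)  # re-generate instruction tokens

def aux_loss(heads, h, prog_gt, ang_gt, tok_gt, w=(0.5, 0.5, 0.25)):
    # each auxiliary loss is added to the main action-prediction loss
    lp = F.mse_loss(heads.progress(h).squeeze(-1), prog_gt)
    la = F.mse_loss(heads.angle(h).squeeze(-1), ang_gt)
    lr = F.cross_entropy(heads.retell(h), tok_gt)
    return w[0] * lp + w[1] * la + w[2] * lr
```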
PAPER Transferable Representation Learning in Vision-and-Language Navigation — Vision-and-Language Navigation (VLN) tasks such as Room-to-Room (R2R) require machine agents to interpret natural language instructions and learn to act in visually realistic environments to achieve navigation goals. The overall task requires competence in… (arxiv.org)
PAPER Multi-modal Discriminative Model for Vision-and-Language Navigation — Vision-and-Language Navigation (VLN) is a natural language grounding task where agents have to interpret natural language instructions in the context of visual scenes in a dynamic environment to achieve prescribed navigation goals. Successful agents must… (arxiv.org)
PAPER Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention — We present Vision-based Navigation with Language-based Assistance (VNLA), a grounded vision-language task where an agent with visual perception is guided via language to find objects in photorealistic indoor environments. The task emulates a real-world… (arxiv.org)
PAPER Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout — A grand goal in AI is to build a robot that can accurately navigate based on natural language instructions, which requires the agent to perceive the scene, understand and ground language, and act in the real-world environment. One key challenge here is to… (arxiv.org)
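
The paper's augmentation idea pairs a "speaker" that generates instructions for sampled routes with a follower trained on those synthetic pairs, where routes are sampled in feature-dropout-perturbed copies of seen environments. A schematic of that loop; all object names and helper methods here are hypothetical placeholders, not the authors' API:

```python
import random

def environmental_dropout(env_features, p=0.4, rng=random):
    # Drop whole feature views of the environment to mimic "new" environments;
    # marking a sampled subset of views as missing is one simple stand-in.
    return {view: (feat if rng.random() > p else None)
            for view, feat in env_features.items()}

def back_translate(speaker, follower, envs, num_rounds=3, paths_per_env=10):
    """Hypothetical back-translation loop: the speaker labels sampled paths,
    and the follower trains on the resulting (instruction, path) pairs."""
    for _ in range(num_rounds):
        synthetic = []
        for env in envs:
            aug_env = environmental_dropout(env.features)   # assumed attribute
            for _ in range(paths_per_env):
                path = env.sample_shortest_path()           # assumed helper
                instr = speaker.generate(aug_env, path)     # back-translated instruction
                synthetic.append((instr, path, aug_env))
        follower.train_on(synthetic)                        # assumed helper
    return follower
```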
PAPER Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation — We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language Navigation challenge of Anderson et al. (2018). Given a… (arxiv.org)
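
FAST's key move is to treat action decoding as best-first search over a frontier of partial trajectories, so the agent can jump back to an earlier, more promising node instead of committing greedily. A minimal best-first sketch in that spirit; the scoring function is a placeholder, not the paper's exact fusion score:

```python
import heapq

def fast_decode(start, expand, score, is_goal, max_steps=40):
    """Best-first search over partial trajectories with implicit backtracking.

    expand(node) -> iterable of child nodes (candidate next actions)
    score(node)  -> higher is more promising (placeholder for FAST's fusion score)
    """
    frontier = [(-score(start), 0, start)]  # max-heap via negated score
    tiebreak = 1                            # keeps heap comparisons away from node objects
    visited = 0
    while frontier and visited < max_steps:
        # the popped node may be an ancestor of the last one: that is the backtracking
        _, _, node = heapq.heappop(frontier)
        visited += 1
        if is_goal(node):
            return node
        for child in expand(node):
            heapq.heappush(frontier, (-score(child), tiebreak, child))
            tiebreak += 1
    # budget exhausted: fall back to the most promising node still on the frontier
    return max((n for _, _, n in frontier), key=score, default=node)
```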
PAPER The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation — As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making. Specifically, the Vision and Language Navigation (VLN) task involves navigating to a goal purely… (arxiv.org)
PAPER Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation — Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments. In this paper, we study how to address three critical challenges for this task: the cross-modal grounding,… (arxiv.org)
PAPER Neural Modular Control for Embodied Question Answering — We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized… (arxiv.org)
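
The hierarchy runs a master policy that emits subgoals and hands control to a specialized sub-policy until it signals termination. A toy sketch of that control flow, with invented subgoal names and an assumed environment API:

```python
from typing import Callable, Dict, List, Tuple

# A sub-policy maps an observation to (action, done); names are illustrative.
SubPolicy = Callable[[dict], Tuple[str, bool]]

def run_hierarchical(master: Callable[[dict], str],
                     subpolicies: Dict[str, SubPolicy],
                     env, max_subgoals: int = 10) -> List[str]:
    """Master proposes subgoals at a coarse timescale; the chosen sub-policy
    acts at a fine timescale until it reports done."""
    obs, trace = env.reset(), []            # env.reset/env.step are assumed APIs
    for _ in range(max_subgoals):
        subgoal = master(obs)               # e.g. "exit_room", "find_object"
        trace.append(subgoal)
        if subgoal == "stop":
            break
        policy = subpolicies[subgoal]
        done = False
        while not done:
            action, done = policy(obs)
            obs = env.step(action)
    return trace
```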
PAPER Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation — Existing research studies on vision and language grounding for robot navigation focus on improving model-free deep reinforcement learning (DRL) models in synthetic environments. However, model-free DRL models do not consider the dynamics in the real-world… (arxiv.org)
PAPER Self-Monitoring Navigation Agent via Auxiliary Progress Estimation — The Vision-and-Language Navigation (VLN) task entails an agent following navigational instruction in photo-realistic unknown environments. This challenging task demands that the agent be aware of which instruction was completed, which instruction is needed… (arxiv.org)
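
The distinctive piece here is the progress monitor: an auxiliary head regressing how far along the path the agent is. A minimal sketch of one plausible self-supervised label and its loss; the normalization by initial goal distance is an assumption, not necessarily the paper's exact target:

```python
import torch
import torch.nn.functional as F

def progress_label(dist_to_goal_t: torch.Tensor, dist_to_goal_0: torch.Tensor):
    """Target for the progress monitor: fraction of the initial distance
    to the goal already covered, clipped to [0, 1] (assumed normalization)."""
    return (1.0 - dist_to_goal_t / dist_to_goal_0).clamp(0.0, 1.0)

def monitor_loss(pred_progress, d_t, d_0):
    # regression loss added alongside the action-prediction objective
    return F.mse_loss(pred_progress, progress_label(d_t, d_0))
```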
PAPER Unified Pragmatic Models for Generating and Following Instructions — We show that explicit pragmatic inference aids in correctly generating and following natural language instructions for complex, sequential tasks. Our pragmatics-enabled models reason about why speakers produce certain instructions, and about how listeners… (arxiv.org)
PAPER Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation — We propose to take a novel approach to robot system design where each building block of a larger system is represented as a differentiable program, i.e. a deep neural network. This representation allows for integrating algorithmic planning and deep learning… (arxiv.org)
PAPER QMDP-Net: Deep Learning for Planning under Partial Observability — This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy… (arxiv.org)
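
QMDP-net embeds the classic QMDP approximation as differentiable layers: a filter maintains a belief over states, value iteration produces Q(s, a) on a learned model, and actions are scored by belief-weighted Q-values. The underlying QMDP computation it unrolls looks roughly like this (plain numpy on a known model, not the learned/differentiable version):

```python
import numpy as np

def qmdp_action(b, T, R, gamma=0.95, iters=50):
    """QMDP action selection.
    b: belief over S states, shape (S,)
    T: transitions, shape (A, S, S) with T[a, s, s'] = P(s' | s, a)
    R: rewards, shape (S, A)
    """
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        # value iteration on the fully observed MDP:
        # Q[s, a] = R[s, a] + gamma * sum_{s'} T[a, s, s'] * V[s']
        Q = R + gamma * np.einsum("asp,p->sa", T, V)
        V = Q.max(axis=1)
    # QMDP approximation: assume full observability after one step,
    # so score each action by the belief-weighted Q-value
    return int(np.argmax(b @ Q))
```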