Paper Reading/Vision and Language Navigation(VLN)

PAPER Chasing Ghosts: Instruction Following as Bayesian State Tracking A visually-grounded navigation instruction can be interpreted as a sequence of expected observations and actions an agent following the correct trajectory would encounter and perform. Based on this intuition, we formulate the problem of finding the goal lo.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core..
PAPER Improving Vision-and-Language Navigation with Image-Text Pairs from the Web Following a navigation instruction such as 'Walk down the stairs and stop at the brown sofa' requires embodied AI agents to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs'.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core..
PAPER BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps Learning to follow instructions is of fundamental importance to autonomous agents for vision-and-language navigation (VLN). In this paper, we study how an agent can navigate long paths when learning from a corpus that consists of shorter ones. We show that.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's..
PAPER Vision-Dialog Navigation by Exploring Cross-modal Memory Vision-dialog navigation posed as a new holy-grail task in vision-language disciplinary targets at learning an agent endowed with the capability of constant conversation for help with natural language and navigating according to human responses. Besides th.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core, so much..
PAPER VALAN: Vision and Language Agent Navigation VALAN is a lightweight and scalable software framework for deep reinforcement learning based on the SEED RL architecture. The framework facilitates the development and evaluation of embodied agents for solving grounded language understanding tasks, such as.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core, so much is left out. If..
PAPER Cross-Lingual Vision-Language Navigation Vision-Language Navigation (VLN) is the task where an agent is commanded to navigate in photo-realistic environments with natural language instructions. Previous research on VLN is primarily conducted on the Room-to-Room (R2R) dataset with only English ins.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core, so much is left out. If you use..
PAPER Environment-agnostic Multitask Learning for Natural Language Grounded Navigation Recent research efforts enable study for natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog. However, existing methods tend to overfit training data in seen environments and fail to.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's..
PAPER Multi-View Learning for Vision-and-Language Navigation Learning to navigate in a visual environment following natural language instructions is a challenging task because natural language instructions are highly variable, ambiguous, and under-specified. In this paper, we present a novel training paradigm, Learn.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core, so much..
PAPER Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning Mobile agents that can leverage help from humans can potentially accomplish more complex tasks than they could entirely on their own. We develop "Help, Anna!" (HANNA), an interactive photo-realistic simulator in which an agent fulfills object-finding tasks.. (arxiv.org) ..
PAPER Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training Learning to navigate in a visual environment following natural-language instructions is a challenging task, because the multimodal inputs to the agent are highly variable, and the training data on a new task is often limited. In this paper, we present the.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding..
PAPER A Behavioral Approach to Visual Navigation with Graph Localization Networks Inspired by research in psychology, we introduce a behavioral approach for visual navigation using topological maps. Our goal is to enable a robot to navigate from one location to another, relying only on its visual input and the topological map of the env.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core..
PAPER Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation Vision-and-Language Navigation (VLN) is a challenging task in which an agent needs to follow a language-specified path to reach a target destination. In this paper, we strive for the creation of an agent able to tackle three key issues: multi-modality, lon.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference..
PAPER Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation Advances in learning and representations have reinvigorated work that connects language to other modalities. A particularly exciting direction is Vision-and-Language Navigation (VLN), in which agents interpret natural language instructions and visual scenes.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core..
PAPER General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping In instruction conditioned navigation, agents interpret natural language and their surroundings to navigate through an environment. Datasets for studying this task typically contain pairs of these instructions and reference trajectories. Yet, most evaluati.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding..
PAPER Robust Navigation with Language Pretraining and Stochastic Sampling Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments. In this paper, we report two simple but hig.. (arxiv.org) These notes are not based on a close reading of the paper, so please treat them as reference only. They reflect a shallow understanding focused on the model's core..
Js.Y
List of posts in the 'Paper Reading/Vision and Language Navigation(VLN)' category