Course Contents
This advanced seminar introduces fundamental algorithms for [b]Robotic Embodied AI Systems (REAIS)[/b] that can autonomously perceive, navigate, and manipulate objects in unstructured environments such as homes, restaurants, and supermarkets.

It addresses the complex and timely challenge of understanding and developing intelligent robotic agents that can interact with and change their world. The seminar will discuss fundamental problems in embodied AI and robotics, connecting [b]Multimodal Perception to Action[/b].

The seminar will combine an introductory lecture and a reading group to discuss and learn about advanced algorithmic approaches in robotics and embodied AI.

This semester, the theme of the seminar is "[b]Interactive Robot Perception and Learning[/b]".

A tentative list of papers includes:

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations  https://arxiv.org/abs/2104.01542
Learning Agent-Aware Affordances for Closed-Loop Interaction with Articulated Objects https://arxiv.org/abs/2209.05802
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models  https://arxiv.org/abs/2207.11514
The (Un)Surprising Effectiveness of Pre-Trained Vision Models for Control  https://arxiv.org/abs/2203.03580
R3M: A Universal Visual Representation for Robot Manipulation  https://arxiv.org/abs/2203.12601
Real-World Robot Learning with Masked Visual Pre-training  https://arxiv.org/abs/2210.03109
Offline Visual Representation Learning for Embodied Navigation  https://arxiv.org/abs/2204.13226
The Surprising Effectiveness of Representation Learning for Visual Imitation  https://arxiv.org/abs/2112.01511
VideoDex: Learning Dexterity from Internet Videos https://arxiv.org/abs/2212.04498
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances  https://arxiv.org/abs/2204.01691
CLIPort: What and Where Pathways for Robotic Manipulation  https://arxiv.org/abs/2109.12098
VIMA: General Robot Manipulation with Multimodal Prompts  https://arxiv.org/abs/2210.03094
GATO: A Generalist Agent https://arxiv.org/abs/2205.06175
PACT: Perception-Action Causal Transformer for Autoregressive Robotics Pre-Training  https://arxiv.org/abs/2209.11133
Learning Universal Policies via Text-Guided Video Generation  https://arxiv.org/abs/2302.00111

Literature
We recommend watching the online course on Modern Robotics: [url]https://youtube.com/playlist?list=PLggLP4f-rq02vX0OQQ5vrCxbJrzamYDfx[/url]

Prerequisites
Recommended:
Students should have fundamental knowledge of robotics and linear algebra. In addition, having taken Fundamentals of Robotics, Robot Learning, and/or Computer Vision I is recommended.

Online Offerings
Moodle.

Semester: SoSe 2024