Today I read a paper titled “Watch-n-Patch: Unsupervised Learning of Actions and Relations”
The abstract is:
There is a large variation in the activities that humans perform in their everyday lives.
We consider modeling these composite human activities, which comprise multiple basic-level actions, in a completely unsupervised setting.
Our model learns high-level co-occurrence and temporal relations between the actions.
We consider the video as a sequence of short-term action clips, each containing human-words and object-words.
An activity is represented by a set of action-topics and object-topics, indicating which actions are present and which objects are being interacted with.
We then propose a new probabilistic model relating the words and the topics.
It allows us to model the long-range action relations that commonly exist in composite activities, which was challenging for previous works.
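The "long-range" part can be made concrete with a toy sketch. The paper's actual model is probabilistic, but the core idea of relating every pair of action-topics in an activity, not just adjacent ones, can be illustrated with simple counting (the topic labels and video data below are purely illustrative, not from the paper):

```python
from collections import Counter
from itertools import combinations

# Toy activities, each already reduced to a sequence of action-topics
# (illustrative labels, not the paper's dataset):
videos = [
    ["fetch-cup", "pour-water", "drink"],
    ["fetch-cup", "drink"],
    ["fetch-cup", "pour-water", "microwave", "drink"],
]

def cooccurrence_counts(videos):
    """Count how often two action-topics appear in the same activity."""
    co = Counter()
    for seq in videos:
        for a, b in combinations(sorted(set(seq)), 2):
            co[(a, b)] += 1
    return co

def transition_counts(videos):
    """Count pairwise temporal orderings, including long-range ones:
    every pair (a before b) in a sequence, not just adjacent clips."""
    trans = Counter()
    for seq in videos:
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                trans[(seq[i], seq[j])] += 1
    return trans

co = cooccurrence_counts(videos)
trans = transition_counts(videos)
```

Because `transition_counts` looks at all ordered pairs rather than only neighbouring clips, a relation like "fetch-cup eventually precedes drink" is captured even when other actions occur in between, which is the kind of dependency a plain Markov chain over adjacent clips would miss.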
We apply our model to unsupervised action segmentation and clustering, and to a novel application that detects forgotten actions, which we call action patching.
For evaluation, we contribute a new challenging RGB-D activity video dataset recorded with the new Kinect v2, containing several daily human activities composed of multiple actions that interact with different objects.
Moreover, we develop a robotic system that watches people and, by applying our action patching algorithm, reminds them of forgotten actions.
Our robotic setup can be easily deployed on any assistive robot.