This dataset was collected using Tobii eye-tracking glasses. It consists of 17 sequences performed by 14 different subjects.
To record the sequences, we stocked a table with various kinds of food, dishes and snacks.
We asked each subject to wear the Tobii glasses, and we calibrated the gaze.
We then asked the subject to take a seat and make whatever food they felt like having.
The beginning and ending times of the actions are annotated.
Each action consists of a verb and a set of nouns. For example, "pouring milk into cup" combines the verb "pour" with the nouns "milk" and "cup".
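As an illustration of this verb-plus-nouns structure, one annotated action could be held in a small record like the one below. This is only a sketch: the class name `ActionAnnotation`, its field names, and the frame numbers in the example are hypothetical and do not reflect the dataset's actual file format. It also assumes that the start and end frames are both inclusive.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ActionAnnotation:
    """Hypothetical container for one annotated action (not the dataset's own format)."""
    verb: str                 # e.g. "pour"
    nouns: Tuple[str, ...]    # e.g. ("milk", "cup")
    start_frame: int          # first frame of the action
    end_frame: int            # last frame of the action (assumed inclusive)

    def label(self) -> str:
        # Join the verb and its nouns into a single readable label.
        return " ".join((self.verb, *self.nouns))

    def num_frames(self) -> int:
        # Length of the action in frames, under the inclusive-end assumption.
        return self.end_frame - self.start_frame + 1


# Made-up frame numbers, for illustration only.
example = ActionAnnotation(verb="pour", nouns=("milk", "cup"),
                           start_frame=120, end_frame=195)
```

Keeping the nouns as a tuple makes it easy to group actions by verb or by object later on.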
In our experiments, we extract images from the videos at 15 frames per second. Action annotations are based on frame numbers.
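Because the annotations are expressed in frame numbers at 15 frames per second, converting between frames and seconds is a simple division. The sketch below shows one way to do it. The function names are hypothetical, and the choice of 1 as the index of the first frame is an assumption; pass `first_frame=0` if the extracted images are numbered from zero.

```python
FPS = 15  # frame extraction rate stated in the text


def frame_to_seconds(frame: int, first_frame: int = 1) -> float:
    """Time in seconds at which `frame` starts.

    `first_frame` is the index of the first extracted image
    (assumed to be 1 here; this is not confirmed by the dataset).
    """
    return (frame - first_frame) / FPS


def seconds_to_frame(seconds: float, first_frame: int = 1) -> int:
    """Index of the frame that covers the given non-negative time in seconds."""
    return int(seconds * FPS) + first_frame


def duration_seconds(start_frame: int, end_frame: int) -> float:
    """Duration of an action whose start and end frames are both inclusive."""
    return (end_frame - start_frame + 1) / FPS
```

For example, under these assumptions an action spanning 30 frames lasts 2 seconds.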
The following sequences are used for training: 1, 6, 7, 8, 10, 12, 13, 14, 16, 17, 18, 21, 22.
The following sequences are used for testing: 2, 3, 5, 20.
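For convenience, the split above can be written down as two lists. The sequence numbers are copied from the text; the variable and function names are hypothetical.

```python
# Sequence numbers taken from the split described in the text.
TRAIN_SEQUENCES = [1, 6, 7, 8, 10, 12, 13, 14, 16, 17, 18, 21, 22]
TEST_SEQUENCES = [2, 3, 5, 20]


def split_of(sequence_id: int) -> str:
    """Return "train" or "test" for a sequence number, or raise if it is in neither list."""
    if sequence_id in TRAIN_SEQUENCES:
        return "train"
    if sequence_id in TEST_SEQUENCES:
        return "test"
    raise ValueError(f"sequence {sequence_id} is not part of the split")
```

The two lists are disjoint and together cover all 17 sequences.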
Download this dataset here.