
Geo's Architecture

 

The architecture is divided into three main sections: Sensors, Actions, and Learning.

Sensors:

The sensors component of the architecture takes the user's input from the tactile sensors and processes it accordingly. It takes one of three actions in response to tactile feedback: guidance, reward, or reset.

Guidance occurs when the user touches one of Geo's appendages. The sensors component feeds that information into the action component so that Geo can respond to the request for guidance and perform an action related to the guidance given.

The reward action occurs when Geo detects that the user has either patted him on the back or swatted him on the butt, which results in positive or negative feedback, respectively. The reward is then sent to the learning component, which uses the action performed and the reward given in its learning algorithm. If the user gives multiple positive or multiple negative rewards, only one reward is recorded.

The reset action occurs when the user touches the reset button, which sends Geo back to the initial state, the first step of the dance.
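The three sensor actions above could be sketched roughly as follows. This is only an illustration, not Geo's actual code: the `Touch` names, the +1/-1 reward values, and the idea of processing touches in batches are all assumptions made for the example. How Geo handles a pat and a swat arriving together is not stated in the text; here each sign is simply recorded once.

```python
from enum import Enum


class Touch(Enum):
    # Hypothetical names for the tactile inputs described in the text.
    APPENDAGE = "appendage"
    BACK_PAT = "back_pat"
    BUTT_SWAT = "butt_swat"
    RESET_BUTTON = "reset_button"


# Assumed reward magnitudes: a pat is positive, a swat is negative.
REWARD_VALUES = {Touch.BACK_PAT: 1, Touch.BUTT_SWAT: -1}


def process_touches(touches):
    """Turn a batch of (touch, detail) pairs into (kind, value) sensor actions."""
    events = []
    seen_rewards = set()
    for touch, detail in touches:
        if touch is Touch.RESET_BUTTON:
            # Reset sends Geo back to the first step of the dance.
            events.append(("reset", None))
        elif touch in REWARD_VALUES:
            reward = REWARD_VALUES[touch]
            # Repeated rewards of the same sign are recorded only once.
            if reward not in seen_rewards:
                seen_rewards.add(reward)
                events.append(("reward", reward))
        elif touch is Touch.APPENDAGE:
            # Guidance carries which appendage was touched (assumed payload).
            events.append(("guidance", detail))
    return events


batch = [
    (Touch.BACK_PAT, None),
    (Touch.BACK_PAT, None),
    (Touch.APPENDAGE, "left_arm"),
]
events = process_touches(batch)
```

In this sketch the two pats collapse into a single positive reward, and the appendage touch is passed along as a guidance request.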

Actions:

The action component is composed of the behavior engine, which selects the next action to perform, and the action executor, which actually performs the selected action. The behavior engine gets the current action table for the given state and uses a simple epsilon-greedy algorithm to select the action to perform. Epsilon is determined by the confidence level of the state: high confidence gives a low epsilon, and low confidence gives a high epsilon. The behavior engine also transitions to the next step of the dance (the next state) once it has performed a number of actions without user feedback. The number of actions required before the state transition depends on the confidence level: a high-confidence state transitions quickly, and a low-confidence state does not.
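A minimal sketch of the behavior engine's selection step, under stated assumptions: the text says only that epsilon falls as confidence rises and that confident states transition sooner, so the linear mappings below (`epsilon = 1 - confidence`, and a transition threshold between hypothetical `min_actions` and `max_actions` bounds) are illustrative choices, not Geo's actual formulas. The action table is assumed to be a dictionary from action name to estimated value.

```python
import random


def clamp01(value):
    """Keep a confidence value inside [0, 1]."""
    return max(0.0, min(1.0, value))


def epsilon_for(confidence):
    # Assumed mapping: high confidence gives low epsilon and vice versa.
    return 1.0 - clamp01(confidence)


def select_action(action_table, confidence, rng=random):
    """Epsilon-greedy choice over a dict of action -> estimated value."""
    if rng.random() < epsilon_for(confidence):
        # Explore: pick any action uniformly at random.
        return rng.choice(list(action_table))
    # Exploit: pick the action with the highest estimated value.
    return max(action_table, key=action_table.get)


def actions_before_transition(confidence, min_actions=1, max_actions=5):
    # Assumed linear rule: a confident state moves on after few
    # unrewarded actions, an unconfident state waits longer.
    c = clamp01(confidence)
    return round(max_actions - c * (max_actions - min_actions))


table = {"wave": 0.2, "spin": 0.9}
choice = select_action(table, confidence=1.0)
```

With full confidence, epsilon is 0 under this mapping, so the engine always exploits and picks the highest-valued action; with zero confidence it always explores.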

Learning:

The learning component, described in further detail in the Learning Process section, is where the tables and rewards are updated and the best action for the current state is mapped.
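As a rough illustration of what updating a table and mapping the best action might look like, the sketch below uses a generic incremental update. This is an assumption made for the example, not the rule Geo uses; the actual algorithm is the one described in the Learning Process section. The table layout (state to action to value) and the `step_size` parameter are hypothetical.

```python
def update_table(table, state, action, reward, step_size=0.5):
    """Nudge the stored value for (state, action) toward the reward."""
    actions = table.setdefault(state, {})
    old_value = actions.get(action, 0.0)
    # Generic incremental update (assumed, not Geo's documented rule).
    actions[action] = old_value + step_size * (reward - old_value)
    return actions[action]


def best_action(table, state):
    """Return the highest-valued action recorded for a state, or None."""
    actions = table.get(state)
    if not actions:
        return None
    return max(actions, key=actions.get)


table = {}
update_table(table, state=0, action="spin", reward=1)
update_table(table, state=0, action="wave", reward=-1)
best = best_action(table, state=0)
```

Here a rewarded action rises in value and a punished one falls, so the rewarded action becomes the mapped best action for that step of the dance.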