
Wizard Interface

The wizard interface was created in Unity3D, and C# is used for most of the scripting in the project. Although Unity allows for cross-platform development and deployment, some of the modules in this project (Speech Recognition, GazeSense, and Tobii) only work on Windows 10.

GameObjects

GameObjects are the base class for all objects in Unity3D, and we use them to organize and divide our project.

Speech Recognition

The Speech Recognition game object starts the SpeechRecognition.cs script. This class initializes Unity's DictationRecognizer object, which gives us access to Windows 10 Online Speech Recognition with only a few lines of code inside Unity. Don't forget to enable Online Speech Recognition in Windows (in the target language) to get this to work. Both intermediate and final speech recognition results update the PlayerUI (described below) with the text being said. Intermediate results (hypotheses) are presented with a trailing "…" at the end of the detected utterance; once the final result is known, that punctuation is removed. The result stays in the interface for a few more seconds, or until a new recognition starts after the end of the utterance is detected. Additionally, a child game object contains a Unity plugin that plays the text of speech recognition results back to the wizard at 2× speed.
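The hypothesis/result display logic described above can be sketched as follows (in Python rather than the project's C#; the function name is hypothetical, not part of SpeechRecognition.cs):

```python
def format_recognition(text: str, is_final: bool) -> str:
    """Mimic the PlayerUI update rule: intermediate hypotheses are shown
    with a trailing ellipsis; final results are shown as-is."""
    return text if is_final else text + "…"

hypothesis = format_recognition("place the red piece", is_final=False)
final = format_recognition("place the red piece", is_final=True)
```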

Logging

The Logging game object starts the dbg.cs script. Dbg is a simple singleton class that can be accessed from any other class to log any given content to a 'Logs' folder. If the 'Logs' folder does not exist, it is created automatically.
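A minimal sketch of this pattern, in Python rather than C# (method and file names are assumptions; consult dbg.cs for the real API):

```python
import os
from datetime import datetime

class Dbg:
    """Singleton logger in the spirit of dbg.cs: any caller appends a line
    to a file inside a 'Logs' folder, which is created on first use."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            os.makedirs("Logs", exist_ok=True)  # create the folder if missing
        return cls._instance

    def log(self, content: str, filename: str = "session.log") -> None:
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(os.path.join("Logs", filename), "a", encoding="utf-8") as f:
            f.write(f"{stamp}\t{content}\n")

Dbg().log("speech: final result received")
```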

ARToolkit

The ARToolkit game object starts the ARController script and several ARMarker scripts developed on the ARToolkit platform; click here for more details.

Robot

The Robot game object starts four different scripts:

  • A gazeDrawing.cs script used to draw two lines from the robot's eyes to the point in the interaction that the robot is looking at.
  • A Behaviors.cs script that reads a data file and can generate randomized behaviors (optimized to reduce repetition). To edit the robot's behavior for each dialog act, edit utterances.tsv, a tab-separated file that can be easily authored to change the robot's behaviors. Our dialog act implementation allows a mix of verbal and non-verbal content in a format that is compact to author. A behavior starts by vocalizing and lip-syncing the text, but also supports additional commands that can be synchronized in the middle of the content, such as gaze changes, gestures, pauses, or pitch and volume changes. A simple notation for variable replacement also allows the authored content to include dynamic interaction information, such as the user's name or the color of the currently referred-to piece. Each content entry can also be associated with a facial blendshape (Mood) that changes the robot's underlying facial expression for the entire duration of the behavior. Finally, behaviors are also associated with a gaze target that affects the autonomous gaze behavior. Please consult the utterances file to learn more about its structure.
  • An AutonomousGazeBehavior.cs script used to generate autonomous, responsive gaze behavior for the robot; click here for more details.
  • A Furhat.cs .NET library that allows us to communicate with the Furhat robot inside Unity3D; click here for more details. This .NET DLL communicates with IRISTK, extends the Furhat robot's functionality, and facilitates communication with Furhat from a C# environment. To use the library, set the robot to passive mode and provide the correct IP address in the Inspector. In the Inspector you can also set the distance between Furhat and each square on the board, so that Furhat can look at the right point in space.
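The variable-replacement and inline-command authoring described for Behaviors.cs can be illustrated with a small Python sketch. Both the `$name` substitution syntax and the `|command|` delimiter here are assumptions for illustration only; the real notation is defined by utterances.tsv and Behaviors.cs:

```python
import re

def render_utterance(template: str, variables: dict) -> str:
    """Replace $name placeholders with dynamic interaction information
    (e.g., the user's name or the referred-to piece color)."""
    return re.sub(r"\$(\w+)",
                  lambda m: variables.get(m.group(1), m.group(0)),
                  template)

def split_commands(content: str):
    """Separate spoken text from inline commands such as |gaze(board)|,
    which Behaviors.cs would synchronize mid-utterance."""
    parts = re.split(r"(\|[^|]+\|)", content)
    return [("command", p.strip("|")) if p.startswith("|") else ("speech", p)
            for p in parts if p]

line = "Hello $username! |gaze(board)| Try the $color piece."
spoken = render_utterance(line, {"username": "Alice", "color": "red"})
```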

Main Camera

This game object contains the Unity3D camera used to display the user interface on a secondary display; the main display is reserved for the ARToolkit camera. As such, running this project requires two monitors (with a recommended resolution of 1080p).

Canvas

This game object defines the area where all the interface elements are drawn. To keep things organized, we created a child game object for each part of the interface.

Multimodal Perception

This game object draws in the interface all the multimodal information that is captured by the system and can be perceived by the wizard. This information consists of the player's and robot's actions and the MagPuzzle pieces. We divided it further into a RobotUI game object, a PlayerUI game object, and a Board game object.

  • Associated with the RobotUI and PlayerUI game objects are two interface elements controlled by external scripts: line renderers that represent gaze direction, and a chatbox that shows the text of what the user or the robot is currently saying.

  • The Board game object contains several images that represent the physical pieces, hints, and quadrants. These images can be activated or deactivated by the other scripts in the application.

  • The DialogInfo.cs script automatically generates wizard button game objects drawn around a central point. The position of each button, and the behaviors each button triggers, can be easily edited in the Unity Inspector. We use this script on every puzzle piece, hint, and quadrant so that the wizard can quickly react to events in MagPuzzle and trigger robot behaviors that target those elements (changing the robot's gaze and initiating a vocalization). This script is also associated with the PlayerUI and RobotUI, in which case the robot can talk about itself (e.g., to clarify its role, re-introduce itself, or simulate thinking) or directly to the user (e.g., to remind the user of the object, provide reassurance, or probe for engagement/help). In both cases, the robot establishes gaze with the user.
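The "buttons around a central point" layout that DialogInfo.cs generates boils down to evenly spacing positions on a circle. A Python sketch of that math (parameter names are assumptions; the real script exposes its settings through the Unity Inspector):

```python
import math

def radial_positions(center, radius, count, start_deg=90.0):
    """Place `count` buttons evenly on a circle of `radius` around
    `center` (a 2D point), starting at `start_deg` degrees."""
    cx, cy = center
    positions = []
    for i in range(count):
        angle = math.radians(start_deg + i * 360.0 / count)
        positions.append((cx + radius * math.cos(angle),
                          cy + radius * math.sin(angle)))
    return positions

# Four buttons around a puzzle piece at the origin, 50 units away.
buttons = radial_positions((0.0, 0.0), 50.0, 4)
```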

Gaze

This game object contains two main scripts. TcpClientGaze.cs receives data from an external gaze tracker and passes that information to the autonomous gaze behavior described in the Robot section. Gaze information from the user is also drawn in the interface: we use the gazeDrawing.cs script to draw two lines from the user's eyes to the focused point in the interaction. When the system detects that the user is gazing at a point on our board or at the robot, the wizard becomes aware of where the user is looking. If no gaze is detected, the lines are replaced by two simple dots that represent the eyes. In this repository, we also provide the Python code needed to communicate between GazeSense and our TcpClientGaze. That code can be started before or after launching the interface, as we constantly try to establish a connection with it.
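The bridge side of this pipeline might look like the following Python sketch. The line-based `target;x;y` wire format is an assumption for illustration; the real protocol is defined by TcpClientGaze.cs and the provided GazeSense script:

```python
import socket

def parse_gaze_message(raw: str):
    """Parse one line of the assumed 'target;x;y' gaze message format
    into a target label and 2D coordinates."""
    target, x, y = raw.strip().split(";")
    return {"target": target, "x": float(x), "y": float(y)}

def gaze_samples(host: str, port: int):
    """Connect to the tracker bridge and yield parsed gaze samples,
    one per received line."""
    with socket.create_connection((host, port)) as conn:
        with conn.makefile("r") as stream:
            for line in stream:
                yield parse_gaze_message(line)
```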

Wizard Management

This game object contains two main scripts:

  • The WizardManager.cs script is responsible for creating the data structures that recognize and hold the MagPuzzle board state. It gets its data from the ARToolkit and processes it to keep the state updated. It is also responsible for initiating behaviors, by issuing the appropriate behavior command to the Furhat robot and by communicating with the autonomous gaze behavior to change the gaze target. Finally, this script also divides the interface into two separate screens: one containing the wizard interface, and the other the feed of the camera used by the ARToolkit.

  • The ReduceCognitiveLoad.cs script is responsible for hiding or showing dialog options in the interface, showing only the appropriate options at the right time. It contains a brute-force search algorithm that checks the board for a solution and for correct or incorrect board states. If the state is correct but a solution has not yet been reached, hints are shown for possible squares. If the state is incorrect, a symbol in the interface changes from correct to incorrect, and dialog options that reveal why the state is incorrect are drawn in the interface. This allows the wizard to always provide prompt and correct advice that guides users towards the correct solution of the task. Additionally, hardcoded rules hide interface elements once they are no longer useful.
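The brute-force idea above, checking whether a partial board can still be completed into a solution, can be sketched generically in Python. The actual MagPuzzle rules live in ReduceCognitiveLoad.cs; here a caller-supplied `valid` predicate and a toy same-color rule stand in for them:

```python
from itertools import permutations

def solvable(placed: dict, free_pieces: list, free_squares: list, valid) -> bool:
    """Brute-force check in the spirit of ReduceCognitiveLoad.cs: given a
    partial board (`placed` maps square -> piece), try every assignment of
    the remaining pieces to the remaining squares and report whether any
    completed board satisfies `valid`."""
    for perm in permutations(free_pieces, len(free_squares)):
        board = dict(placed)
        board.update(zip(free_squares, perm))
        if valid(board):
            return True
    return False

# Toy stand-in rule: a board is valid when all placed pieces share a color.
same_color = lambda board: len({p["color"] for p in board.values()}) == 1
```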

Wizard Management also contains two additional game objects that serve as placeholders:

  • Behavior buttons - Our wizard interface allows the wizard to select buttons, each associated with a dialog act, using a mouse or a touchscreen. These buttons carry the name of their dialog act in their text label. The interface is easily customizable: add a prefab object to any part of the interface for each desired button. The prefab and the corresponding buttons contain the WizardButtonPressed.cs script, which gets a reference to the WizardManager script described above and uses it to execute the selected dialog act.

  • Session Control - For session control, some interface elements are present to control the condition of the study. These elements use a Settings.cs script that can either activate (Full Joint Attention condition) or disable (Control condition) the reactive layer of the autonomous responsive gaze system.