
INTRODUCTION

MINT is a software framework for adding multi-touch support to applications. It allows multi-touch gestures to be developed in the Python scripting language and introduces new concepts for the generalization and implementation of interaction techniques. The framework encapsulates the interaction-relevant parts of an application, so gestures are not implemented within the scope of the application and therefore do not become intertwined with its run-time logic. Gestures can thus be changed and adapted without modifying the application itself, and easily reused in other applications. Application developers are usually not experts in interaction design and should not have to implement how their application is interacted with. A special focus of MINT therefore lies on the usability of the provided solutions.

Gestures can be designed and developed by dedicated interaction designers using the provided features, without involving application programmers. This enforced separation of interaction and application development lets each group focus on its own part and improves the overall design of an application. MINT reduces the effort needed to implement custom multi-touch interactions, encouraging the development of gestures that are well adapted to the requirements of an application. This is especially important in multi-touch scenarios, where intuitive control mechanisms are essential. Developing applications with MINT and its approach to multi-touch interaction can therefore lead to better applications in terms of design and usability.


WHY YOU SHOULD START USING MINT

MODULARIZATION

The generalization and abstraction of interaction techniques results in atomic gesture parts (Interaction Building Blocks) that can be composed into more complex gestures. Available frameworks lack such functionality, so developers end up working with raw input data inside the application when creating gestures, which leads to proprietary solutions and duplicated code. Having building blocks at hand that each implement a specific piece of functionality results in a clear, modular design without the risk of implementing the same functionality twice. With this approach, individual building blocks as well as whole gestures can easily be exchanged, which is great for prototyping and for testing different interactions without adapting the application.

REUSABILITY

With MINT, the definition as well as the implementation of gestures is moved out of the application and into the framework, so already developed gestures can comfortably be reused in other applications without additional effort. Furthermore, the generalization of interaction techniques makes it possible to reuse individual parts of one gesture in the definition of another. With currently available multi-touch frameworks, gestures mostly have to be implemented inside the application, so reusing them in different applications can be complicated and requires extra effort.

HARDWARE INDEPENDENCE

An important feature of a multi-touch framework is that it can be used in combination with different input devices. Vendors of commercially available multi-touch hardware often provide a proprietary, device-dependent SDK for communicating detected touch points to the application, which leads to the development of device-dependent applications. By integrating the TUIO protocol, the MINT framework can be driven by different devices without adapting the application itself. TUIO is the de-facto standard for communicating with multi-touch hardware and is based on Open Sound Control (OSC); information about detected touch points and objects is sent over the network using UDP. Furthermore, WM_TOUCH, the native Windows touch input interface, is supported as well.



WHAT MINT IS NOT

MINT concentrates on the interaction-relevant parts of an application and does not provide any ready-to-use GUI elements. Its sole purpose is to enable controlling an application through multi-touch interactions; the GUI of an application can be developed using a graphics API of choice.


GENERALIZATION OF INTERACTION TECHNIQUES

INTERACTION BUILDING BLOCKS

To obtain the abstraction necessary for a high-level definition of multi-touch gestures, interaction techniques have to be broken down into their atomic parts, reducing them to their relevant characteristics. The resulting atomic Interaction Building Blocks (IBBs) can then be assembled into compound interactions. Examples of compound interactions are drag-and-drop, drag-and-pop, push-and-pop and push-and-throw, which consist of the atomic interactions drag, drop, pop, push and throw. Once interaction techniques like drag-and-drop and push-and-pop are sufficiently abstracted, a new one like drag-and-pop can be created by simply combining the atomic parts drag and pop, without writing additional code.
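
As an illustrative sketch (the registry and helper below are hypothetical and not part of the MINT API), composing compound interactions from named atomic parts might look like this:

```python
# Hypothetical sketch: compound gestures assembled from named
# atomic Interaction Building Blocks (IBBs).
ATOMIC_IBBS = {"drag", "drop", "pop", "push", "throw"}

def compose(*parts):
    """Build a compound gesture name from atomic IBBs, validating each part."""
    for part in parts:
        if part not in ATOMIC_IBBS:
            raise ValueError(f"unknown IBB: {part}")
    return "-and-".join(parts)

drag_and_drop = compose("drag", "drop")  # "drag-and-drop"
drag_and_pop = compose("drag", "pop")    # reuses "drag" without new code
```

The point of the sketch is that "drag" is defined once and reused in both compounds; only the combination changes, not the building block itself.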

This allows for high-level gesture development, ideal for interaction prototyping and for testing different gestures for the same purpose, as well as for creating new gestures with very little extra effort. With increasing use and development of IBBs and gestures, a whole database of IBBs and interactions can gradually form, serving as a gesture construction kit. Since IBBs and gestures can be defined and implemented independently of any application, they can also be shared with other developers and users. This distribution and exchange can further drive the development of multi-touch gestures, since others can pick up ideas and test and adapt gestures for specific interactions.


GESTURE STATE MACHINES

Compound interactions consist of different atomic states and of transitions that are taken based on specified conditions. To model these complex interactions, it therefore stands to reason to use the concept of Finite State Machines (FSMs), a well-known and intuitive concept for modeling problems and processes, not only in computer science, that most developers should be familiar with. Moreover, only a subset of the functionality of FSMs is required to model multi-touch gestures, which keeps the definition of interactions intuitive. State machines are a good way of modeling the lifetime of a process, shown in a very basic example in figure 1. States mark the different situations a process can reside in; states are changed via transitions, which may be triggered by a certain event. A transition can carry a guard condition, a Boolean expression that checks whether the transition should be taken.

Using state machines as the mechanism for generalizing and defining multi-touch interactions offers features that allow different kinds of gestures to be supported simultaneously. For example, multi-step or multi-stroke gestures can easily be modeled with state machines.
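
As a minimal sketch of this idea (the state names, events and guard below are invented for illustration and are not MINT's actual definitions), a drag-and-drop gesture can be modeled as a small state machine with a guard condition on its first transition:

```python
# Minimal gesture state machine sketch. Transitions map
# (current state, event) to (guard, next state); a guard is a
# Boolean condition evaluated against the event context.
class GestureStateMachine:
    def __init__(self, transitions, initial):
        self.state = initial
        self.transitions = transitions

    def handle(self, event, **context):
        key = (self.state, event)
        if key in self.transitions:
            guard, next_state = self.transitions[key]
            if guard is None or guard(context):
                self.state = next_state

def one_finger(ctx):
    # Guard condition: the gesture requires exactly one touch point.
    return len(ctx.get("touch_points", [])) == 1

# Drag-and-drop modeled as three states: idle -> dragging -> dropped.
dnd = GestureStateMachine(
    {("idle", "touch_down"): (one_finger, "dragging"),
     ("dragging", "touch_move"): (None, "dragging"),
     ("dragging", "touch_up"): (None, "dropped")},
    initial="idle",
)
dnd.handle("touch_down", touch_points=[(10, 20)])
dnd.handle("touch_move", touch_points=[(15, 25)])
dnd.handle("touch_up")
# dnd.state is now "dropped"
```

Note how the guard mirrors the role of MINT's transition scripts: the transition out of the idle state is only taken when the condition on the touch points holds.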

figure 1: MINT gesture flowchart

DEVELOPMENT OF INTERACTIONS

The method just described for generalizing interaction techniques allows multi-touch gestures to be defined at a high level of abstraction, and the use of atomic IBBs provides great modularization and reusability. To enable the development of all interaction-relevant parts outside the application, MINT provides a programming interface, the Interaction API (documentation coming soon), which offers the domain-specific operations needed for developing multi-touch interactions. These include intersection testing, temporal and spatial operations, and the extraction of geometric features. Together with the definition of gestures as state machines, this provides the functionality necessary for creating gestures in a flexible and application-independent way.


INTERACTION SCRIPTING

To make it possible to develop interactions independently of the application, without relying on the programming language the framework itself is written in, the functionality of the Interaction API is exported as a module for the Python programming language. Using this popular scripting language has the advantage that many developers are already familiar with its syntax. Furthermore, scripting languages are known for being easy to use, and learning an existing language is supported by readily available documentation and tutorials. Since MINT can be interfaced through Python, an IBB consists of a script file containing the statements that implement the specific functionality of one atomic part of a multi-touch gesture. Listing 1 demonstrates what such a script might look like: the objects intersected by an incoming touch point are queried and corresponding events are created.

# A Python script implementing an
# IBB for the selection of objects

touchPoint = mint.getNewTouchPoint()
objects = mint.getIntersectedObjects(touchPoint)
for object in objects:
    mint.sendSelectEvent(object)
listing 1

The optional guard condition on which the transition to the next state depends is also implemented in a script file. This may be used, for example, to distinguish gestures that require a specific number of fingers to be carried out. The script implementing the transition condition then decides whether the transition should be made, as demonstrated in listing 2.

In this example, the touch points already registered with a gesture are queried, and the transition is made only if exactly two touch points are present.

# A script deciding whether a transition should
# be made based on the number of touch points involved

touchPoints = mint.getRegisteredTouchPoints()
if len(touchPoints) == 2:
    mint.signalTransition()
listing 2

Consequently, two different types of script files for implementing gestures exist, as shown in figure 2 for the drag-and-drop example from figure 1: transition scripts, which implement transition conditions, and event scripts, which implement IBBs and are called when a state is entered or left.

figure 2: MINT gesture flowchart

GESTURE DEFINITIONS

The basis for the definition of multi-touch gestures for use with MINT is a collection of Interaction Building Blocks (IBBs) obtained by generalizing interaction techniques. These building blocks form a database of script files that encapsulate atomic, reusable pieces of gesture functionality and can be assembled into state machines defining multi-touch gestures.

SCXML

Gesture state machines are defined using State Chart XML: State Machine Notation for Control Abstraction (SCXML), a standard developed by the World Wide Web Consortium (W3C) for the generic description of state machines in XML. SCXML offers all constructs needed for the definition of gesture state machines. The location of the script files used for states and transitions is specified via the <script> element of the SCXML description. Examples of gesture state machines defined in SCXML can be found in the Tutorial section.
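
As a rough sketch (the state names, events and script paths below are invented for illustration and are not taken from the MINT tutorials), a drag-and-drop state machine defined in SCXML might look like this:

```xml
<!-- Illustrative sketch of a gesture definition; state ids, event
     names and script locations are assumptions for this example. -->
<scxml xmlns="http://www.w3.org/2005/07/scxml" initial="idle">
  <state id="idle">
    <transition event="touch.down" target="dragging">
      <!-- transition script implementing the guard condition -->
      <script src="transitions/one_finger.py"/>
    </transition>
  </state>
  <state id="dragging">
    <onentry>
      <!-- event script implementing the drag IBB -->
      <script src="ibbs/drag.py"/>
    </onentry>
    <transition event="touch.up" target="dropped"/>
  </state>
  <final id="dropped"/>
</scxml>
```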

INTERACTION LIBRARY

The SCXML files containing the gesture definitions that an application needs, and that should therefore be loaded by MINT, have to be added to the Interaction Library: an XML description that pairs each required SCXML file with a unique name for the interaction. The Interaction Library is loaded and parsed by MINT to replicate the specified gesture state machines within the framework.
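
A sketch of such a description might look like the following (the element and attribute names here are assumptions based on the description above, not MINT's actual schema):

```xml
<!-- Illustrative Interaction Library sketch: each entry pairs a
     unique interaction name with its SCXML gesture definition. -->
<interactionLibrary>
  <interaction name="dragAndDrop" scxml="gestures/drag_and_drop.scxml"/>
  <interaction name="pinchZoom" scxml="gestures/pinch_zoom.scxml"/>
</interactionLibrary>
```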


DOWNLOADS

PAPERS

Ferdinand Pilz: MINT - A Framework for the Design and Development of Multimodal Interaction on Multi-touch Surfaces.
Master's thesis, Vienna University of Technology, November 2011.

