Recognising how people feel when they use an application, or during any other form of interaction with machines, is the basis for enabling human-like response behaviour and for making sure that users feel understood. Such technologies make it possible to support people in production and control processes in the performance of their activities and to increase safety.
I would like to start with the concept of emotion recognition. This is understood to mean various methods that are (intended to) allow assessing what emotional state a person is currently experiencing and what emotional stress results from it. In science, it has become established practice to measure the basic emotions, which are always present in varying proportions (a single basic emotion never occurs alone). In our case, they are:

  • Afraid (worried/frightened/annoyed)
  • Angry (aggravated/irate/incensed)
  • Disgusted (disgusted/indignant/revolted)
  • Happy (happy/pleased/glad)
  • Sad (sad/glum/aggrieved)
  • Surprised (surprised/amazed/astonished)

From the proportions of these basic emotions, a percentage indicator of mental stress can be calculated. When interpreting the information generated in this way, it is important to note that psychological stress can also be pleasant and desirable – for example, great joy or states of happiness – and must therefore be differentiated and evaluated in context.
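To make this concrete, the following sketch shows one way such a percentage indicator could be calculated from the emotion proportions. The weights are purely illustrative assumptions – no concrete formula is disclosed above – and the contextual evaluation just mentioned would still be needed on top.

```python
# Minimal sketch: deriving a rough "mental stress" percentage from the
# proportions of the six basic emotions. The weighting is purely illustrative,
# not the formula actually used in the project described here.

BASIC_EMOTIONS = ["afraid", "angry", "disgusted", "happy", "sad", "surprised"]

def stress_indicator(proportions: dict) -> float:
    """Return a stress value in percent from emotion proportions summing to ~1.0."""
    # Hypothetical assumption: negatively valenced, high-arousal emotions
    # contribute more to stress than positive ones.
    weights = {
        "afraid": 1.0,
        "angry": 1.0,
        "disgusted": 0.8,
        "sad": 0.6,
        "surprised": 0.4,
        "happy": 0.3,   # joy can also be "stressful", but is weighted lower
    }
    total = sum(proportions.get(e, 0.0) for e in BASIC_EMOTIONS) or 1.0
    score = sum(weights[e] * proportions.get(e, 0.0) for e in BASIC_EMOTIONS) / total
    return round(100 * score, 1)

# Example: a mostly happy, slightly surprised state yields a low stress value.
print(stress_indicator({"happy": 0.7, "surprised": 0.2, "afraid": 0.1}))
```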
As already mentioned, there are many methods and approaches, and artificial intelligence helps to derive these emotions. In our approach we use a conventional webcam to analyse the face (eye movement, muscle position…) with the FACS (Facial Action Coding System) method, as it is non-invasive and does not require prior personalized calibration. Other approaches are voice analysis (highly applicable in call centres) or sentiment analysis of texts.
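As an illustration of this pipeline, the sketch below uses OpenCV for webcam capture and face detection; the estimate_emotions function is only a hypothetical stand-in for a FACS-based classifier, since no specific model is prescribed here.

```python
# Sketch of the webcam pipeline described above, assuming OpenCV for frame
# capture and face detection. `estimate_emotions` is a hypothetical placeholder
# for an arbitrary FACS-based classifier, not a real API.
import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def estimate_emotions(face_pixels):
    """Placeholder for a FACS-based classifier returning emotion proportions."""
    # A real model would analyse facial action units here; we return a dummy result.
    return {"happy": 1.0}

cap = cv2.VideoCapture(0)          # conventional webcam, no calibration step
ok, frame = cap.read()
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        proportions = estimate_emotions(frame[y:y + h, x:x + w])
        print(proportions)
cap.release()
```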
In order to place these measurements in an overall (physical) context, we have decided, where feasible in the scenario, to collect some physical stress parameters in parallel (blood volume pulse, pulse variability, skin conductance) with a high-quality smart band (Empatica E4), in order to gain an impression of the physical stress, which, unlike psychological stress, is always to be considered negative.
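Purely as an illustration, the following sketch combines the three signals into a single physical-stress impression; the baselines, thresholds and weights are assumptions and not the calibration actually used with the Empatica E4.

```python
# Illustrative only: combining the three physiological signals mentioned above
# (blood volume pulse, pulse variability, skin conductance) into one rough
# physical-stress score. All weights and baselines are assumptions.
from statistics import mean, pstdev

def physical_stress(bvp: list, ibi: list, eda: list) -> float:
    """Return a rough 0..1 physical-stress score from raw signal windows."""
    hrv = pstdev(ibi) / (mean(ibi) or 1.0)      # lower pulse variability -> more stress
    eda_level = mean(eda)                        # higher skin conductance -> more stress
    bvp_amplitude = max(bvp) - min(bvp)          # vasoconstriction shrinks pulse amplitude
    # Hypothetical normalisation against assumed resting baselines.
    score = (
        0.4 * min(eda_level / 10.0, 1.0)
        + 0.4 * (1.0 - min(hrv / 0.1, 1.0))
        + 0.2 * (1.0 - min(bvp_amplitude / 100.0, 1.0))
    )
    return round(score, 2)
```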
In the simplest form of application, such a measurement can serve as direct support in software testing (for example within the scope of user tests). Real-time insights can be gained here that would not be accessible otherwise. No matter what kind of testing or test design is used, the human psyche and perception of the test subjects ultimately always stand in the way. If one has to fulfil a task (within a given scenario) and finally manages it successfully, one tends to evaluate the necessary steps (and IT systems) more favourably and leniently. Short-term memory is influenced in the same way, and the unpleasant phases tend to fade out. Following a success, we quickly forget that there was perhaps a point in the test where we were unhappy, and we will not mention it in subsequent surveys. If the task at hand fails, the same behaviour naturally occurs – only with reversed characteristics.
Until now, there has only been the laborious option of observing (or filming) subjects during the test and drawing subjective conclusions from their behaviour. Unfortunately, for untrained observers, many of the basic emotions are barely discernible from the outside.
With machine-based emotion recognition, it is possible to determine, for individuals as well as in aggregate, in which application scenarios anger or joy arises or where users feel lost. Combined with a screencast (showing at which click or step a certain emotional or even physical reaction occurred), it becomes straightforward to improve the user interface and menu navigation and make it more human, as sketched below.
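The sketch illustrates this aggregation step: recognised stress values are grouped per UI step so that problematic screens stand out. The data layout and step names are hypothetical examples; in practice the timestamps would come from the screencast and click tracking.

```python
# Sketch: aggregating recognised stress values per UI step so that problematic
# screens stand out. The sample tuples are hypothetical illustrations.
from collections import defaultdict
from statistics import mean

# (timestamp_s, ui_step, stress_percent) tuples from one or many test sessions
samples = [
    (12.0, "login", 18.0),
    (47.5, "search_mask", 62.0),
    (48.2, "search_mask", 71.0),
    (93.0, "checkout", 25.0),
]

per_step = defaultdict(list)
for _, step, stress in samples:
    per_step[step].append(stress)

# Rank steps by average stress to see where users struggle most.
for step, values in sorted(per_step.items(), key=lambda kv: -mean(kv[1])):
    print(f"{step:12s}  avg stress {mean(values):5.1f} %  ({len(values)} samples)")
```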
With the general service product that we are developing, it is also possible, using an appropriate measuring station away from the typical PC workplace, to capture emotions in defined situations and learn from them. Ideally, training and simulation situations are used to conduct the baseline survey. For example, it is possible to find out how people behave in typical operating situations and environments and in this way gain a holistic insight into how work processes are actually experienced. This step is still risk-free, without the threat of impairment and error in real production operations, and it allows the recorded emotions to be linked with tracking of the environment and the field of view – regardless of whether this concerns (predetermined) exercise scenarios or the test subject's line of sight. In principle it is, first and foremost, a matter of understanding how people feel about certain tasks and what influences or even impairs them.
In the medium term, the general modelling of human behaviour in certain processes takes place – for example, how long a state of shock really (or subconsciously) influences a driver, what effect bad sleep has on an operator, or when monotonous or cyclical processes begin to affect an employee's attention. In repeatable, measured scenarios, the relevant influencing factors can be checked for their effect: is it the same for all employees, or does training, length of service, time of day, shift design, age, gender or something else entirely play a role? Experience has shown that this area is still scarcely researched scientifically, and there are hardly any models or explanatory representations available showing how emotions arise in processes and how they influence people. With a sufficient statistical basis, models and forecasts can be derived in the longer term. These predictions, and ideas of possible alternative courses of action, can also be validated in a comparable setting and thus improve safety, for example by mitigating or avoiding dangerous situations from the outset.
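A minimal sketch of such a factor check might look as follows, assuming the measurements have been exported to a table; the file name, column names and the use of pandas are illustrative assumptions, not a prescribed toolchain.

```python
# Sketch: checking candidate influencing factors (shift design, age group, ...)
# against measured stress in repeatable scenarios. Column and file names are
# hypothetical examples only.
import pandas as pd

df = pd.read_csv("scenario_measurements.csv")   # assumed export of the measurements

# Average stress per factor level; large gaps between levels hint at relevant factors.
for factor in ["shift_design", "time_of_day", "length_of_service", "age_group"]:
    print(df.groupby(factor)["stress_percent"].agg(["mean", "std", "count"]), "\n")
```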
The use of such technologies in real-time environments (e.g. the monitoring of drivers) is currently challenging not so much because of the measurement technology itself, but rather because of the legal framework (certifications for installation, safety regulations, etc.). It is also not trivial in terms of labour law, since the boundary between control to increase safety and monitoring by an employer is very thin.
The ultimate application of emotion recognition is the productive integration of such technologies into user interfaces. In this way it is possible not only to incorporate the user's direct interactions (clicks, inputs) into the interaction design, but also, for example, to adapt machine interfaces to the reactions and recognised states of a user and thus to react more dynamically and responsively. Identifying how a user feels right now is the first step towards cognitive modes of response. The practical applicability depends strongly on the physical setting (possibility of camera positioning, visibility of the face, brightness), but is driven forward by general technological development (e.g. by the iPhone X). Such interactive interfaces are associated with challenges at the level of data protection, since the underlying technologies are also suitable for identifying or recognising individuals.
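The following sketch indicates how an interface could react to a recognised user state; the states, thresholds and the simplify_menu / offer_help hooks are hypothetical examples rather than an existing API.

```python
# Sketch of a responsive interface adaptation. The thresholds and the methods
# on `ui` (simplify_menu, offer_help, highlight_last_change) are hypothetical
# hooks used for illustration only.
def adapt_interface(emotions: dict, ui) -> None:
    """Adjust the interface based on the currently recognised emotion mix."""
    if emotions.get("angry", 0) + emotions.get("afraid", 0) > 0.5:
        ui.simplify_menu()          # reduce options when frustration or insecurity dominates
        ui.offer_help()
    elif emotions.get("surprised", 0) > 0.6:
        ui.highlight_last_change()  # user may be disoriented by an unexpected change
```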
This brief outline is intended to provide a small, exemplary insight into how seemingly insignificant approaches can be a major lever in digitization and can sustainably change smaller work steps as well as entire business processes.

Author: ÖBB