Expressing the Effectiveness of Planning Horizons

John

April 8, 2011

P. TIMMER and J. LONG

Ergonomics & HCI Unit, University College London, 26 Bedford Way, London, WC1H 0AP

Expression de l’efficacité des horizons de planification

ABSTRACT

Technological evolution is often motivated by ‘problems’. Yet the expression of these problems, in terms of worksystem performance, is often only anecdotal or implicit. This research proposes an explicit method for expressing the effectiveness of a worksystem. The method is illustrated with a transport management worksystem. Of particular interest in this domain is how operator-technology interaction effectively supports planning ahead (in the form of a planning horizon). The method comprises four stages. First, planning-ahead behaviour is conceptualised. A critical aspect of the method is the ‘Theory of the Operator Planning Horizon’, together with the ‘extension’ and ‘adequacy’ of particular planning horizons. Second, the work domain is modelled, to establish the quality of the work carried out by the worksystem. Third, the behaviours that support effective planning are modelled. Finally, a comparison is made between the actual quality of the work carried out and the desired quality. If performance falls below a desired level, the worksystem behaviours contributing to the ineffectiveness are analysed. If ineffective planning is identified (that is, a problem), the method supports the search for the origins of the problem, and the construction of a causal theory. Although the illustration concerns only the planning of a transport management worksystem, the stages of the method are proposed to support, more generally, the expression of the effectiveness (or otherwise) of systems.

Keywords: Planning Horizon; Worksystem Effectiveness; Air Traffic Control.

I. INTRODUCTION

I.1 MOTIVATION AND OVERVIEW

This paper describes a method that enables the expression of: a) the plans of a process operator, and how far into the future those plans extend; and b) an assessment of how adequate those plans are, for ensuring that work goals are attained. The method is illustrated using an Air Traffic Management microworld. The need to express operator plans, and their extension, arises when technologies are being developed to support the planning of interventions with a dynamic evolving process; domains where process evolution needs to be anticipated, and process interventions need to be planned to address anticipated process states. By associating each individual plan with an assessment of adequacy, design problems may be characterised, problems that may be alleviated by technological support. Where the expression of plans shows those plans to be inadequate for attaining management goals, then new technologies can be proposed that may result in more effective operator planning behaviour.

In general, the evolution of a human-technology worksystem may either be problem-driven or technology-driven (Woods & Roth, 1988). Problem-driven evolution arises when a specific problem is attributed to the design of a technology, and the technology is then redesigned (or replaced) to remedy that specific problem. Such problems are frequently expressed by operators in the form of a subjective, experience-based report or anecdote, which may or may not lead to some in-depth analysis of the problem. Alternatively, technology-driven evolution may arise when a technology is redeveloped (or replaced) simply because redevelopment (or replacement) is possible, that is to say, not in the light of a specific ‘problem’ (such as inadequate planning). As the aim of the method proposed here is to relate operator planning (and plans) to effective or ineffective performance (intervention outcomes), and given that ineffective performance is considered a problem in need of a (technological) solution, the method may be understood as a system development tool, to support problem-driven evolution at the early stage of ‘problem formulation’ (Rasmussen, 1992; Woods & Roth, 1988). The method therefore assists in the process of progressing from operator anecdotes about problems, to a structured and more formal analysis and expression of those problems. The need to express planning problems, and thereby evolve technological solutions, arises in traditional process control domains such as Air Traffic Management (ATM), Railway Signal Management (RSM), nuclear power generation, and so forth. Throughout the paper, method application will be illustrated with reference to an ATM-like microworld of sufficient complexity to demonstrate the phenomena of concern to the method.

Before continuing, a definition of the term ‘design problem’ is offered. A design problem is considered to exist, and therefore acts as a motivator for problem-driven evolution, when a desired level of performance is not being achieved by a human-technology worksystem. That worksystem may then be termed ‘ineffective’, as desired performance is somehow compromised. Where a design problem is believed to exist, a method is needed for expressing that problem (its severity, frequency, behavioural causes, etc.) in a manner that will contribute towards its solution. Outlining such a method is the aim of this paper. Being able to express worksystem ineffectiveness is particularly valuable when problem-driven evolution occurs within safety critical domains. For domains where the consequences of ineffective human-technology interaction have serious potential outcomes for human life, being able to express whether or not particular interactions are effective, supports reasoning about the adequacy of the technology in question. Within the context of Cognitive Engineering, such expression and reasoning is termed ‘diagnosis’ (Dowell & Long, 1998; Rasmussen, 1986) and, as shall be illustrated in the paper, can support formulation of the design problem that a re-designed technology should solve (Dowell, 1998).

The emphasis on ‘design’ and ‘design problems (and solutions)’ characterises the present approach as one of engineering, that is, contributing to the design of effective worksystems (Amalberti & Deblon, 1992; Dowell & Long, 1998; Flach, 1998; Hollnagel, 1998; Rasmussen, 1986; Reason, 1998; Vincente, 1998; Woods, 1998), rather than as one of science, that is, understanding the phenomena associated with worksystems and their behaviours (Barnard, 1991; Meyer & Kieras, 1999). Within the design approach, the present work can be more precisely characterised as ‘design for effectiveness’, seeking to use the design primitives of ‘work’, ‘worksystem’, and ‘performance’ to motivate the acquisition and validation of design knowledge, to diagnose and solve design problems (in contrast, for example, to ‘human performance’, expressed as some form of speed and errors (Reason, 1998)).

Problems of interaction arise with many different types of technology. Here, particular consideration is given to problems arising with technologies that support planning ahead. In presenting the method, at the first stage, a theory is presented, for use when modelling how far ahead an operator plans. In contrast to other work on planning (Amalberti & Deblon, 1992; Boudes & Cellier, 1997; O’Hara & Payne, 1999), this theory makes reference to a plan’s ‘extension’, and its ‘adequacy’ (how well the plan ensures work goals are achieved). A method for expressing the effectiveness (or otherwise) of a plan necessarily requires consideration of the plan’s extension (how far into the future the plan accounts for an intervention), and whether or not the plan is adequate to ensure goals are attained.

The aim of this paper is therefore to propose a method for expressing the adequacy of operator planning, with special interest in capturing instances of ineffective planning (diagnosing design problems). The presented method comprises four stages. In the first stage, planning behaviours of interest are conceptualised by a domain-independent Theory of the Operator Planning Horizon, and requirements for modelling an operator’s planning horizon are generated from that theory. During the second stage, the work carried out by the human-technology (planning) worksystem is considered, and captured by a model of the domain. Here, the ATM-like microworld is described. In the third stage, models of operator planning horizons are constructed. Finally, these two types of model, of the domain (work) and of planning horizons (plans and planning behaviour), are considered alongside each other, and diagnostic assessments are made as to whether or not the plans formed were effective (i.e., whether or not there is a design problem concerning planning). When a problem is identified, a causal theory is constructed to relate how operator-technology interactions contributed to the occurrence of that problem. The four stages together constitute a method that helps to establish an explicit link between planning behaviours and the quality (effectiveness) of work executed by the worksystem, a relation frequently addressed in only an implicit fashion (Boudes & Cellier, 1997). The method therefore addresses the construction and use of models during the design process, a tradition well established within HCI research (Blandford & Young, 1993; Card, Moran & Newell, 1983). In the analysis of operator-technology interactions, ‘effectiveness’ is considered a primitive (ontological) entity, alongside: human (planning) behaviour; technological behaviour (and how it supports planning); and details of the work performed (Dowell & Long, 1998).

The paper is structured to reflect the stages of the method outlined above, and concludes with a discussion of the method.

I.2 ATM-LIKE MICROWORLD

For the purposes of method illustration, a laboratory-based ATM-like microworld was used. The microworld was constructed on the basis of an observational field study at Ringway Control Centre in Manchester, and possesses selected characteristics of the operational system that make it suitable for illustrating the method (Dowell, 1998; Long & Timmer, 2001). The domain is dynamic and imposes a significant planning burden upon the operator, who must anticipate the future state of air traffic, establish goals, collect and integrate data from different sources, plan tactical interventions to two aircraft variables (altitude and speed), and finally intervene with aircraft. Aircraft response to worksystem intervention is time-lagged, and the quality of aircraft management, in terms of aircraft safety and expedition (fuel use, progress to plan, minimum interventions), is calculated by the simulation software, following an interactive scenario.

The managed domain, containing aircraft, beacons, airways and so forth (Dowell, 1998), is generated by the simulation software and displayed on a computer-based radar. Paper-based flight progress strips are used, as observed in the operational worksystem, and document aircraft entry and exit states, aircraft identity and route. The strips are annotated after each intervention with details of aircraft state changes. Interventions to aircraft altitude and speed can be made via menu selection on the radar, rather than by ground-to-air radio. Through interaction with these technologies, the operator’s task is to ensure the safe and expeditious management of aircraft across the sector. Aircraft traverse a sector along airways, moving from an entry beacon to an exit beacon, via an intermediate beacon (see Figure 1). Between eight and ten aircraft are managed during a scenario. Operators are naïve subjects, trained in the management procedures necessary to ensure aircraft safety and expedition, and with some practice in the management task.

Figure 1. Topographic representation of the simulated sector

Figure 1 Représentation topographique du secteur simulé

While the simulation constitutes a considerable simplification of operational ATM, it is sufficient for method illustration. The planning of interventions to (process) object variables can be observed, and is available for analysis. Likewise, desired levels of worksystem performance can be specified, actual levels of performance measured, and effectiveness or ineffectiveness thereby expressed. Provided these criteria are met, it is proposed that the method may be applied to other domains (operational or microworld). Having considered the microworld of interest, the method is presented in four stages. First, consideration is given to conceptualising the operator planning horizon (Stage 1). Second, the work carried out by the microworld operator is considered (Stage 2), after which models of planning horizons are constructed (Stage 3). Finally, ineffectiveness is diagnosed (Stage 4).

II. THEORY OF THE PLANNING HORIZON

II.1 PLANNING SCOPE

The term ‘planning’, within the Cognitive Ergonomics / Science / Psychology literature is used to refer to a wide range of human operator behaviours: problem-solving (O’Hara & Payne, 1999); task scheduling (Hayes-Roth & Hayes-Roth, 1979); decision-making (Pietras & Coury, 1994); or anticipation (Boudes & Cellier, 1997, 1998). Additionally, planning can refer to an immediate action (Shallice, 1982), or an action after some temporal delay (Amalberti & Deblon, 1992). In consequence, some clarification of the scope of the term, as used here, is required. Planning refers to the formation of plans for actions/interventions with aircraft at some point in the future. An important distinction here, therefore, is between pre-planned interventions and unplanned interventions (referred to in this paper as management by ‘instant execution’). Instant execution occurs where an intervention is specified, and then immediately executed by the operator. As such, the term refers to the formation of real-time control decisions, with no forward-looking time delay between the specification of an intervention and its execution. In contrast, planning is always for some future intervention (ahead), that is, not the next action undertaken by the operator, after specifying the intervention. The planning ahead of interventions is commonly: (i) viewed as more efficient (i.e., involving fewer interventions) than instant execution, when implementation costs are high (O’Hara & Payne, 1999); (ii) associated with expertise and strategic thinking (de Groot, 1978); and therefore (iii) considered a form of best practice. For example, de Groot showed that grandmasters plan ahead to a depth of six or seven chess interventions, in contrast to novices, who may only consider one or two interventions ahead. The benefits of such ability are clearly demonstrated by game outcome; the grandmaster is a ‘master’ for good reason.

The development of technology, for the management of dynamic processes, reflects the general desirability of planning interventions further into the future. The ATM-like microworld, used in this paper, serves only to illustrate the method. However, within operational Air Traffic Management in recent years, the concept of ‘gate-to-gate’ aircraft route planning has emerged. This concept requires that the planning of aircraft interventions no longer be devolved to the level of sector management, but rather be considered for all sectors traversed by aircraft during flight, by a separate team of specialist ‘multi-sector planners’. Again, this strategy suggests that planning further ahead is considered generally to be a desirable feature of dynamic process management, a feature for which technological support is sought (David, 1997; Miaillier, 1998).

II.2 PLANNING THEORIES

Models of human and machine planning have a long history (Hoc, 1988; Miller, Galanter, & Pribram, 1960; Sacerdoti, 1977), such models drawing, to a greater or lesser extent, on planning theory (Suchman, 1987, 1993; Vera & Simon, 1993). For the purpose of expressing planning effectiveness, at Stage 3 of the method, a model is constructed that represents an operator’s plans for future interventions with aircraft. Model construction is informed by a Theory of the Operator Planning Horizon. The theory is largely a synthesis of parts of other planning theories, with some additions to make it suitable for expressing planning effectiveness. Such theory-driven modelling is possible because the simulation presents the operator with a well-defined problem space, and the task is completed by operator-technology interaction. There are no human-to-human collaborative processes influencing the planning task (Hughes, Somerville, Bentley, & Randall, 1993). As such, the method addresses planning as work carried out by micro-level mental processes, rather than as a macro-level community activity (Engeström, 2000). There is only one agent of management, who plans, anticipates, intervenes, and so forth.

From planning theory, a number of stable phenomena have been documented, of both planning behaviour, and of plans themselves, that may be assumed when modelling operator planning for dynamic process management. For example, planning is most frequently characterised as being a form of goal-oriented behaviour (Newell & Simon, 1972). Having expressed the goals of management within the microworld as the maintenance of aircraft safety and expedition, it is anticipated that individual plans for interventions (to speed and altitude of individual aircraft) can be associated with these management goals.

During the planning process, a number of mental representations (knowledge sources) are known to be utilised, mental representations of (1) the domain state (Moray, 1992; Rasmussen, 1986), and of (2) how to interact with technologies/devices (Young, 1983) to bring about transformation of that domain (Payne, Squibb, & Howes, 1990). A mental representation of the state of the domain corresponds to what the operator knows, from moment to moment, about the state of the managed traffic – the work being undertaken (Dowell & Long, 1998). This representation is frequently referred to as the operator’s ‘picture’ (Cox, 1992; Whitfield & Jackson, 1982), and it is used to predict the future states of aircraft on the sector, and thereby plan.

Different types of plan are known to exist (Hoc, 1993), two possible types being plans for: (1) high-level process management; and (2) particular interventions to particular process objects. Here, the plans of concern are scoped to the latter class, plans for interventions with simulated aircraft (worksystem actions that transform the domain). In addition to mental representations of the domain (used in planning), the operator requires representations concerning how to interact with worksystem devices (Young, 1983), so that the detailed specification of actions, which bring about interventions, can be constructed.

Hayes-Roth and Hayes-Roth (1979) have shown that human planning is opportunistic, in that individuals plan when opportunity and necessity demand. Operators may not maintain a complete, coherent and well integrated set of plans for all aircraft managed within the sector. Rather, plans may exist for some aircraft, known to be particularly problematic (top-down planning), but as domain events arise (or are predicted to arise) that affect management goals, plans may be constructed in a reactive manner (bottom-up). It is therefore not the case that what is planned (at ‘planning time’) is always executed (at the specified ‘execution time’). Rather, a range of outcomes may be observed: plans may be discarded or repaired (Hoc, 1988; Woods, 1988), owing to failures in information acquisition and subsequent anticipation, or may decay (i.e., be forgotten (Timmer & Long, 2000)). Having considered some of the relevant planning literature, for the management of dynamic processes, the Theory of the Operator Planning Horizon can now be presented.

II.3 THEORY OF THE OPERATOR PLANNING HORIZON

The first stage of the method, for expressing the effectiveness of a planning horizon, involves conceptualising the planning phenomena of interest. For this microworld, conceptualisation involves scoping operator planning behaviour, observed in the simulation, with respect to the planning literature, and synthesising existing theory with concepts necessary to express the effectiveness of a planning horizon. The outcome here is the Theory of the Operator Planning Horizon (TOPH), which makes explicit the planning phenomena of concern to the research. The TOPH may be stated, in domain-independent terms, as follows:

• An interactive (planning) worksystem formulates plans for interventions to a domain that are intended to attain management goals.

• Plans are formed by planning behaviour that requires mental representations of: i) the domain; and ii) the devices (and how to interact with those devices, to bring about interventions).

• Plans specify: i) interventions to domain object attribute values; and ii) a ‘triggering condition’ for plan execution.

• One or more plans, that refer to the same domain object, constitute the planning horizon for that object.

• A planning horizon may be described in terms of its ‘extension’.

• A planning horizon’s extension is expressed in terms of the future state of the domain object in question, if all planned interventions are executed.

• The extension of a planning horizon may be described as being adequate or inadequate for ensuring that the goals of management are met. The adequacy of a planning horizon’s extension is determined by the individual plans it contains, that is to say, whether or not when implemented, those plans ensure management goals are attained. If an operator’s planning horizon extension is adequate, and assuming the planned interventions are executed, the horizon will support effective management.

The theory places emphasis upon the planning worksystem’s goals, mental representations, and plan details. The concept of an horizon’s ‘extension’ arises from the need to express how far into the future plans (within an horizon) account for the state of a managed object. For example, two planned interventions to an aircraft (in the microworld) may be sufficient to ensure that the aircraft leaves the sector in its planned ‘exit’ state. The horizon (comprising those two planned interventions) may then be said to ‘extend to’ that aircraft’s exit state (object goal state, at the time of planning). Alternatively, horizons for aircraft may extend to states, associated with particular beacons, or parts of airways. Expressing the planning horizon’s extension in terms of the state of an object may be contrasted with other theories, such as Boudes and Cellier’s (1997) Theory of Anticipation Range. In their theory, the extension of a set of plans is expressed in terms of time, for example, plans that extend over the next 5 minutes for a given aircraft (see also Anderson & Settle, 1996). Finally, given the present work’s focus upon expressing problems with plans, the notion of planning extension adequacy is critical and novel. The adequacy of an extension is a concept that is used to relate plans for particular interventions to management goals, and thereby enable the expression of whether or not the specified plans for an aircraft ensure safety and expedition for that aircraft, over the extension of the plans. In conclusion, while the theory embodies some familiar concepts from planning theories (e.g., use of mental representations in planning, existence of triggering conditions, and so forth), it also possesses some novel concepts that are important for the expression of planning effectiveness.
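The concepts of the TOPH can be made concrete with a small data-structure sketch. The following is only an illustration under stated assumptions, not part of the published method: the class and field names (`ObjectState`, `Plan`, `PlanningHorizon`), the example altitude and speed values, and the choice to express adequacy as a caller-supplied predicate over the extension state are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ObjectState:
    """A future state of a domain object (here, an aircraft)."""
    label: str      # e.g. "intermediate beacon" or "exit" (hypothetical labels)
    altitude: int   # hypothetical units
    speed: int      # Kmph, as in the microworld


@dataclass(frozen=True)
class Plan:
    """One planned intervention: what to change, and when to execute it."""
    attribute: str                 # "altitude" or "speed"
    new_value: int
    trigger: str                   # triggering condition for plan execution
    resulting_state: ObjectState   # state expected once the plan is executed


@dataclass
class PlanningHorizon:
    """All plans that refer to the same domain object."""
    object_id: str
    plans: List[Plan] = field(default_factory=list)

    def extension(self) -> Optional[ObjectState]:
        """Future object state reached if all planned interventions are executed."""
        return self.plans[-1].resulting_state if self.plans else None

    def is_adequate(self, meets_goals: Callable[[ObjectState], bool]) -> bool:
        """Assumption: adequacy is judged by a predicate on the extension state."""
        ext = self.extension()
        return ext is not None and meets_goals(ext)


# Hypothetical horizon for aircraft BAN: two plans that extend to the exit state.
exit_state = ObjectState(label="exit", altitude=330, speed=720)
horizon = PlanningHorizon(
    object_id="BAN",
    plans=[
        Plan("speed", 720, "on passing the entry beacon",
             ObjectState("intermediate beacon", 300, 720)),
        Plan("altitude", 330, "on passing the intermediate beacon", exit_state),
    ],
)
```

In this sketch the horizon ‘extends to’ the object state produced by the last plan, which mirrors the state-based (rather than time-based) notion of extension; a time-based Anticipation Range would instead store a duration.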

Use of the term ‘horizon’, to refer to the limits of a mental behaviour, is not new. Hutchins (1990) describes an ‘horizon of observation’ in ship navigation, and more recently Wong, Sallis and O’Hare (1997) have discussed the planning horizon with respect to ambulance dispatch, yet in a non-technical manner, without defining what is meant by the term. Hence the need to conceptualise the term, and thereby generate a theory. Perhaps the most complete and explicit exposition of an alternative theory to TOPH, that examines the horizon of some form of mental behaviour, is Boudes and Cellier’s (1997, 1998) Theory of Anticipation Range, which possesses two components: (1) a mechanism for anticipating future domain object states; and (2) a temporal horizon. In contrast, the TOPH possesses three components: (1) planning behaviour (including plans); (2) horizon extension (in terms of future object states, rather than time); and (3) the adequacy of horizon extension (explicitly addressed).

II.4 REQUIREMENTS FOR A MODEL OF THE PLANNING HORIZON

The TOPH conceptualises the planning phenomena of interest, for the modelling of an operator’s horizon. From the theory, requirements may be generated, concerning behaviours that a model of a planning horizon should capture (at Stage 3), if the model is to represent accurately an horizon. These requirements are:

• Models need to reference interventions with objects (aircraft) that can be associated with management goals (of safety and expedition).

• Models need to reference the operator’s mental representation of the state of the domain at the time of plan formation.

• Models need to specify the details of planned interventions, and the conditions for triggering such interventions.

• Planning horizon models need to be specified for each managed object, and moment-to-moment changes of horizon extension need to be expressed in terms of a change in state of the object to which reference is made – domain objects here may be abstractions, i.e. groups of functionally related objects (aircraft).

• Models need to specify all the interventions that actually take place with an object (planned and unplanned), and their impact upon management effectiveness, such that the adequacy of an horizon extension may be expressed retrospectively.

With respect to the microworld, it can be seen that following the TOPH, a model needs to represent: (1) operator plans for a particular aircraft (one aircraft per horizon); (2) the operator’s mental representation of the state of the domain at the time of planning; and (3) the interventions actually carried out. Such a model addresses the planning concern of this research. To examine the effectiveness of the plans formed, a model of the planning horizon needs to be considered alongside a second model, a model that captures the quality of work carried out (i.e. the extent to which the aircraft objects were actually managed in accordance with management goals). This second model of work quality is termed a domain model, and the focus for Stage 2 of the method. In the next section, a microworld-generated domain model is considered.
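The requirement to record all interventions, planned and unplanned, can be sketched as a simple matching step. This is a minimal illustration only: the record types, the timestamps, and the rule that an executed intervention counts as planned when an earlier plan names the same aircraft, attribute and value are assumptions made for the example, not the authors' coding scheme.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlannedIntervention:
    aircraft: str
    attribute: str      # "altitude" or "speed"
    value: int
    planned_at: float   # seconds into the scenario (hypothetical)


@dataclass(frozen=True)
class ExecutedIntervention:
    aircraft: str
    attribute: str
    value: int
    executed_at: float  # seconds into the scenario (hypothetical)


def was_planned(executed: ExecutedIntervention,
                plans: List[PlannedIntervention]) -> bool:
    """True if some plan formed earlier specifies exactly this intervention."""
    return any(
        p.aircraft == executed.aircraft
        and p.attribute == executed.attribute
        and p.value == executed.value
        and p.planned_at < executed.executed_at
        for p in plans
    )


def label_interventions(executed: List[ExecutedIntervention],
                        plans: List[PlannedIntervention]) -> List[str]:
    """Label each executed intervention as 'planned' or 'unplanned'."""
    return ["planned" if was_planned(e, plans) else "unplanned" for e in executed]


# Hypothetical data for aircraft BAN.
plans = [PlannedIntervention("BAN", "speed", 720, planned_at=100.0)]
executed = [
    ExecutedIntervention("BAN", "speed", 720, executed_at=400.0),
    ExecutedIntervention("BAN", "altitude", 300, executed_at=500.0),
]
labels = label_interventions(executed, plans)
```

Interventions labelled ‘unplanned’ here correspond to management by instant execution, where an intervention is specified and immediately executed.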

III. DOMAIN MODEL

In the microworld, a computer-based simulation generates a radar image of the managed sector, updated as aircraft traverse the sector, or change speed/altitude in response to operator interventions. In addition to generating this image, after each operator intervention, the simulation calculates new domain model values for a hierarchy of aircraft attributes that may have been altered as a consequence of the intervention. At the lowest level of the hierarchy are what Dowell (1998) calls PASHT attributes, standing for: Position; Altitude; Speed; Heading; and Time. From these low level attributes, intermediate attributes are calculated for aircraft: Progress (flight duration); Fuel Use; Separation and Number of Manœuvres. Finally, at the apex of the attribute hierarchy, values for aircraft safety and expedition are calculated from intermediate attribute values. Therefore, an intervention to change an aircraft altitude will be recorded in the domain model at the PASHT level, and consequences of that aircraft climbing/descending to the new altitude will be calculated in terms of progress and fuel use, and ultimately any consequences for aircraft safety and expedition at the new altitude (and during the ascent or descent) will be calculated. Following the last intervention with each aircraft, the set of attribute values calculated represent the final ‘actual’ values for: Progress, Fuel Use, and so forth; for that aircraft’s passage across the sector. In addition to calculating these values after each intervention, for each attribute the model possesses a goal value. These goal values are calculated, by the simulation software, based upon an optimal (goal) path across the sector. Therefore, given ‘actual’ attribute values (from management scenarios) and ‘goal’ values (from an optimal scenario), comparisons may be made, from intervention to intervention, between the actual and goal states of managed aircraft. Large discrepancies between actual and goal values will constitute the starting point for diagnosing worksystem design problems.

Figure 2. Part of a domain model for a single intervention with aircraft BAN

Figure 2. Partie du modèle du domaine pour une intervention unique sur l’avion BAN

Figure 2 shows part of a domain model for aircraft BAN, following an operator intervention to slow BAN from 900Kmph to 720Kmph (the change to BAN’s PASHT attribute – Speed – is shown at A). At B and C, only Intermediate attribute values are shown, from which values for safety and expedition can be calculated. Bold values in brackets, alongside each attribute name, represent goal values for that attribute. Unbracketed values show calculated predictions for each attribute, given the aircraft’s state, following the most recent intervention. A set of such values, for all aircraft after each intervention, constitutes the domain model of interest. Therefore, as a consequence of the intervention to reduce BAN’s speed, its predicted Progress across the sector is slowed, from 1170secs to 2220secs; and Fuel Use is greatly reduced, as 720Kmph is the cruising speed for all aircraft (the speed at which fuel use is optimised). Within the model, the Separation attribute shows the aircraft is safe: no separation conflicts with other aircraft have been predicted, given the new speed value. As a consequence of the intervention, the Number of Manœuvres attribute is increased by one.

When such data concerning the actual state of an aircraft are compared with goal values for that aircraft, an analysis of the quality of aircraft management is possible, both from moment to moment, and at the end of the management session. The data for BAN here show that prior to the speed intervention, BAN’s time to progress across the sector was 1170sec. When compared to BAN’s goal value for progress, this value indicates BAN to have been progressing too quickly across the sector, and due to exit the sector 1260secs earlier than planned (the difference between actual and goal values). The cost associated with such fast passage is clear from the fuel consumption figures, which show the aircraft, prior to intervention, consuming 328 more units of fuel than the goal value. Both before and after the intervention, BAN’s separation value is safe, and so BAN’s superordinate safety attribute can likewise be calculated as being safe – BAN is in a safe state, both before and after the intervention. Following the intervention, BAN’s progress and fuel consumption are more closely aligned with goal figures.
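The arithmetic behind this comparison can be laid out as a short worked sketch. The goal value for Progress is not stated directly in the text; it is inferred here from the two figures given (an actual value of 1170secs that is 1260secs earlier than planned). The variable and function names are hypothetical.

```python
# Figures quoted in the text for aircraft BAN.
progress_before = 1170   # secs, predicted Progress before the speed intervention
early_by = 1260          # secs, how much earlier than planned BAN would exit
progress_after = 2220    # secs, predicted Progress after slowing to 720Kmph

# Inferred (assumption): goal Progress = actual value + shortfall.
goal_progress = progress_before + early_by


def relative_deviation(actual: float, goal: float) -> float:
    """Absolute discrepancy between actual and goal, as a fraction of the goal."""
    return abs(actual - goal) / goal


deviation_before = relative_deviation(progress_before, goal_progress)
deviation_after = relative_deviation(progress_after, goal_progress)

print(f"goal Progress: {goal_progress} secs")
print(f"deviation before intervention: {deviation_before:.1%}")
print(f"deviation after intervention:  {deviation_after:.1%}")
```

On these figures the discrepancy in Progress falls from roughly half of the goal value to under a tenth of it, which is the sense in which the intervention brings BAN more closely into line with its goal.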

To establish instances of ineffective management, some criteria need to be applied to the size of any discrepancy between goal and actual attribute values. Here, it is assumed, for the purpose of expressing effectiveness, that if the value for an aircraft’s progress or fuel consumption deviates by more than 10% (either side) from the goal value, ineffective management has occurred. Likewise, if aircraft separation is violated, ineffectiveness has occurred. Finally, the Number of Manœuvres value should not exceed 3 (i.e. 50% in excess of the goal value shown). Consideration will now be given to the impact these criteria have (for intermediate attribute values) on the high level attributes of safety and expedition. If separation is violated, the attribute value for safety likewise reflects this state (Safety = Unsafe). If any one of the values for Progress, Fuel Use, or Number of Manœuvres exceeds the specified criteria, the aircraft may likewise be considered ‘unexpeditious’, i.e. the management goal of expedition is not being attained.
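These criteria can be written down as a small check. The sketch below is an assumption-laden illustration rather than the simulation's own code: the type and function names are hypothetical, the Progress goal of 2430secs is inferred from the BAN figures quoted earlier, and the fuel values (goal 1000, actual 1328 and 1010) and manœuvre counts are invented for the example, chosen only so that the ‘before’ case exceeds its goal by the 328 units mentioned in the text.

```python
from dataclasses import dataclass

TOLERANCE = 0.10      # 10% either side of the goal value (from the text)
MAX_MANOEUVRES = 3    # upper limit on Number of Manoeuvres (from the text)


@dataclass(frozen=True)
class IntermediateAttributes:
    progress_secs: float
    fuel_use: float
    separation_violated: bool
    manoeuvres: int


def within_tolerance(actual: float, goal: float,
                     tolerance: float = TOLERANCE) -> bool:
    """True if actual lies within +/- tolerance of the goal value."""
    return abs(actual - goal) <= tolerance * goal


def assess(actual: IntermediateAttributes,
           goal_progress: float, goal_fuel: float) -> dict:
    """Derive the high-level safety and expedition values from the criteria."""
    safe = not actual.separation_violated
    expeditious = (
        within_tolerance(actual.progress_secs, goal_progress)
        and within_tolerance(actual.fuel_use, goal_fuel)
        and actual.manoeuvres <= MAX_MANOEUVRES
    )
    return {"safe": safe, "expeditious": expeditious,
            "effective": safe and expeditious}


# Progress values follow the BAN example; fuel and manoeuvre values are hypothetical.
before = IntermediateAttributes(1170, 1328, separation_violated=False, manoeuvres=0)
after = IntermediateAttributes(2220, 1010, separation_violated=False, manoeuvres=1)

result_before = assess(before, goal_progress=2430, goal_fuel=1000)
result_after = assess(after, goal_progress=2430, goal_fuel=1000)
```

Under these assumptions BAN is safe in both cases, but is ‘unexpeditious’ before the intervention and expeditious after it.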

In conclusion, data from the microworld-generated domain model concern the quality of aircraft management, and instances of attribute values violating criteria may be associated with instances of ineffective management – unsafe aircraft, aircraft progressing too fast or slow, consuming too much or too little fuel, or undergoing intervention too often. Expressing the effectiveness of an operator’s planning horizon, therefore, involves associating domain model data with particular interventions, and establishing whether or not those interventions were planned. In the next section, Stage 3 of the method for expressing the effectiveness of planning horizons is presented. A second model, a model of an operator’s planning horizon, is discussed. Stage 3 is followed in Stage 4, by an analysis of the effectiveness of that horizon, when the planning horizon model is considered in the light of data from the domain model (Stage 2), which contains data similar to that discussed above.

IV. MODEL OF A PLANNING HORIZON

IV.1 Data Requirements for Model Construction

From the Theory of the Operator Horizon, a number of requirements were identified for a model of the horizon. Each requirement is now considered in turn, and the means for obtaining data to address the requirement discussed.

Requirements

• Models need to reference interventions with objects (aircraft) that can be associated with management goals (of safety and expedition).

In the microworld, all interventions are made with the radar device, and are therefore observable. Continuous observation of operator interaction with worksystem technologies yields such data.

• Models need to reference the operator’s mental representation of the state of the domain at the time of plan formation.

Operators establish the state of the domain by referencing worksystem devices. By observing head movements and pointing actions to fields on a Flight Progress Strip, it is possible to establish what data concerning the domain are being acquired by the operator. Traditionally, concurrent verbal protocols provide a rich source of data concerning what an operator knows about a problem, and how that knowledge is used in problem solving. Verbal protocol data, together with observation of operator head and hand movements, yield data that assist in inferring the operator’s changing mental representation of the domain. In addition, the mental representation can be inferred from interventions. For example, if an operator intervenes with an aircraft to give it a cruising speed, it is possible to infer that the operator knew, prior to intervention, that the aircraft’s speed was not the desired cruising speed.

• Models need to specify the details of planned interventions, and the conditions for triggering such interventions.

In the absence of planning tools that support the explicit documentation of a set of plans, operator plans for interventions remain in the head of the operator. In consequence, only verbal protocol data can reveal the details of such plans, and associated triggering conditions.

Each model should reference an individual aircraft. As the horizon changes, the state of the aircraft in question can be established with reference to the computer-generated domain model for that scenario (see Section III).

• Models need to specify all interventions that actually take place with an object (planned and unplanned), and their impact upon management effectiveness, such that the adequacy of an horizon extension may be expressed retrospectively.

To satisfy the first requirement, all interventions are documented. The impact of each intervention, upon management effectiveness (how well the actual state of the aircraft matches its goal state), can be established with reference to the domain model generated for the management scenario.

Therefore, to model operator planning horizons, two kinds of data are required: observational data on operator hand and head movements (for information search and interventions), and concurrent verbal protocol data (for plans and evidence of the content of the associated mental representation). With such data, a model of the operator’s planning horizon can be constructed, as in Figure 3, which separates each of these classes of data.

IV.2 Model of an Operator Planning Horizon

Figure 3 presents a model of an operator’s planning horizon for an aircraft LOG. The ‘Plan/Execution’ column of the model records all human-technology worksystem goals that concern interventions. Data in this column distinguish goals for immediate execution (in bold), from plans for future interventions (in italics). The ‘Intervention’ column documents those goals for interventions that were actually implemented.

Figure 3. Model of an operator’s planning horizon for aircraft LOG

Goals that concern interventions are formed, given a particular mental representation of the state of the aircraft in question, at the time of planning or instant execution. The model records such mental representations in the ‘Category’ column. The set of possible categories reflects the range of possible states of the objects in the domain (Timmer & Long, 1997). Aircraft may therefore be ‘Active’, when they arrive on the sector, or ‘Incoming’, prior to arrival. Once active, they may be ‘Safe’ or ‘Unsafe’, ‘Expeditious’ or ‘Unexpeditious’ with respect to their speed (‘Unexpeditious (Speed)’) or altitude (‘Unexpeditious (Altitude)’). Once an aircraft is at its exit altitude and cruising speed, no further interventions are required as it is in its exit/goal state (‘Active Aircraft (Exit)’). The final ‘Encode’ column records device information fields that were searched by the operator as a means to forming a mental representation (category) of the state of an aircraft, for the purpose of goal/plan specification or monitoring.

The model for aircraft LOG is now considered in detail. The model commences (Line 1) by showing that at the beginning of the management session, having encoded LOG’s Flight Progress Strip (FPS), the operator knows that LOG is an incoming aircraft. No plans are formed at this stage. LOG then enters the sector (Line 2), and by encoding LOG’s radar trace, the operator’s mental representation of the state of LOG is updated, from ‘Incoming Aircraft’ to ‘Active Aircraft’. Referencing LOG’s FPS again (Line 3), the operator establishes LOG as travelling at 900Kmph, and at an altitude of 13,000ft. Given LOG’s excessive speed, the aircraft is mentally represented as being unexpeditious with respect to its speed, and an intervention is immediately formed (Line 3) and executed (Line 4) to slow the aircraft down, and transform its state to ‘expeditious’ with respect to its speed. This intervention is an example of instant execution, and is represented within the model as such. One consequence of the intervention is that the operator’s mental representation of LOG is now updated, to reflect its new expeditious state. LOG is safe and expeditious, resulting in the operator forming a default plan, to leave LOG in its current state for the foreseeable future. The model then shows that LOG is left to progress from its entry beacon to the intermediate beacon ‘Delta’, and then on to its exit beacon ‘Epsilon’, at altitude 13,000ft and speed 720Kmph. Once LOG is at its exit beacon (Line 5), the operator encodes FPSs for both LOG and a second aircraft, SAM, and realises that both aircraft occupy the same altitude, and that a safety conflict is possible. LOG is therefore mentally re-categorised as ‘Unsafe’, and a plan is formed to give LOG its exit altitude of 4,000ft later (in the near future).

The model then shows (Line 6) that the operator’s mental representation of (1) the state of LOG and (2) the plan for future intervention (formed at Line 5) are both forgotten (decay), and the plan to ‘Give LOG (its exit) altitude later’ is re-formed (at Line 7). On the occasion of this re-planning, the model shows that no safety conflict with SAM was identified, and LOG was represented once more as a ‘Safe Expeditious Aircraft’. The next line of the model (Line 8) shows that the operator appears to have forgotten (again) the state of LOG, and by referencing data from FPSs (Line 9), re-forms the mental representation that LOG is a safe expeditious aircraft. The plan formed at Line 7 is not assumed to have decayed on this occasion, as the model shows no further re-planning of the intervention, until it is executed at Line 10. The final line of the model shows that LOG is given its exit altitude as planned, and the default plan to leave LOG until it exits the sector is formed, as the aircraft is in its exit state, and no further interventions are necessary.
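The walkthrough above can be sketched as a small data structure. This is a hedged illustration rather than the authors' notation: the class, its field names, and the rule that an intervention counts as ‘planned’ only if an identical plan was still held when it was made are assumptions introduced here. The rows transcribe only what the prose states about LOG, and the sketch simplifies by tracking a single live plan at a time.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HorizonLine:
    """One row of a planning horizon model (field names are illustrative)."""
    line: int
    category: Optional[str] = None       # inferred mental representation
    plan: Optional[str] = None           # plan for a future intervention
    intervention: Optional[str] = None   # intervention actually implemented
    plan_lapsed: bool = False            # the plan held so far has decayed


def classify_interventions(model: List[HorizonLine]) -> List[Tuple[int, str, str]]:
    """Label each intervention 'planned' if a matching plan was live, else 'instant'."""
    live_plan: Optional[str] = None
    result = []
    for row in model:
        if row.plan_lapsed:
            live_plan = None
        if row.intervention is not None:
            planned = row.intervention == live_plan
            result.append((row.line, row.intervention, "planned" if planned else "instant"))
            if planned:
                live_plan = None  # the plan has now been carried out
        if row.plan is not None:
            live_plan = row.plan  # a newly formed plan replaces the previous one
    return result


# Rows transcribed from the prose description of aircraft LOG.
LOG_MODEL = [
    HorizonLine(1, category="Incoming Aircraft"),
    HorizonLine(2, category="Active Aircraft"),
    HorizonLine(3, category="Unexpeditious (Speed)"),
    HorizonLine(4, category="Active Safe Expeditious Aircraft",
                intervention="LOG to 720", plan="Leave LOG"),
    HorizonLine(5, category="Active Unsafe Expeditious Aircraft",
                plan="Give Altitude 40"),
    HorizonLine(6, plan_lapsed=True),
    HorizonLine(7, plan="Give Altitude 40"),
    HorizonLine(8),
    HorizonLine(9, category="Safe Expeditious Aircraft"),
    HorizonLine(10, intervention="Give Altitude 40", plan="Leave LOG"),
]

if __name__ == "__main__":
    for entry in classify_interventions(LOG_MODEL):
        print(entry)
```

Run over these rows, the sketch labels the speed change at Line 4 as instant execution and the altitude change at Line 10 as planned, which agrees with the description above.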

If the model is considered with respect to how it documents the extension of the operator’s planning horizon for LOG, the following observations can be made. At Line 4, a plan is formed to ‘Leave LOG’. Given that LOG is flying at a high altitude, and at cruising speed, and is safe (i.e. is an ‘Active Safe Expeditious Aircraft’), LOG may be left in such a state, until near its exit beacon, and then it should be given its exit altitude. The plan to ‘leave’ LOG may therefore be said to extend to ‘near LOG’s exit beacon’. At Line 5, when LOG is near its exit beacon, a further plan is formed to give LOG its exit altitude. At Line 5, with such a plan, the operator’s planning horizon may be described as extending to ‘LOG’s exit state’ (i.e. to the end of its management). At Line 7, the new duplicate plan (of the plan formed at Line 5) has an identical extension to that of the plan at Line 5. Therefore, with a model of an operator’s planning horizon, from moment to moment, it is possible to express the ‘extension’ of the plans formed, in terms of either: a) the state of the aircraft in question, for example, ‘the plans extend to LOG’s exit state’; or b) some position on the sector at which the aircraft’s state must change, for example, ‘plans for aircraft LOG extend to near LOG’s exit beacon’. A description of the extension of a planning horizon, therefore, arises from the details of the operator’s plans for a given domain object (aircraft) at a given moment. As plan details change, so too does the horizon’s extension.

V. EXPRESSING HORIZON EFFECTIVENESS

Having conceptualised the planning horizon (Stage 1), described how a domain model captures work quality (Stage 2), and illustrated how a planning horizon is modelled (Stage 3), in Stage 4 of the method, expression of the effectiveness of that horizon is undertaken. In the first instance, the model of the planning horizon is considered, line-by-line, alongside objective data from the domain model, concerning aircraft states after each intervention. The actual state of aircraft (from the domain model) can then be compared with the operator’s inferred mental representation of the state of that aircraft (‘Category’ column of a planning horizon model), and plans and interventions considered, to establish whether or not the plans/interventions were effective. Once completed, and in the event of an instance of ineffectiveness occurring, some expression of its cause can be undertaken.
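The line-by-line comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `State` type, the function name, and the wording of the findings are hypothetical, and the example values are loosely based on the LOG case rather than copied from the domain model output.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class State:
    """Simplified aircraft state: the two high-level management attributes."""
    safe: bool
    expeditious: bool


def stage4_findings(believed: Dict[int, State], actual: Dict[int, State]) -> List[str]:
    """Compare the operator's inferred representation with the domain model, per line."""
    findings = []
    for line in sorted(believed.keys() & actual.keys()):
        b, a = believed[line], actual[line]
        if not a.safe:
            findings.append(f"Line {line}: aircraft unsafe")
        if not a.expeditious:
            findings.append(f"Line {line}: aircraft unexpeditious")
        if b != a:
            findings.append(f"Line {line}: operator's representation differs from domain model")
    return findings


# Illustrative values only: the operator represents the aircraft as safe and
# expeditious at both lines, while the domain model reports excess fuel use
# after the last intervention.
believed = {4: State(True, True), 10: State(True, True)}
actual = {4: State(True, True), 10: State(True, False)}

if __name__ == "__main__":
    for finding in stage4_findings(believed, actual):
        print(finding)
```

Each finding marks a line of the horizon model at which an expression of the cause of ineffectiveness could then be attempted.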

The Theory of the Operator Planning Horizon expresses the concept of effectiveness (attaining management goals), in terms of the adequacy of an horizon’s extension. Adequacy of an horizon’s extension relates to the quality of work that will be brought about by the worksystem, if the plans that make up the planning horizon are executed. An horizon’s extension may be considered adequate, if the plans (that make up the horizon), when executed, lead to the attainment of worksystem goals of safety and expedition. Adequacy is, therefore, a difficult attribute of an horizon to assess, because it can change from moment to moment. At one moment the horizon may extend to the exit state of an aircraft (Aircraft X), and be adequate to ensure goals are met. At another moment, a second aircraft (Aircraft Y) may be given the same altitude as Aircraft X, thereby rendering the plans that make up Aircraft X’s planning horizon extension inadequate for ensuring the maintenance of safety. Re-planning is then necessary. Existing plans may need to be discarded, given the new unsafe state of the aircraft. For each line of the model in Figure 3 (starting at Line 4), an assessment of adequacy is made, following a short description of the plan being assessed.

• Line 4

Intervention: LOG to 720

Category: Active Safe Expeditious Aircraft

description

LOG is given speed 720Kmph (instant execution), and a plan is formed to ‘Leave LOG’ at speed 720Kmph, and altitude 13,000ft (i.e. as an Active Safe Expeditious Aircraft). A single plan is formed that extends to near LOG’s exit beacon.

adequacy

Following the intervention to LOG’s speed, at the time of plan formation, the domain model’s prediction of LOG’s state is presented in Figure 4.

Figure 4. Domain model performance data for LOG

The domain model shows that LOG is:

• safe in its new state

The planning horizon at Line 4 is made up of a single plan, to leave LOG, and given that LOG is safe in its current state, it may be said that the horizon extension, at this point, is adequate for ensuring that aircraft safety is maintained.

• expeditious in its new state

Projected progress is within 10% of criterion. Projected fuel use is 18.8% in excess of criterion. Given LOG’s speed and high altitude, it would seem that the projection of excessive fuel use refers to fuel already consumed before the intervention, i.e. when travelling at 900Kmph earlier in the management scenario (see Line 3, Figure 3). The plan to leave LOG until near its exit beacon is therefore adequate for ensuring aircraft expedition.

• Line 5

Encode: Position = ti1; Altitude = 130; Exit-at = Epsilon; Exit Altitude = 40; SAM, Altitude = 130

Category: Active Unsafe Expeditious Aircraft

Plan/Execution: Give Altitude 40, Later

description

LOG progresses across the sector to its exit beacon. A plan is formed to give LOG its exit altitude of 4,000ft, ‘later’. This single plan extends to LOG’s exit state. While ‘later’ is a triggering condition of minimal specification, the assumption that the horizon extends to LOG’s exit state is considered justified, because all necessary interventions have been specified to ensure LOG leaves the sector in its exit state.

adequacy

At planning time, and as a consequence of the adequacy of the planning horizon’s extension at Line 4, LOG is safe and expeditious. This plan is actually executed at Line 10. It is therefore possible to consult the domain model’s assessment of LOG’s safety and expeditiousness after this intervention (Figure 5), and thereby assess the adequacy of the plan extension.

Figure 5. Performance data for LOG after the last intervention

From Figure 5, it is therefore possible to say retrospectively:

• the planning horizon extension (to LOG’s exit state) will ensure its safety

• while progress and exit altitude are as planned, LOG’s fuel consumption increases with this intervention (40% in excess of the criterion value). It would therefore appear that this horizon extension is not adequate for ensuring that aircraft expeditiousness is maintained.

• Line 6

Encode Intervention Category Plan/Execution

LAPSE

The details of the plan to ‘Give LOG altitude 40, later’ appear to have decayed, and so the model shows the operator has no plans at this time.

• Line 7

Encode: Position = ti1; Altitude = 130; Exit Altitude = 40; Speed = 720; TAW, Position = ti2; TAW, Altitude = 130; TAW, Exit Altitude = 130; TAW, Speed = 720

Category: Active Unsafe Expeditious Aircraft

Plan/Execution: Give Altitude 40, Later

description

The intervention is re-planned, as at Line 5.

adequacy

As at Line 5.

No further plans are formed. At Line 10, the plan formed at Line 7 is executed. The domain model, as discussed at Line 5, shows LOG is unexpeditious with respect to its fuel use (40% excess).

From this line-by-line analysis, it is clear that the effective management of aircraft LOG’s fuel consumption did not take place as desired. In consequence, the domain model shows LOG to have been unexpeditious as it exited the sector, having consumed 40% more fuel than desired (the bracketed goal value for fuel use in the domain model). Given this ineffectiveness, the following expression of the problem, with this ATM planning task, is possible:

The problem of managing aircraft expedition with respect to fuel use may originate, as in the case of LOG, with aircraft entering sectors at high speed (greater than cruising speed), and thus already consuming large quantities of fuel. The timely reduction of aircraft speed can alleviate this problem. However, it would seem that a greater part of the problem arises from judging the appropriate moment to intervene with aircraft, to allocate low exit altitudes. Managing aircraft at cruising speed, as high as possible for as long as possible, is the best strategy for minimising fuel consumption. With a reduction of altitude comes a commensurate increase in fuel consumption. If an aircraft is to leave a sector at a low altitude (e.g., for airport approach), careful judgement is required as to when to execute such an intervention, so that the aircraft exits the sector at the exit altitude. If such an intervention is executed too early, the aircraft will fly for an extended duration at a low altitude, consuming higher quantities of fuel. In the case of LOG, this problem would appear to have been the most important. The operator’s plan to ‘Give LOG Altitude 4,000ft, later’ lacked accurate reference to a position within LOG’s final airway at which the intervention would be made, ‘later’ merely meaning ‘near LOG’s exit beacon’. It is assumed that, in consequence, LOG was given its exit altitude too early, and flew too low for too long to maintain an expeditious level of fuel consumption. While worksystem technologies supported the operator in forming a plan for the correct intervention, they offered no support for judging the most appropriate moment for plan execution.
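The timing argument above can be illustrated with a toy calculation. This is a sketch under loudly stated assumptions, not the microworld's fuel model: the function, the two constant burn rates, the crossing time, and the treatment of descent as instantaneous are all invented here purely to show why an early descent costs more fuel.

```python
def fuel_used(total_time: float, descent_time: float,
              high_rate: float, low_rate: float) -> float:
    """Fuel consumed over a sector crossing when descent happens at descent_time.

    Assumes a constant burn rate at high altitude before the descent and a
    higher constant burn rate at low altitude after it (hypothetical model).
    """
    if not 0 <= descent_time <= total_time:
        raise ValueError("descent_time must lie within the crossing")
    return high_rate * descent_time + low_rate * (total_time - descent_time)


# Hypothetical figures: a 1000 sec crossing, 1.0 fuel unit per second at
# cruising altitude and 2.0 units per second at the low exit altitude.
early = fuel_used(1000, 600, high_rate=1.0, low_rate=2.0)
late = fuel_used(1000, 900, high_rate=1.0, low_rate=2.0)

if __name__ == "__main__":
    print(f"early descent: {early}, late descent: {late}")
```

With these invented rates, every second by which the descent is brought forward adds the difference between the two burn rates to the total, which is the sense in which an intervention made ‘too early’ undermines expedition.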

This expression of the causes of ineffectiveness accounts for ineffectiveness in the management of a single aircraft. It is proposed that for each additional instance of ineffectiveness, attributed to fuel use, a similar analysis be undertaken. When considering the data over a number of instances of the same problem, a general expression of the problem is possible (Timmer, 1999). Nevertheless, from consideration of the expression above, it is proposed that problem-driven technological evolution can benefit from problem expressions, similar to that illustrated. In the case of the evolution of technologies to support better fuel use management, redesign may commence with consideration of how to support operator judgement in timing the moment of low-level aircraft descent, prior to exit.

VI. DISCUSSION

In this paper, a method has been proposed and illustrated for expressing human-technology worksystem effectiveness during a planning task. The method involves stages of: conceptualising behaviour of interest; modelling the domain to measure work quality; modelling planning behaviour; and considering work quality alongside worksystem behaviour, to enable an expression of effectiveness. Using an illustration of the poor quality management of expedition, ineffectiveness was identified, and expressed in terms of worksystem behaviour, and in a manner proposed to support problem-driven technological evolution.

Of the method, a number of observations may be made. The method comprises a set of general stages, rather than a set of detailed procedures. For example, following conceptualisation, a domain model needs to be constructed. The illustration discusses one such model (not how it was constructed), and it is accepted that there are many alternatives to the one discussed. For the method to express effectiveness successfully, it is merely a requirement that a domain model measure work quality in some way, using some performance criteria for judgement of ineffectiveness (design problems). Without such measurement, comparing worksystem behaviour with work quality during Stage 4 is not possible. One advantage of expressing the method in this manner is that it extends the method’s scope of application. An ATM-like microworld was the focus of the illustration, and planning ahead was conceptualised with respect to the planning behaviour likely to be observed during management of that microworld. Using another domain (the domain of Railway Signal Management (RSM) has also been analysed), it is possible to envisage other behaviours being conceptualised, for example, the notion of planning extended to include strategic plans (for example, to maximise train throughput or even out train flow), as well as tactical plans for discrete interventions (for example, to particular signals). Provided such planning is modelled accurately at Stage 3, the logic of the method, and successful expression of effectiveness, should be maintained.

The success with which the method can be migrated from a microworld to an operational environment is largely a function of the extent to which a) the planning horizon modelling requirements can be met, and b) the target domain can be modelled (and performance measured). In current operational ATM, for example, the method, as it stands, will not scale up, as the verbal protocol data (revealing planning behaviour) cannot be obtained without task interference. In the future, with Datalink technology, method application to operational ATM may be more feasible. In RSM, the voice channel is largely free, except for control-to-train communication, and control room communication – here also, management is largely undertaken by a single signalman (with some communication with other managed line sections and a supervisor).

When this work is considered alongside the work of Boudes and Cellier (1997, 1998), concerning controller ‘anticipation range’ in ATM, some similarities and differences can be identified. Theoretically, Boudes and Cellier’s work has strong similarities, in that they too seek to establish relationships between operator mental representations of the domain and devices, and plans for interventions and flight strip management. However, the method presented here, for modelling the effectiveness of planning horizons, possesses a number of novel components. Firstly, it attempts to characterise the extension of a planning horizon not in temporal terms, but rather in terms of the future states of intervened aircraft (whether or not aircraft will be safe or unsafe, and to what future point (on the sector) such plans extend). Secondly, the method possesses a domain model, with which to assess the adequacy of plans for future interventions. The domain model is crucial to the successful execution of the method, as it enables an assessment to be made of how well the operator is planning, and how adequate the specified plans are for ensuring that management goals are achieved. Without a domain model, a method can only determine that an operator is planning. No assessment of the quality of those plans can be made. As the focus of the method is to support the development of technologies that will improve operator planning abilities, the domain model is crucial for determining existing planning ineffectiveness and how such ineffectiveness can be overcome through re-design.

To conclude, this paper tries to make explicit a number of important relationships that need to be established before the expression of effectiveness is possible. The Theory of the Operator Planning Horizon tries to make clear the behavioural phenomena of concern, and enables the clear derivation of requirements for data that constitute a representation of such an horizon. Likewise, the domain model enables the identification of problems of worksystem performance, and subsequent construction of a worksystem model to establish the behaviour that brought about less than the desired quality of work. Establishing relationships from the data serves to augment the subjective (anecdotal) recall of operator ‘problems’ with technologies, and offers some basis for quantifying problems, and the subsequent generation of priorities for re-design. Such a method is therefore considered a useful tool, to complement existing design practices, to support the explicit expression of the magnitude of a design problem. As such, it is considered to have advanced the ‘design for effectiveness’ approach, since without a well-specified expression of the design problem, there can be no (known) design solution, nor acquisition and validation of design knowledge supporting the transition from one to the other (Long & Timmer, 2001). Phenomena-driven and human performance-driven approaches are unable to support such a design origin and transition.

VII. REFERENCES

Amalberti, R., & Deblon, F. (1992). Cognitive modelling of fighter aircraft process control: a step towards an intelligent on-board assistance system. International Journal of Man-Machine Studies, 36, 639-671.

Anderson, B. F., & Settle, J. W. (1996). The influence of portfolio characteristics and investment period on investment choice. Journal of Economic Psychology, 17, 343-358.

Barnard, P. (1991). Bridging between basic theories and the artifacts of human-computer interaction. In J. M. Carroll (Ed.), Designing interaction: Psychology at the Human-Computer Interface. Cambridge, UK: Cambridge University Press.

Blandford, A., & Young, R. M. (1993). Developing runnable user models: separating the problem solving techniques from the domain knowledge. In J. L. Alty, D. Diaper, & S. Guest (Eds.) People and Computers VIII, Proceedings of HCI’93. Cambridge, UK: Cambridge University Press.

Boudes, N., & Cellier, J.-M. (1997). Evaluating the concept of anticipation range in air traffic control. Paper presented at the Ninth International Symposium on Aviation Psychology, Columbus, Ohio, May.

Boudes, N., & Cellier, J.-M. (1998). Étude du champ d’anticipation dans le contrôle du trafic aérien. Le Travail Humain, 6, 29-50.

Card, S. K., Moran, T. P., & Newell, A. (1983). The Psychology of Human-Computer Interaction. Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Cox, M. (1992). The cognitive aspects of the air traffic control task: a literature review. IAM Report No. 718. Psychology Division, RAF Institute of Aviation Medicine.

David, H. (1997). Radical revision of en-route air traffic control. EUROCONTROL Experimental Centre Report No. 307. http:/www.eurocontrol.fr/public/reports/eecreports/1997/rep307

de Groot, A. D. (1978). Thought and choice in chess. The Hague: Mouton.

Dowell, J., & Long, J. (1998). Conception of the cognitive engineering design problem. Ergonomics, 41, 126-139.

Dowell, J. (1998). Formulating the cognitive design problem of air traffic management. International Journal of Human-Computer Studies, 49, 743-766.

Engeström, Y. (2000). Activity theory as a framework for analyzing and redesigning work. Ergonomics, 43, 960-974.

Flach, J. M. (1998). Commentary. Cognitive Systems Engineering: putting things in context. Ergonomics, 41, 163-167.

Hayes-Roth, B., & Hayes-Roth, F. (1979). A cognitive model of planning. Cognitive Science, 3, 275-310.

Hoc, J-M. (1988). Cognitive Psychology of Planning. London: Academic Press.

Hoc, J-M. (1993). Main features of the human supervision of a long response latency process. In Proceedings of the IEEE/SMC’93 Conference on Systems, Man and Cybernetics, pp. 114-119. Piscataway, NJ: IEEE.

Hollnagel, E. (1998). Commentary. Comments on ‘Conception of the cognitive engineering design problem’ by John Dowell and John Long. Ergonomics, 41, 160-162.

Hughes, J. A., Sommerville, I., Bentley, R., & Randall, D. (1993). Designing with ethnography. Making work visible. Interacting with Computers, 5, 239-253.

Hutchins, E. (1990). The technology of team navigation. In J. Galegher, R. E. Kraut, & C. Egido (Eds.) Intelligent Teamwork: Social and Technological Foundations of Cooperative Work, pp. 191-220. Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Long, J., & Timmer, P. (2001). Design problems for research: what can we learn from ATM-like microworlds. Le Travail Humain, 64.

Meyer, D. E., & Kieras, D. E. (1999). A computational theory of executive cognitive processes and multiple-task performance: Part 1 Basic Mechanisms. Psychological Review, 104, 3-65.

Miaillier, B. (1998). ATM Strategy for 2000⁺: Volumes 1 and 2. EUROCONTROL Report FCO.ET1.ST07.DEL02. http://www.eurocontrol.be/ded/atmstrat

Miller, G. A., Galanter, E. & Pribram, K. H. (1960). Plans and the Structure of Behaviour. New York: Holt, Rinehart & Winston.

Moray, N. (1992). Mental models of complex dynamic systems. In P. Booth, & A. Sasse (Eds.) Mental Models and Everyday Activities, Second Interdisciplinary Workshop on Mental Models, pp. 103-132. Cambridge, UK, March.

Newell, A., & Simon, H. A. (1972). Human Problem Solving. Englewood Cliffs, NJ: Prentice-Hall.

O’Hara, K. P., & Payne, S. J. (1999). Planning and the user interface: the effects of lockout time and error recovery cost. International Journal of Human-Computer Studies, 50, 41-59.

Payne, S. J., Squibb, H. R., & Howes, A. (1990). The nature of device models: the yoked state space hypothesis and some experiments with text editors. Human-Computer Interaction, 5, 415-444.

Pietras, C. M., & Coury, B. G. (1994). The development of cognitive models of planning for use in the design of project management systems. International Journal of Human-Computer Studies, 40, 5-30.

Rasmussen, J. (1986). Information processing and human-machine interaction: An approach to cognitive engineering. New York: North Holland.

Rasmussen, J. (1992). The ecology of work and interface design. In A. Monk, D. Diaper, & M. D. Harrison (Eds.), People and Computers VII. Proceedings of the HCI’92 Conference. Cambridge, UK: Cambridge University Press.

Reason, J. (1998). Commentary. Broadening the cognitive engineering horizons: more engineering, less cognition and no more philosophy of science, please. Ergonomics, 41, 150-152.

Sacerdoti, E. D. (1977). A Structure for Plans and Behaviour. New York: Elsevier.

Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the Royal Society of London, B 298, 199-209.

Suchman, L. (1987). Plans and situated actions: the problem of human-machine communication. Cambridge, UK: Cambridge University Press.

Suchman, L. (1993). Response to Vera and Simon’s situated action: a symbolic interpretation. Cognitive Science, 17, 71-75.

Timmer, P. (1999). Expression of operator planning horizons: a cognitive engineering approach. PhD Thesis, University of London.

Timmer, P., & Long, J. (1997). Separating user knowledge of domain and device: a framework. In H. Thimbleby, B. O’Conaill, & P. Thomas (Eds.) People and Computers XII. Proceedings of HCI’97, pp. 379-395. London, UK: Springer.

Timmer, P., & Long, J. (2000). Plans versus outcomes: establishing the costs of planning. In E. Hollnagel, P. Wright, & S. Dekker (Eds.) Proceedings of the Tenth European Conference on Cognitive Ergonomics, pp. 158-165. Linköping, Sweden: European Association of Cognitive Ergonomics.

Vera, A. H., & Simon, H. A. (1993). Situated Action: a symbolic interpretation. Cognitive Science, 17, 7-48.

Vicente, K. J. (1998). Commentary. An evolutionary perspective on the growth of cognitive engineering: The Risø genotype. Ergonomics, 41, 156-159.

Whitfield, D., & Jackson, A. (1982). The Air Traffic Controller’s ‘Picture’ as an Example of a Mental Model. In G. Johannsen, & J. E. Rijnsdorp, (Eds.) Analysis, Design, and Evaluation of Man-Machine Systems, pp. 45-52. New York: Pergamon.

Wong, W., Sallis, P., & O’Hare, D. (1997). Eliciting portrayal requirements: experiences with the critical decision method. In H. Thimbleby, B. O’Conaill, & P. Thomas, (Eds.) People and Computers XII. Proceedings of HCI’97, pp. 397-415. London, UK: Springer.

Woods, D. D., & Roth, E. M. (1988). Cognitive Systems Engineering. In M. Helander (Ed.), Handbook of Human-Computer Interaction, pp. 3-43. Amsterdam: North Holland.

Woods, D. D. (1988). Commentary: Cognitive Engineering in complex dynamic worlds. In E. Hollnagel, G. Mancini, & D. D. Woods (Eds.) Cognitive Engineering in Complex Dynamic Worlds. London: Academic Press.

Woods, D. D. (1998). Commentary. Designs are hypotheses about how artifacts shape cognition and collaboration. Ergonomics, 41, 168-173.

Young, R. M. (1983). Surrogates and mappings: two kinds of conceptual models for interactive devices. In D. Gentner, & A. L. Stevens (Eds.), Mental Models, Ch. 3. Hillsdale, New Jersey: Lawrence Erlbaum Associates.

* Ergonomics & HCI Unit, University College London, 26 Bedford Way, London, WC1H 0AP (j.long@ucl.ac.uk). Peter Timmer is now with Cambridge Technology Partners (UK) Ltd, Avalon House, 72 Lower Mortlake Road, Richmond-upon-Thames, Surrey, TW9 2JY (peter.timmer@ctp.com).


Planning for Multiple Task Work – an Analysis of a Medical Reception Worksystem

John

April 8, 2011

Becky Hill, John Long, Walter Smith and Andy Whitefield

Ergonomics and HCI Unit, University College London,
26 Bedford Way, London WC1H 0AP

ABSTRACT

This paper presents an investigation of interactive worksystem planning in the multiple task work domain of medical reception. In an observational study of a medical reception worksystem, three different types of plan were identified: the task plan, the procedure plan and the activity plan. These three types of plan were required for effective working in the domain of medical reception, because of the many similar concurrent tasks, the frequency of behaviour switching between tasks and the need for consistency within the worksystem. It is proposed, therefore, that to design effective interactive human-computer worksystems for the domain of medical reception (and possibly for other work domains of a similar nature), the designer must specify the three different types of plan and the relationships between them. The three types of plan in medical reception are discussed in the context of design issues such as the allocation of planning structures.

KEYWORDS

medical reception; planning and control; multiple tasks.

1 INTRODUCTION

This paper presents an observational study of the plans and planning behaviour of a medical reception worksystem. The study was carried out to develop further an existing design-oriented framework of the planning and control of multiple task work (PCMT) (Smith, Hill, Long and Whitefield, 1993 [5]). Section 1 provides some background information about medical reception (MR), and identifies it as a ‘PCMT design problem’. Section 2 describes the particular medical reception worksystem studied and how the observational data were collected and analysed. Section 3 contains the resulting model of PCMT in medical reception, while Section 4 presents more detailed accounts of the three different types of plan used by the medical reception worksystem. The model is intended to have appropriate content to aid in reasoning about design, but is not yet in a suitable form for use within an existing design methodology. Section 5 identifies design issues addressed by the model.

1.1 Medical Reception (in the UK)

Informally, we can identify medical reception worksystems as those interactive systems, comprising combinations of people and office devices, which support the effective interaction between medical practitioners and their patients in medical general practices.

Jefferys and Sachs (1983) [1] have described the emergence of medical reception worksystems in the UK. In 1966, there was a boost to the employment of receptionists and secretaries, because the Family Doctors Charter was implemented, which gave provision for GPs to reclaim 70% of the salaries paid to their staff. Closely related to the increasing employment of receptionists was the growth in the use of appointment systems in general practice, as an appointment system could not be implemented without the employment of receptionist staff. General practices have begun in the last few years to be computerised; however, the number has been small. The British government has more recently introduced a scheme of partial reimbursement of computer costs to increase computerisation.

Medical reception, therefore, presents an example of what might be described as an emerging Human-Computer Interaction (HCI) design problem. Following the approach of Dowell and Long (1989) [2], the medical reception HCI design problem might be stated as: to specify the structures and behaviors of a human-computer interactive medical reception worksystem which will carry out work in the domain of medical reception to a desired level of performance.

1.2 Medical Reception as an Instance of the Planning and Control of Multiple Task Work

There are many different issues to be addressed in the design of medical reception worksystems. The set of issues addressed in this paper are those concerning PCMT. The general aim of the present research is to construct an appropriate model to aid designers’ reasoning about alternative solutions to this medical reception PCMT design problem. The aim of the observational study reported here was to investigate the types of planning and plans used by medical reception worksystems to carry out work effectively.

The computerisation of worksystems typically increases the speed with which simple routine activities can be accomplished, e.g. searching for data, compiling revised/updated tables of information and their communication. The changing nature of routine activities has consequences for the management and supervision of work. Some of the most challenging human factors design issues for computerised systems, therefore, concern these higher-level behaviors which are here referred to as planning and control. The design of planning and control behaviors is particularly important where the worksystem carries out several ongoing tasks concurrently.

1.3 A Design-Oriented Framework of PCMT-MR

The notions of multiple task work and planning and control used in this paper are based on a previously constructed PCMT framework (Smith, Hill, Long and Whitefield, 1992a [3]; 1993 [5]). This section briefly outlines a PCMT-MR framework, the application of the PCMT framework to medical reception, in sufficient detail to understand the resulting model presented in Section 3.

The PCMT-MR framework is based on Dowell and Long’s (1989) conception for an engineering discipline of HCI which expresses the HCI general design problem. The conception makes a fundamental distinction between an interactive worksystem, comprising one or more users and computers, and its domain of application, comprising the transformations carried out by the worksystem which constitute its work. The effectiveness with which work is carried out is expressed by the concept of performance which can be defined as a function of two factors: the quality of the product (i.e. how well the desired state of the domain is achieved compared with the state specified in the goal); and the incurred resource costs (i.e. the resources required by the worksystem in accomplishing the work).

The interactive worksystem, its domain of application and performance.

In medical reception, the worksystem is the receptionist plus devices such as an appointment book, telephone and prescription filing system, a wider notion of worksystem used in order to analyse to-be-computerised systems. The medical reception domain is conceptualised as the provision of support for medical cases, i.e. patients consulting with medical practitioners. Medical reception performance concerns the effectiveness with which support is provided for the medical cases.

Multiple task work.

The medical reception domain is an instance of multiple task work since support is given concurrently for multiple ongoing and temporally overlapping medical cases. A single medical reception task is the transformation of a single medical case object comprising a patient object, medical practitioner object(s), diagnosis object(s) and treatment object(s). This task might require a diverse range of behaviors spread over a long period of time, for example arranging a suitable appointment for patient P, notifying patient P of test results.

Planning and control behaviour.

It has been argued elsewhere (Smith et al, 1992b [4]) that, for an adequate characterisation of the planning and control structures of worksystems which carry out work in complex and dynamic domains, it is necessary to make explicit the relationship between planning, control, perception and execution behaviors. Planning, in medical reception, entails specifying how medical case objects are to be supported by specifying either required transformations of medical case objects and/or required behaviors. Control entails deciding which behaviour to carry out next, such as arranging an appointment for patient P1 or preparing notes for P2. Perception and execution behaviors are, respectively, those whereby the medical reception worksystem acquires information about the medical case objects and those whereby it provides the required support.

Cognitive structures and allocation of function.

The PCMT-MR framework expresses the worksystem at two levels of description. Firstly, the framework describes the cognitive structures of the worksystem, expressed as four processes – perceiving, planning, controlling and executing – and two representations – knowledge-of-the-tasks and plans. This relationship is illustrated in more detail in the description of the PCMT-MR model (Section 3). Secondly, the framework describes the distribution of these cognitive structures across the physically separate user and devices of particular worksystems. The framework therefore allows the construction of alternative models of the distribution of cognitive structures across the user and devices, and thus, it supports reasoning about allocation of function.

2 AN OBSERVATIONAL STUDY OF MEDICAL RECEPTION

This section describes an observational study of a medical reception worksystem. The aim of the observational study was to investigate the types of planning and plans used by medical reception worksystems to carry out work effectively.

2.1 The Medical Reception Worksystem

The medical reception worksystem chosen for the study supported the provision of medical care in a general practice with four doctors and two nurses. This worksystem was physically divided into two different workstations, with two receptionists working from a ‘front desk’ and a ‘back desk’. The front desk workstation comprised a receptionist and devices, such as a telephone and an appointments book. The back desk workstation comprised a second receptionist and devices, such as a prescription book, telephone and a computerised database. The front desk was positioned in front of a hatch through which the receptionist interacted with patients arriving at the surgery. Under the guidance of the receptionist, patients passed from the hatch to a waiting room before seeing a medical practitioner.

2.2 The Nature of the Medical Reception Domain

As described in Section 1.3, the medical reception domain involves multiple task work. These tasks are characterised by:

(i) well-defined, routine sub-tasks;

(ii) variable durations, of between one day and several weeks;

(iii) a high frequency of autonomous events; that is, task-relevant events which occur independently of any worksystem behaviour, for example: the arrival of a patient at the hatch or an incoming telephone-call.

Figure 1: The Model of PCMT-MR (worksystem structures: representations and processes; domain of application: multiple task work).

2.3 Data Collection

Video-recordings were taken of the two workstations. Two video cameras were used simultaneously, one camera focused on the appointment-booking system of the front desk, while the other camera recorded the interactions within the whole reception area including both desks. Video-recordings were taken, both during and outside surgery hours, for one morning and afternoon in which time one pair of receptionists was relieved by another. At a later date, after initial analysis, an interview was carried out with one receptionist, to obtain clarification of selected details concerning the work. Only the analysis of video-recordings is reported here, although this analysis was assisted by the interview.

2.4 Data Analysis

Only the two videos recorded in the morning were analysed, because sufficient data were gathered from these two videos. The following analysis was carried out on both videos. From the 240 minutes of video-recording, a sequence of between 30 and 90 minutes was selected for analysis. This selection was based mainly on the criteria that (i) the observed behaviors were interpretable, and (ii) the analysed period appeared to be busy in support of medical cases (and so was presumed to include behaviors of interest).

The first stage of the analysis was the documentation of behaviors and task-related events to a level of description considered to be at, or below, that necessary for the identification of planning and control behaviors. This first description allowed the identification of: (i) a low-level description of the physical domain objects (e.g. prescription); (ii) a low-level description of the physical worksystem devices (e.g. ‘phone 1; prescription box). Behaviours and events were documented in chronological sequence in the manner illustrated as follows:

telephone 1 BUZZ

receptionist 1 PICK UP pencil

receptionist 1 PICK UP receiver ‘phone 1

receptionist 1 SELECT-LINE ‘phone 1

receptionist 1 (over telephone): Hello can I help?

event: P PUT prescription in prescription box

receptionist 1 (over telephone): Dr S?

From this first analysis, it was possible to identify sequences of behaviors which were generic to particular activities carried out by the receptionist worksystem, for example appointment-booking.

3 A MODEL OF PCMT-MR

Section 1.3 described the conceptual framework of PCMT-MR constructed prior to carrying out the observational study outlined in Section 2. Sections 3 and 4 now describe the model of PCMT-MR constructed by using the observations of medical reception tasks and behaviors to instantiate the concepts of the framework.

The modelling of the observations of medical reception can be divided into two parts. First, a description of the medical reception worksystem and domain was generated, which is presented in this section. Second, a detailed description of the observed plans was constructed, presented in Section 4. Figure 1 provides a selective overview of the model of PCMT-MR which is now described.


3.1 The Medical Reception Worksystem

The expression of the medical reception worksystem in Figure 1 shows cognitive structures taken from the PCMT-MR framework (described in Section 1.3). These cognitive structures are expressed at a generic level; that is, they depict the cognition of the medical reception worksystem prior to any allocation of function between the receptionist and devices. The relationships between the cognitive structures in Figure 1 embody the definitions of the planning and control behaviors described in Section 1.3. (For more detail see Smith et al, 1992b [4]). The plan representation structure in Figure 1 has been ‘opened-up’ to show the different types of plan identified in the study which are described in detail in Section 4.

3.2 The Medical Reception Domain

Following the PCMT-MR framework, the medical reception domain is expressed as those objects whose transformation constitutes the work of medical reception. Thus, the domain contains multiple medical case objects, each medical case object comprising a patient object, medical practitioner object(s), diagnosis object(s) and treatment object(s). Each task constitutes the transformation of a single medical case object with respect to the values of a number of attributes. In order to transform the medical case object attributes, the attributes of the medical case sub-objects (which are in a part-whole relationship with the medical case object) must be transformed. Tables 1-3 describe the transformation of the objects associated with the sub-task of appointment-booking. One of the attributes of a medical case object which must be transformed is appointment suitability for the patient. To transform the value of this attribute, the values of some of the attributes of the patient sub-object must be transformed.
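The part-whole structure described above can be illustrated with a minimal sketch in Python. It is not part of the original model: the class names, the `book_appointment` function and the rule used to set the case-level attribute are assumptions made for illustration only. The attribute names are taken from those mentioned elsewhere in the paper (appointment-time, appointment-practitioner, appointment-requirements-who, availability).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomainObject:
    """A domain object described by named attributes and their current values."""
    name: str
    attributes: Dict[str, Optional[str]] = field(default_factory=dict)


@dataclass
class MedicalCase:
    """A medical case object and its sub-objects (part-whole relationship)."""
    patient: DomainObject
    practitioners: List[DomainObject]
    diagnoses: List[DomainObject] = field(default_factory=list)
    treatments: List[DomainObject] = field(default_factory=list)
    attributes: Dict[str, Optional[str]] = field(default_factory=dict)


def book_appointment(case: MedicalCase, practitioner: str, time: str) -> None:
    """Transform patient sub-object attributes, then the case-level attribute."""
    case.patient.attributes["appointment-practitioner"] = practitioner
    case.patient.attributes["appointment-time"] = time
    # Assumption: suitability is judged only by matching the requested practitioner.
    wanted = case.patient.attributes.get("appointment-requirements-who")
    suitable = wanted is None or wanted == practitioner
    case.attributes["appointment-suitability"] = (
        "suitable" if suitable else "unsuitable"
    )


case = MedicalCase(
    patient=DomainObject("patient P", {"appointment-requirements-who": "Dr X"}),
    practitioners=[DomainObject("Dr X", {"availability": "time t"})],
)
book_appointment(case, "Dr X", "time t")
print(case.attributes["appointment-suitability"])
```

The sketch makes one point only: the case-level attribute (appointment suitability) changes as a consequence of changes to the attributes of the patient sub-object, which is the dependency the paragraph describes.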

The study revealed that the required transformation of each medical case object could be divided into a number of sub-transformations concerning particular sets of attributes. The division of the tasks into sub-transformations was consistent across all the tasks, and therefore the sub-transformations could be labelled generic sub-tasks. The generic sub-tasks identified in this study of medical reception were: appointment-booking, preparation of repeat prescriptions, registration of new patients, preparation (and updating) of medical notes for medical practitioners, and notification of patients’ test results.

Associated with the set of identified generic sub-tasks there was a corresponding set of activities. An activity is that set of behaviors which carry out a generic sub-task.

The activities identified in medical reception were: booking of appointments, preparing repeat prescriptions, registering new patients, preparing (and updating) medical notes for medical practitioners, and notifying patients of test results. Due to limitation of space, only appointment-booking will be described in detail.

Figure 1 shows only those attributes which apply to the generic sub-task of appointment-booking. Attributes may be affordant or dispositional. Affordant attributes are transformed by the worksystem; their transformation constitutes the work done. Dispositional attributes are relevant to the work but their transformation does not itself constitute work (often dispositional attributes do not change their values). The attributes marked with an asterisk (*) in Figure 1 and Tables 1-3 are dispositional, for appointment-booking.

4 PLANS AND PLANNING IN THE MEDICAL RECEPTION WORKSYSTEM

Following the PCMT-MR framework, plans are representations of how tasks are to be accomplished, specified to some level of completeness, some level of detail and in some format. In the study of medical reception, it was possible to identify three different plans employed by the worksystem. This section describes these three plans in turn and shows how they were interpreted as instances of three general types of plan: a task plan, an activity plan and a procedure plan.

4.1 The Task Plan

The receptionists used two appointment books (one for doctors and one for nurses) to represent and record details of patient appointments with the medical practitioners. Figure 2 schematically depicts the information represented in the appointment book for doctors: names of patients occupying particular appointment slots; whether or not the patient had entered the waiting room; slots which were still available; slots which the medical practitioners wanted to be left open; slots which could be used in emergencies. The receptionists also used what can be called ‘mental markers’; that is, they made mental notes of temporarily significant appointment slots, such as the next available appointment of a particular medical practitioner or a slot which was in the process of being offered to a patient but not yet accepted.

From other perspectives, the appointment books might be regarded as plans for the whole practice. In the present analysis, the appointment books plus the associated mental markers were regarded as plans of the medical reception worksystem because they guided its behaviour, for example, they represented the patients whose medical notes needed to be prepared for the doctor, and the patients who should be let into the waiting-room. In terms of the PCMT-MR framework, the appointment books were plans which represented information about domain object attribute values. Specifically, they represented information about the patient object attributes of appointment-time and appointment-practitioner, and medical practitioner object attributes of availability (see Figure 1).

The information represented in the appointment books was specific to particular objects, i.e. patients and medical practitioners, in the medical reception domain and was therefore specific to particular tasks, i.e. transformations of medical cases. The appointment books, with associated mental markers, were therefore identified as instances of a generic type of plan – the task plan. In general, task plans are specifications of either behaviors or domain object transformations relating to specific task instances. The appointment books were therefore partial task plans.
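As an illustration of the task plan described in this section, the following sketch represents an appointment book whose slots carry the kinds of information listed above, together with one ‘mental marker’ (a slot on offer but not yet accepted). It is a sketch under stated assumptions, not the representation used in the study: the class and method names, the slot states and the clock times are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class SlotState(Enum):
    AVAILABLE = "available"
    BOOKED = "booked"
    KEEP_OPEN = "keep open"   # practitioner wants the slot left open
    EMERGENCY = "emergency"   # usable in emergencies only


@dataclass
class Slot:
    state: SlotState = SlotState.AVAILABLE
    patient: Optional[str] = None
    in_waiting_room: bool = False
    offered_to: Optional[str] = None  # mental marker: offered, not yet accepted


class AppointmentBook:
    """Partial task plan: slots keyed by (practitioner, time)."""

    def __init__(self) -> None:
        self.slots: Dict[Tuple[str, str], Slot] = {}

    def add_slot(self, practitioner: str, time: str,
                 state: SlotState = SlotState.AVAILABLE) -> None:
        self.slots[(practitioner, time)] = Slot(state=state)

    def next_available(self, practitioner: str) -> Optional[str]:
        """Earliest slot that is available and not currently on offer."""
        times = sorted(
            time for (who, time), slot in self.slots.items()
            if who == practitioner
            and slot.state is SlotState.AVAILABLE
            and slot.offered_to is None
        )
        return times[0] if times else None

    def offer(self, practitioner: str, time: str, patient: str) -> None:
        self.slots[(practitioner, time)].offered_to = patient

    def accept(self, practitioner: str, time: str) -> None:
        slot = self.slots[(practitioner, time)]
        slot.patient = slot.offered_to
        slot.offered_to = None
        slot.state = SlotState.BOOKED


book = AppointmentBook()
book.add_slot("Dr X", "09:00")                      # hypothetical times
book.add_slot("Dr X", "09:10")
book.add_slot("Dr X", "09:20", SlotState.EMERGENCY)

first = book.next_available("Dr X")
book.offer("Dr X", first, "patient P")   # marker: on offer to patient P
second = book.next_available("Dr X")     # the offered slot is skipped
book.accept("Dr X", first)
```

Keeping the marker inside the slot record is one possible design choice; it corresponds to the option, raised in Section 5, of incorporating the mental markers into a device-based appointment book.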

4.2 The Activity Plan

As described in Section 3.2, the medical reception worksystem carried out a number of different activities, e.g. appointment-booking, preparing medical notes. From the video-recording and interview, it was possible to identify that the receptionists had a shared daily schedule of activities, mentally represented, to be carried out by the front and back desk receptionists. Figure 3 shows the activity schedule of the observed medical reception worksystem on the day of recording. This schedule was not rigidly adhered to as many activities, such as notifying of test results, were carried out in direct response to autonomous events such as patients telephoning the surgery.

The information represented in the activity schedule was specific to the carrying out of particular activities, as opposed to particular tasks. The activity schedule was therefore identified as an instance of a generic type of plan – the activity plan. In general, activity plans are specifications of sequences of activities to be carried out where each activity is a set of behaviors relating to a particular generic sub-task of the domain (see Section 3.2).

4.3 The Procedure Plan

Through analysis of the video-recordings, supported by interviews, it was possible to identify that the receptionist went through well-established sequences of behaviors when carrying out a particular activity. Thus the receptionists had mental routines, with in-built conditionals, for carrying out each activity, such as preparing medical notes, booking of appointments, preparing repeat prescriptions, etc. These mental routines, which represented information about behaviors and their contingencies for particular activities, were identified as instances of a generic type of plan – the procedure plan. In general, a procedure plan specifies an effective sequence of behaviors, and their contingencies, for carrying out a particular activity which relates to a generic sub-task of the domain (see Section 3.2).

As an illustration, the procedure plan for booking of appointments is now described in detail. Figure 4 shows a flow diagram of behaviors, with associated conditionals, carried out in the activity of booking of appointments. The conditionals imply other behaviors; for example, the first conditional in Figure 4 implies that the controlling process must initiate the behaviour of reading the contents of Knowledge of tasks and, if necessary, of perceiving the patient’s requirement for appointment time (see Figure 1). Thus this procedure plan for booking of appointments describes the behaviors of the worksystem in terms of both the planning, control, perception and execution behaviors and the transformation of the medical case objects that constitute the generic sub-task of appointment-booking (see Section 4.4).

4.4 The Relationship between the Different Plans

The following scenario of an appointment being booked illustrates the relationship between the three plans shown in Figure 1, and shows how they operated in combination to guide the worksystem’s behaviour.

At the beginning of the day, the controlling process reads the activity plan – which specifies that receptionist R should carry out booking of appointments from the front desk during the morning (Figure 3) – and sets the parameters of the perceiving, executing and planning processes appropriately.

Later, an autonomous event occurs associated with the domain: patient P telephones the surgery requiring an appointment. The controlling process then reads from the procedure plan for booking of appointments (Figure 4) which guides control decisions to activate the following sequence of behaviors:

– Perception: perceiving the values of patient P’s attributes and updating knowledge-of-tasks with the following attribute values:

appointment-requirements-who: own Dr (Dr X)

appointment-requirements-when: today

problem type: not emergency

– Planning: selecting and (mentally) marking a possible appointment slot in the task plan (i.e. the appointment book): Dr X, time t

– Execution: offering the selected appointment to patient P, i.e. attempting to transform P’s attribute values to: appointment-practitioner: Dr X; appointment-time: time t

– Perception: updating knowledge-of-tasks to register the acceptance of the appointment and patient P’s name.

– Planning: adding a representation of the agreed appointment to the task plan

– Perception: confirming the appointment details with patient P.
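The scenario above can be traced with a minimal sketch showing how the three plans might operate in combination. It rests on several assumptions that are not in the original: the activity plan and task plan are reduced to dictionaries, the procedure plan is simplified to a fixed sequence with a single conditional (whether the patient accepts the offer), and names such as `control`, `booking_procedure` and the `accepted-slot` key are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Slot = Tuple[str, str]  # (practitioner, time)

# Activity plan: who carries out which activity, and when (hypothetical entry).
activity_plan: Dict[Tuple[str, str], str] = {
    ("receptionist R", "morning"): "booking of appointments",
}
# Task plan: the appointment book, reduced here to slot -> patient (None = free).
task_plan: Dict[Slot, Optional[str]] = {("Dr X", "time t"): None}
# Knowledge-of-tasks: attribute values perceived so far, per patient.
knowledge_of_tasks: Dict[str, Dict[str, str]] = {}
# Record of the behaviours activated, in order.
trace: List[str] = []


def booking_procedure(patient: str, requirements: Dict[str, str],
                      accepts: Callable[[Slot], bool]) -> Optional[Slot]:
    """Procedure plan for booking of appointments (simplified to one conditional)."""
    # Perception: record the patient's requirements in knowledge-of-tasks.
    knowledge_of_tasks.setdefault(patient, {}).update(requirements)
    trace.append("perception: requirements")
    # Planning: select (mentally mark) a free slot with the required practitioner.
    wanted = requirements["appointment-requirements-who"]
    slot = next((s for s, booked in task_plan.items()
                 if s[0] == wanted and booked is None), None)
    trace.append("planning: select slot")
    if slot is None:
        return None
    # Execution: offer the slot; the patient may decline.
    trace.append("execution: offer slot")
    if not accepts(slot):
        return None
    # Perception: register the acceptance.
    knowledge_of_tasks[patient]["accepted-slot"] = f"{slot[0]}, {slot[1]}"
    trace.append("perception: acceptance")
    # Planning: add the agreed appointment to the task plan.
    task_plan[slot] = patient
    trace.append("planning: record appointment")
    # Perception: confirm the details with the patient.
    trace.append("perception: confirm details")
    return slot


procedure_plans = {"booking of appointments": booking_procedure}


def control(receptionist, period, patient, requirements, accepts):
    """Controlling process: activity plan -> procedure plan -> behaviours."""
    activity = activity_plan.get((receptionist, period))
    procedure = procedure_plans.get(activity)
    if procedure is None:
        return None  # no procedure is scheduled for this receptionist and period
    return procedure(patient, requirements, accepts)


booked = control(
    "receptionist R", "morning", "patient P",
    {"appointment-requirements-who": "Dr X",
     "appointment-requirements-when": "today",
     "problem-type": "not emergency"},
    accepts=lambda slot: True,
)
print(booked)
print(trace)
```

The sketch does not model the transformation of the patient object’s attributes in the domain; it shows only the order in which the controlling process consults the activity plan, the procedure plan and the task plan.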

5 THE ROLE OF THE MODEL IN SUPPORTING THE DESIGN OF MEDICAL RECEPTION WORKSYSTEMS

The study of medical reception showed how the worksystem used three types of plan to carry out its work effectively. The relationship between the use of these different plan types and performance, i.e. the effectiveness with which the multiple task work was carried out, will now be described, along with the implications for the design of interactive worksystems.

– The Task Plan: observed in the form of patient appointment books, supported the effective carrying out of the many ongoing tasks by:

1) giving guidance for the carrying out of behaviors relating to specific tasks, e.g. whether to admit patient P1 to the waiting room, preparing medical notes for P2;

2) co-ordinating different tasks e.g. ensuring that appointments were unique for each task.

– The Activity Plan: observed in the form of a (mentally represented) daily schedule of activities, supported the effective carrying out of tasks by:

1) supporting large-scale sharing of effort across separate tasks; e.g., when carrying out the activity of preparing repeat prescriptions, all of the medical notes for the patients requiring repeat prescriptions would be collected together at one time, thus reducing the behavioral costs to the worksystem;

2) co-ordinating the activities with the task-relevant changes in the domain; e.g., the activity of preparing repeat prescriptions was carried out during surgery hours, so that the prescriptions were ready for the doctors to verify and sign when the surgeries finished.

– The Procedure Plan: observed in the form of mental routines, supported the effective carrying out of repetitive sub-tasks which were generic across tasks (such as booking of appointments) by:

1) providing quick responses in a domain where there was a very high frequency of autonomous events (patients arriving, incoming telephone calls);

2) maintaining consistency which supported the rotation of the four receptionists around the two medical reception workstations;

3) supporting shared user behaviour, such that if one workstation was left unattended because a receptionist was busy, the other receptionist on that shift could take over at the unattended workstation.

In computerising, and therefore redesigning, the medical reception worksystem described in this paper, a designer should specify how the identified plans will be supported in the new design. For example, it may be advisable to reallocate some of the mental plans to computerised devices, by:

i) making a partial procedure plan for booking of appointments device-based;

ii) incorporating the currently used mental markers into a device-based appointment book.

These two examples would enhance the effectiveness of the worksystem by aiding in the training of new receptionists and reducing their mental workload.

Therefore, in general, designs for medical reception should specify:

(i) instances of all three plan types;

(ii) the relationship between the different plans;

(iii) the allocation of the plans across the receptionist and the physically separate devices of the worksystem.

The generality of the plan types identified in the study reported here is uncertain at present. However, it might be suggested that the same issues will arise in the design of worksystems which carry out work in multiple task domains which are similar in nature to that of medical reception.

ACKNOWLEDGEMENT

The work reported herein was supported by the Joint Councils Initiative in Cognitive Science/HCI, grant no: SPG 8825634.

REFERENCES

[1] Jefferys, M. and Sachs, H. Rethinking General Practice: Dilemmas in Primary Health Care. London: Tavistock, 1983.

[2] Dowell, J. and Long, J. Towards a conception for an engineering discipline of human factors. Ergonomics, 32, (1989), 1513-1536.

[3] Smith, M.W., Hill, B., Long, J.B. and Whitefield, A.D. The Planning and Control of Multiple Task Work: a Study of Secretarial Office Administration. In Proceedings of the Second Interdisciplinary Workshop on Mental Models, Cambridge, (1992a), 74-83, in press.

[4] Smith, M.W., Hill, B., Long, J.B. and Whitefield, A.D. Modelling the Relationship Between Planning, Control, Perception and Execution Behaviors in Interactive Worksystems. In D.Diaper, M.Harrison and A.Monk (Eds) People and Computers VII; Proceedings of HCI ’92. Cambridge University Press, 1992b.

[5] Smith, M.W., Hill, B., Long, J.B. and Whitefield, A.D. A Design-Oriented Framework of the Planning and Control of Multiple Task Work. Submitted for publication, 1993.


HCI is more than the Usability of Web Pages: a Domain Approach

John

May 23, 2011

John Long

Emeritus Professor, Ergonomics and HCI Unit, University College London, 26 Bedford Way, London, WC1H 0AP, UK (now at UCL Interaction Centre)

Keywords: HCI Engineering, Domain of work, Design problems

Abstract

Usability continues to be central to Human-Computer Interaction (HCI), in spite of recent developments associated with user experience, user pleasure, etc. The Internet continues to dominate the introduction of a wide range of new technologies and rightly has become central to HCI. One current view of HCI, then, is in terms of the usability of Web pages. This view is appealing and there can be no doubt that usable Web pages would be a laudable achievement for HCI. This paper supports such a view, but argues that it is too limited. HCI, as an engineering discipline, supports user interaction design, but for effectiveness. The latter necessarily includes usability – how easy it is to use a computer, but also task quality – how well the work is performed by the interactive user-computer worksystem. Easy-to-use computers may produce poor work, hard-to-use computers may produce good work. Usability alone fails to distinguish between these two cases, but differentiation is essential for HCI design and evaluation, because both usability and task quality need to be as desired. This paper proposes a domain approach to make good the limitation of HCI as the usability of Web pages. The domain of work is represented as objects, attributes, values and their desired changes – task quality. Implicit domain models (medical reception; military command and control; and domestic energy management) and explicit domain models (off-load planning; emergency management; and air traffic management) are described. Each model is illustrated in terms of its potential support for design as the diagnosis of design problems and the prescription of design solutions, both as they relate to the task quality of the associated domains of work. It is concluded that HCI is indeed more than the usability of Web pages and that the domain approach to making good the limitation shows promise. In particular, implicit domain models are considered to support design in the short-term, while explicit domain models are considered to support design and research in the long-term.

1. Introduction

Usability and Web pages continue to be central to HCI. However, the view that HCI is no more than the usability of Web pages is rejected as too limited. Nevertheless, usable Web pages constitute a laudable goal for HCI. This paper proposes a domain approach to HCI to make good this limitation.

1.1 Usability and HCI

HCI has evolved considerably since those early days. A number of developments have extended its scope. For example, computer-supported co-operative work (CSCW) emphasises the interdependence between computer users. Ubiquitous computing (Ubicomp) emphasises the variety of computing devices and their locations. Virtual reality (VR) emphasises the importance of ‘presence’ in its simulations. However, in spite of these developments, HCI continues to retain usability as a critical expression of performance.

The concept of usability has also evolved since the early days of HCI. Pleasure has been proposed as a concept, relating people to computing products, which goes beyond usability (Green and Jordan, 2002). User experience claims to be a more complete expression of the same relationship (Light & Wakeman, 2001). However, these new concepts do not claim that usability is unimportant, or even unnecessary, only that there are additional people-product relations. Usability, then, remains central to HCI. It is concluded that in spite of extensions to both the scope of HCI and usability, over the years since their inception, the latter remains central to the former.

1.2 Web Pages and HCI

New technologies continue to characterise and to extend the scope of HCI. Personal computers soon became networked. Broadband communication integrated data processing and communications. Multi-media synthesised text and graphics. Multi-modality combined user inputs from speech, gesture and keyboard. Other new technologies include: distributed networks; mobile computing and telephony; smart buildings; wearable computing; augmented reality, etc. However, some of these new technologies are only in their infancy. Other technologies have proved less successful than originally expected. Yet other technologies have achieved only modest market penetration. In contrast, the new technology that is the Internet has continued to develop, offers a number of successful services (although not all are successful) and has achieved deep market penetration. It is unsurprising, then, that Web pages have rightly become central to HCI.

1.3 Usability, Web Pages and HCI

Given the centrality of both usability (Section 1.1) and Web pages (Section 1.2) to HCI, one might be forgiven for thinking that HCI is essentially the usability of Web pages. This view is appealing and there can be no doubt that usable Web pages would be a laudable and notable achievement for HCI. The notability would derive both from the importance of the achievement for users and from the increased competition of other professionals, such as graphic designers, software engineers, marketeers and financiers to influence the design and evaluation of interactive computer systems. The view that HCI is the usability of Web pages is shared here. However, the proposition is understood as ‘HCI is not less than the usability of Web pages’, but not ‘HCI is only the usability of Web pages’. The scope of HCI must include usability, a primary expression of performance and Web pages, the dominant New Technology. HCI, then, is not limited to, but more than the usability of Web pages. The aim of this paper is to propose an approach – that of the domain – for HCI to go beyond the usability of Web pages.

2. General Domain Approach to HCI

The limitations of the view that HCI is no more than the usability of Web pages are analysed in terms of: completeness; coherence; and fitness-for-purpose. A domain approach is proposed, which makes good these limitations in terms of the three criteria.

2.1 Limited View of HCI

It has been argued earlier that the view of HCI as only the usability of Web pages is too limited. It is worth setting out these limitations in more detail. The latter both provide support for the argument and set up criteria against which to assess alternative approaches to HCI, such as that of the domain, as proposed here. The criteria are: completeness; coherence; and fitness-for-purpose. These criteria derive from the concept of a discipline, whose knowledge, acquired by research, supports its practices (Long, 1996).

First, the usability of Web pages is an incomplete expression of HCI performance. Ideally, usable interactive user-computer systems also achieve their goals. However, HCI evaluations routinely identify usable systems, which fail to achieve their goals and hard-to-use systems, which do achieve their goals. Usability, then, can only constitute one aspect of performance. There also needs to be some expression of the work an interactive system performs and how well it performs that work. The usability of Web pages is too limited a view of performance. If the Web pages provide a service, such as e-commerce, performance needs to express how well the service is effected, for example, goods purchased.

Usability also fails to reflect the computer’s contribution to performing work. For example, a hard-to-use system might be made more easy to use by simplification of the interface. Simplification might be achieved by reducing inappropriate computer behaviours (and so the code supporting them) or increasing appropriate behaviours (and so the code supporting them). In both cases, usability is the same, but the computer contribution differs. This difference should be reflected in a more complete expression of the interactive worksystem’s performance.

Second, the usability of Web pages is an incoherent expression of HCI performance. Usability appears to be a property of the computer. For example, a hard-to-use computer can be re-designed to be easier to use. But computers can also become easier to use by training, learning and practice with no change in the computer itself. Training, learning and practice all modify properties of the user and not the computer. So, usability would be more coherently expressed, if it referred to the user contribution to the interaction. For example, if usability were expressed as effort, or some such, the latter might vary as a function of re-design, training, etc.

Third, and last, the usability of Web pages is not fit-for-purpose for expressing HCI performance. HCI, as an engineering discipline (Long and Dowell, 1989), acquires and validates knowledge by research to support HCI design. HCI design (and evaluation) can be understood as the diagnosis of design problems and the prescription of design solutions. HCI, then, attempts to ‘design for effectiveness’, indeed to ‘design users interacting with computers to perform effective work’ (Dowell and Long, 1989 and 1998). The usability of Web pages fails to express such effectiveness and so is not fit-for-purpose for supporting HCI performance.

In summary, then, the usability of Web pages is neither complete, coherent, nor fit-for-purpose for expressing HCI performance for an HCI discipline intent on designing for effectiveness. Those three criteria need to be met by any alternative approach to HCI, including the domain approach, proposed here.

2.2 Particular Domain Approach to HCI

The domain approach to HCI originates in proposals made by Dowell and Long (1989 and 1998). According to the latter, HCI is an engineering discipline, whose research acquires knowledge to support the solution of the general problem of design, having the particular scope of users interacting with computers to perform effective work. Following Dowell and Long, users interacting with computers constitute an interactive worksystem, composed of at least two separate, but interacting, sub-systems, namely users and computers. Such interactive systems have a domain of application. The domain of work is where work originates, is performed and has its consequences. Goals are allocated to worksystems by organisations. A domain is distinct from, and delimits, a worksystem. Worksystems are designed to perform effective work.

Work is conceived as comprising one or more objects, constituted of attributes, which have values. Goals express a requirement for change in the values of these attributes. Interactive subsystems consist of user behaviours interacting with computer behaviours. These behaviours are supported by mutually exclusive user structures and computer structures and are executed to perform work effectively. Effectiveness is expressed by the concept of performance, that is, how well a system achieves its goals – task quality, and the system costs that are incurred in so doing. Costs are incurred by both the user and the computer, may be physical and mental/cognitive/abstract and can be thought of as workload – for present purposes. The domain approach thus adds the separate concept of work to the system effecting the work. In addition, it applies the concept of work as object/attribute/value transformation (by the worksystem) to performance. It further distinguishes user and computer workloads. The conception is entirely motivated and rationalised by the requirements of an HCI engineering discipline to support design for performance.
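The conception of work as objects, attributes and values, with goals as required changes in those values, can be sketched as a small data structure. The following is a minimal illustration only, not the notation of Dowell and Long: the class names, the `task_quality` measure (fraction of goals met) and the example values are assumptions introduced here, with the attribute names borrowed from the medical reception illustration of Section 3.1.1.

```python
from dataclasses import dataclass, field


@dataclass
class DomainObject:
    """A domain object with named attributes, each holding a current value."""
    name: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Goal:
    """A required value for one attribute of one domain object."""
    object_name: str
    attribute: str
    desired_value: str


def unmet_goals(objects, goals):
    """Return the goals whose desired value differs from the current value."""
    by_name = {obj.name: obj for obj in objects}
    return [
        g for g in goals
        if by_name[g.object_name].attributes.get(g.attribute) != g.desired_value
    ]


def task_quality(objects, goals):
    """Fraction of goals currently achieved (a hypothetical measure)."""
    return 1 - len(unmet_goals(objects, goals)) / len(goals)


# Hypothetical state of one medical case object.
medical_case = DomainObject(
    "medical case",
    {"suitability": "suitable", "expedition": "untimely"},
)
goals = [
    Goal("medical case", "suitability", "suitable"),
    Goal("medical case", "expedition", "timely"),
]

print(task_quality([medical_case], goals))
print([g.attribute for g in unmet_goals([medical_case], goals)])
```

Keeping the goals separate from the object state reflects the distinction drawn above between the domain of work and the requirement for change allocated to the worksystem. Worksystem costs (user and computer workload) are deliberately left out of this sketch.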

This view of HCI is considered to meet the three criteria of completeness, coherence and fitness-for-purpose, identified earlier, and so make good the limitations of the view of HCI as only the usability of Web pages.

First, the view is complete. Performance of an interactive work-system is expressed in terms of task quality – how well the work is carried out and worksystem costs, the workload involved in carrying out the work that well. System workload is composed of user workload and computer workload. If usability is considered to be expressed by user workload, then task quality and computer workload are the additional concepts, required for a complete view of HCI.

Second, the view is coherent. User workload is a property of the user; computer workload is a property of the computer. An increase or decrease in the one can be accompanied by an increase or decrease in the other. In general, the trend in current design is to increase computer workload for a decrease in user workload (given the decreasing cost of computer hardware and software). This current trend in design is well expressed by differentiated worksystem costs, but not by usability, either alone or as a presumed property of the computer. Further, design, training or user selection can all be expected to change user workload, but only design can change computer workload.

Third, and last, task quality and differentiated system workload are fit-for-purpose for expressing HCI performance, because design problems can be diagnosed as ineffective performance, and thus design solutions can be prescribed as effective performance. HCI, as an engineering discipline, can be supported in its intent ‘to design for effectiveness’. Performance-driven design is taken to be the hall-mark of an engineering discipline (Newman, 1994).

In conclusion, then, the domain approach to HCI proposed here meets the three criteria of completeness, coherence and fitness-for-purpose that were used to identify the limitations of the view of HCI as only the usability of Web pages. This domain approach to HCI is now illustrated.

3. Domain Approach Illustrations

Each illustration of the domain approach to HCI describes a domain model and its potential to support design as the diagnosis of design problems and the prescription of design solutions. Only task quality is illustrated as an expression of performance. User and computer costs as workload refer to the worksystem and not the work. User workload is a more coherent expression of usability. Only task quality derives from the domain and so goes beyond usability. Hence, the focus of the illustrations on task quality.

3.1 Implicit Domain Model Illustrations

Modelling the domain of work, following the domain conception of Dowell and Long (1989), necessarily produces an explicit domain model. However, the domain conception can also be used informally to identify domain aspects, implicit in descriptions of interactive worksystems. Such descriptions derive from domain expertise and may be provided by domain experts or worksystem documentation, such as operational procedures, training manuals, etc. Implicit domain models are important for three reasons. First, they are readily available, given access to domain experts and/or worksystem documentation. Second, implicit domain models can be used to support design and evaluation as illustrated here. Third, such models provide the starting point for developing explicit domain models. Implicit domain models use a wide range of representations, including text, diagrams, etc. Note that because the domain models are implicit, these representations include descriptions of the worksystem and other aspects, excluded from the domain conception of Dowell and Long.

3.1.1 Domain of Medical Reception

Domain models of Medical Reception (MR) are implicit in the documentation of medical reception itself (for example, Drury, 1981; Jeffreys and Sachs, 1983, etc.). MR involves combinations of people and office devices (including computers, at least in the UK) in the support of interactions between medical practitioners and their patients in medical general practices. Receptionists are central to the organisation of MR. For example, by 1981 over 70% of medical general practitioners in the UK employed receptionists to operate doctor-patient appointment systems, in a group practice (comprising three or more doctors), during surgeries (that is, when the doctors see patients). Most of the receptionists’ time is spent dealing with: requests for surgery appointments and home visits, either by telephone or in person; patients, who turn up with or without appointments; telephone requests to speak to doctors and other medical health workers; registration of new patients; and other patient enquiries and complaints. When making an appointment, the receptionist tries to satisfy the patient’s request for a particular doctor, without keeping them waiting for longer than is acceptable. (At busy times, there may be ten or more patients queuing for an appointment, either on the telephone or in person.) It is difficult, under such conditions, to keep track of the order in which patients present themselves, by telephone or in person, so receptionists can become confused and either have to interrupt and recommence appointment-booking, keeping patients waiting longer than is acceptable, or fail to respect the ‘first-come-first-served’ principle underlying ‘fair’ queuing.

Domain models can also be implicit in the transcribed videotapes of MR observational studies (Hill, Long, Smith and Whitefield, 1995). Consider this example of protocol data from the latter study:

Telephone 1 : flash

Receptionist 1 : pick up receiver Telephone
: (over telephone) say: “Can I help”
: pick up green card (?baby registration card) from hatch
: read green card
: put green card on desk top
: (over telephone) say: “No, I’m sorry I’ve got nothing”.

Nurse : look at nurse appointment book.

Receptionist 1 : (over telephone) say: “Did you say it’s an eye infection?”
: (over telephone) say: “I’ve got literally nothing”.
: (over telephone) say: “All I can offer you is 11:45 this morning – an emergency appointment”.

Receptionist 2 : search prescription – box

Receptionist 1 : (over telephone) say: “Or 10 past 10 with Dr. J tomorrow morning”.

Receptionist 2 : take out prescription

Receptionist 1 : (over telephone) say: “O.K., your name again?”
: write in appointment book
: (over telephone) say: “O.K., 11:45 Dr. I”.
: (over telephone) say: “Thank you. ‘Bye”.
: replace receiver Telephone 1.
: (to hatch) say: “Have you not got a card when you registered the baby

On the basis of MR documentation and the protocol data, the following set of user requirements might be expressed by a receptionist domain expert: “Booking appointments works well, if only one patient is waiting and there are no interruptions. If many patients are waiting, on the telephone and at the hatch (that is, in person), or following an interruption by patient, nurse, doctor or other receptionist, we are often confused as to where we are in the booking procedure. We often take patients out of turn or have to start booking again. This results in us having to work harder than is reasonable and to keep the patients waiting longer than is acceptable to them. Further, we often fail to meet the desires of patients as concerns the particular doctor, the time, or both booked for the appointment. We need some computer help to deal with waiting patients and interruptions, so that our work and the patient waiting times are reduced to acceptable levels”.

It should be noted that neither the MR documentation, the protocol data, nor the receptionist domain expert refer explicitly to the domain, as conceived by Dowell and Long. However, their conception can be used to make explicit the domain, implicit in the description of worksystem behaviours.

For example, following Hill et al. (1995), ‘suitability-of-appointment-for-patient’ could be conceived as one attribute of a ‘medical case object’, having values of ‘suitable’ or ‘unsuitable’. For the protocol data, the patient would obviously have preferred an appointment with Dr. J. rather than Dr. I., although an earlier appointment with the latter (for an eye infection) was preferable in this instance to a later appointment with the former. Likewise, an ‘expedition-of-appointment-for-patient’ attribute might have the values ‘timely’ or ‘untimely’. The interruption of a nurse or doctor would have delayed the booking of the appointment described by the protocol data, so rendering it untimely.

Such an informal interpretation of the implicit MR model could be used to support design (and evaluation) in terms of design problem diagnosis and design solution prescription. The design problem and solution would be expressed in terms of the task quality of the MR domain of work.

As concerns design problem diagnosis, the earlier receptionist user requirements identify both acceptable patient waiting times and meeting patients’ doctor preferences as candidate problems. In terms of the implicit MR domain model, the design problem could be expressed as the undesired MR task quality of (doctor) suitability and appointment expedition. The problem could be quantified, for example, as appointment-booking achieving 90 per cent (doctor) suitability and 90 per cent expedition timeliness.
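Such a quantification can be sketched as a simple rate calculation over observed bookings. The sketch below is illustrative only: the ten booking records and the desired level of 95 per cent are invented here for the example and do not come from the MR studies cited.

```python
def attribute_rate(bookings, attribute, desired_value):
    """Percentage of bookings whose attribute holds the desired value."""
    hits = sum(1 for booking in bookings if booking[attribute] == desired_value)
    return 100.0 * hits / len(bookings)


# Hypothetical observations: ten bookings, invented for illustration.
bookings = (
    [{"suitability": "suitable", "expedition": "timely"}] * 8
    + [{"suitability": "suitable", "expedition": "untimely"}]
    + [{"suitability": "unsuitable", "expedition": "untimely"}]
)

DESIRED_RATE = 95.0  # hypothetical desired level, not given in the text

rates = {
    "suitability": attribute_rate(bookings, "suitability", "suitable"),
    "expedition": attribute_rate(bookings, "expedition", "timely"),
}

# A design problem exists for every attribute whose rate is below the desired level.
problems = [name for name, rate in rates.items() if rate < DESIRED_RATE]
print(rates, problems)
```

Note that nothing in the calculation refers to receptionist effort: the rates express how well the MR work is done, which is the sense of task quality intended here.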

As concerns design solution prescription, computer-based decision support might be provided for the receptionist. The support could record the order of patient arrival and store patients’ doctor preferences, using the latter to prompt preferred appointments. The additional computer support might be expected to increase MR task quality of (doctor) suitability and expedition to a desired level.

The illustration shows how the domain approach goes beyond usability (workload). The latter is implicated in the receptionist user requirements (“This results in us having to work too hard…”). It is also implicated in the additional computer decision support. However, in neither case does usability express MR task quality. (Doctor) suitability and appointment expedition express how well the MR work is performed and not the usability/workload of the MR system performing the work that well. The two concepts, of course, may be intimately related.

3.1.2 Domain of Military Command and Control

The domain of military command and control (C2) carries out two types of work – planning for armed conflict and operating for armed conflict. Nation states pursue their interests by the use of their resources, both human and nonhuman. Users may be political, military, diplomatic, etc. Resources include land, sea, air, space, installations, etc. Domain models of C2 are implicit in military history, doctrine, manuals, domain expert descriptions and military operating procedures. This illustration of military C2 uses the Vincennes incident as described in the official report of the associated enquiry (USDOD, 1988).

The incident took place during the Iran/Iraq war, initially a land battle. Iraq attempted to disrupt Iran’s oil trade, launching air attacks against its oil installations. In response, Iran disrupted oil transport in the Persian Gulf. The US response was to send naval forces to ensure oil supplies. The Iraqi Air Force successfully attacked Iranian forces near the North Persian Gulf. Iranian retaliation of small boat attacks on commercial shipping was expected. The incident occurred when the USS Vincennes – a cruiser – mistakenly shot down civilian Iran Air Flight 655, with total loss of life, while simultaneously engaging a group of Iranian small boats.

The domain model of military C2 is implicit in the expert military descriptions, which appear in the official report. Colbert and Long (1995) summarised the events of the incident as they occur in the report descriptions. Here is a selective extract (the time of the event precedes the events):

0620 The small boats and the Vincennes begin to close ———-.

0647 Flight 655 takes off and is detected by the Vincennes as an ‘unknown, presumed enemy’.

0649 Flight 655 adopts its flight path, which is towards the Vincennes. The Vincennes challenges its air contact (actually Flight 655), but receives no reply. For a moment, Vincennes’ air contact appears electronically to identify itself as a military aircraft (due to freak weather conditions?).

0651 One of the Vincennes’ guns jams when one of the small boats is about to adopt a dangerous position ———. Further challenges to the air contact receive no reply —–. Flight 655’s altitude is misread. The air contact is perceived as diving towards the Vincennes; in fact, it is climbing away from it.

0654 Two surface to air missiles are launched by the Vincennes ——. The missiles destroy Flight 655. All passengers and crew are killed.

The summary does not refer explicitly to the domain of C2, as conceived by Dowell and Long. However, their conception can be used to make the implicit domain model explicit. For example, following Colbert and Long (1995), armed conflict objects could be conceived as constituted of: friends; hostiles; and neutrals objects. All three objects have attributes of vulnerability and involvement. In addition, the friends object has the attribute of power (to realise its interest) and the hostiles object has the attribute of threat (to frustrate the friend’s interest). Attributes could have the values of low, medium and high (or indeed any finer gradations, as required). The C2 expert descriptions of the report can be re-expressed to render the domain model explicit as follows:

0620 The involvement, power and vulnerability of the friend (Vincennes) and the hostile increase again.

0647 The involvement of the neutral (Flight 655) and the friend begins to increase.

0649 The involvement and vulnerability of the neutral, and the friend’s power with respect to it, increase rapidly.

0651 The power of the friend with respect to the neutral temporarily falls sharply——. The involvement and vulnerability of the neutral and the friend’s power with respect to it continue to increase rapidly.

0654 The friend’s power with respect to the neutral is realised, with catastrophic results.

The design problem, as concerns task quality, can be diagnosed as the involvement of the neutral (Flight 655) and the realisation of the friend’s (Vincennes’) power with respect to the neutral. Both the latter’s involvement and the friend’s power with respect to the neutral have undesired high values. As a result, task quality is not as desired, and so constitutes a design problem.
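The diagnosis can be sketched as a comparison of attribute values against desired ceilings on the low/medium/high scale mentioned above. This is a hedged illustration: the report gives trends rather than scale values, so the `HIGH` values at 0654 and the `LOW` ceilings are assumptions made here, as are all the names.

```python
from enum import IntEnum


class Level(IntEnum):
    """Ordinal scale for attribute values (finer gradations are possible)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Assumed state at 0654, keyed by (object, attribute).
actual = {
    ("neutral", "involvement"): Level.HIGH,
    ("friend", "power_wrt_neutral"): Level.HIGH,
}

# Assumed desired ceilings: neither value should rise above LOW.
desired_max = {
    ("neutral", "involvement"): Level.LOW,
    ("friend", "power_wrt_neutral"): Level.LOW,
}


def diagnose(actual, desired_max):
    """Return the (object, attribute) pairs whose value exceeds its ceiling."""
    return sorted(key for key, ceiling in desired_max.items() if actual[key] > ceiling)


print(diagnose(actual, desired_max))
```

An ordered enumeration is used so that ‘undesired high value’ becomes an explicit comparison rather than a string match.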

As part of the prescription of a possible design solution, air contact altitude might be more clearly presented, for example, using colour or three-dimensional coding. The Vincennes would then have perceived Flight 655 as climbing away from it and not diving towards it. As a result, the neutral’s involvement would have decreased and the friend’s power would not have been realised. Neutral involvement and friend’s power values, and so task quality, would be as desired.

The illustration shows how the domain approach goes beyond usability. The latter might well be implicated in the Vincennes’ misreading of Flight 655’s altitude, but altitude is co-extensive with neither neutral’s involvement nor friend’s power realisation. Usability tells us how easy or difficult it is to perform work, not how well the work is performed.

3.1.3 Domain of Domestic Energy Management

Energy has multiple uses in the home – heating, cooking, lighting, etc. A major use – in the UK at least – is the heating of the house to maintain the comfort of the people living there. Management consists of the planning and control of the heating to ensure people’s comfort. Management is typically effected by ‘central’ heating systems, consisting of a boiler, which heats water, a tank, which stores it and pipes, which conduct the water to the radiators. Users interact with a controller to set and to programme the heating. The hot water heats the radiators, which in turn heat the house, which determines the comfort of the occupants.

Elsewhere (Stork, Middlemass and Long, 1995), a set of user requirements was elicited for such a central heating system. They are as follows: “The domestic routine of A occasionally requires him to remain at home to work in the mornings, rather than at his office. However, if A leaves after 8 o’clock in the morning, or stays at home to work, then the house is too cold until he turns the gas-powered central heating back on. If he expects to be at home a short time after 8 o’clock, then he often uses the one-hour boost facility on the heating controller to turn the heating back on, which can result in him being too cold, if he is at home longer than expected. A’s ability to work is adversely affected by being cold and having to control the heating. A finds it difficult to plan much in advance, either whether he is staying at home to work, or if he stays, how long he will stay to work. The current gas bill is acceptable, and an increase could be tolerated, although a decrease would be desirable”.

The user requirements make some explicit reference to the domain, for example, “the house is too cold” and “him being too cold”. The references are, however, incomplete. Following Stork and Long (1994), a more complete and explicit domain model might have two main physical objects: A and the house. A has a physical attribute of temperature and an abstract attribute of comfort. The attribute of comfort is related to the attribute of temperature, having a range of acceptable temperatures (for example, 35.75°C to 37.5°C).

The second physical object is the house, which has physical objects that are the rooms. The rooms have a physical attribute of temperature and contain physical objects that are the radiators. The radiators have a physical attribute of temperature. The temperature of a room is related to the temperature of A (an approximately linear relationship) and to the temperature of the radiators.
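The relation between A's temperature and comfort attributes can be sketched as follows. The acceptable range is the example range quoted above; the function names are introduced here, and the slope and intercept of the approximately linear room-to-body relation are invented purely for illustration and are not physiological estimates.

```python
COMFORT_RANGE_C = (35.75, 37.5)  # example acceptable range quoted in the text


def comfort(body_temperature_c, comfort_range=COMFORT_RANGE_C):
    """Derive the abstract comfort value from the physical temperature value."""
    low, high = comfort_range
    if low <= body_temperature_c <= high:
        return "comfortable"
    return "not comfortable"


def body_temperature(room_temperature_c, slope, intercept):
    """Approximately linear relation between room and body temperature.

    The coefficients must be supplied by the caller; none are given in the text.
    """
    return slope * room_temperature_c + intercept


# Hypothetical coefficients, chosen only so that the example runs.
warm = body_temperature(18.0, slope=0.1, intercept=34.5)
cold = body_temperature(10.0, slope=0.1, intercept=34.5)
print(comfort(warm), comfort(cold))
```

In this sketch the comfort value is never set directly: it changes only through the temperature attributes, which is how the domain model relates radiator and room temperatures to the work of keeping A comfortable.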

In terms of this explicit domain model, a design problem exists, because the current values of the temperatures of the radiators result in the value of the comfort attribute of A being ‘not comfortable’ at some times, that is, task quality is too low. A design solution prescribed by Stork et al. (1995) involved programming the heating to be on for weekday mornings with an additional remote-heating controller, having an advance button and a bright status light, installed by the front door. The temperatures of the radiators and rooms, and so the house and A, were maintained, such that A’s value remained comfortable, that is, task quality was as desired.

The illustration shows how the domain approach goes beyond usability. Usability/workload is implicated in the user requirements. For example, “A finds it difficult to plan much in advance” and “A’s ability is affected by —— having to control the heating”. However, these aspects do not refer to A’s comfort, the maintenance of which, as desired, constitutes the work of the domestic energy work-system. The design solution likewise. However, the two aspects of performance, and so effectiveness, although related, are differentiated in the domain approach. The rationale is that any design change, such as the addition of a front door controller might affect usability/workload and task quality differentially.

The illustrations of how implicit domain models, such as medical reception, military command and control and domestic energy management, when informed by the explicit domain approach of Dowell and Long, can inform design as the diagnosis of design problems of task quality and the prescription of design solutions for task quality, are now complete.

In all cases, the illustrations show how the domain approach goes beyond usability, and given the range of applications, beyond the usability of Web pages. Usability/workload expresses the performance of the work-system; task quality the performance of the work. Together they constitute effectiveness. We turn next to consider how explicit domain models go beyond usability.

3.2 Explicit Domain Model Illustrations

As indicated earlier, applying the domain conception of Dowell and Long necessarily produces an explicit model of the domain of work. Explicit models have also been referenced (Hill et al., 1995; Colbert and Long, 1995; and Stork et al., 1995). The illustrations which follow show how explicit models support design and, in so doing, go beyond usability.

3.2.1 Domain of Amphibious Landing Off-load Planning

As indicated earlier, military C2 carries out planning for armed conflict. Amphibious landing off-load planning is one such type of planning. An amphibious landing may here be considered an attack (that is, armed conflict) against a potentially hostile shore, launched from the sea, and involving air, sea and land forces. It includes the movement ashore of a landing force, embarked on transport ships and naval vessels, by means of amphibious vehicles, landing craft, and helicopters. The landing force arrives ready for combat ashore, and at beaches and landing zones (rather than ports or airfields) (Evans, 1990). Critical for the landing are the off-load plans. Figure 1 represents a typical traditional off-load plan as a table (upper figure) and as a Gantt chart (lower figure). These plans specify who is to go ashore (left-hand columns of the table) and where and when (right-hand columns of the table). They also specify who is to take the landing force ashore and how the force is to be grouped tactically (middle columns).

Figure 1. Typical traditional off-load plan as a table (upper figure) and as a Gantt chart (lower figure)

An explicit domain model of amphibious off-load planning (following Colbert and Long, 1995 and Long, 2000) is shown in Figure 2. The plan has both physical and abstract aspects, the latter represented at different levels of description. Plan object attributes comprise: scope; content; and view. Scope has time and object values. Content has the values of conflict object goal states, etc.; view has the values of table, Gantt chart, etc.

Figure 2. Domain model of amphibious off-load planning

Following Long (2000), decision problems, such as off-load planning, require three decision types: solution selection; solution construction; and problem elaboration. Traditional off-load plans (Figure 1) provide only limited support for these decision types. Colbert and Long (1995), using the explicit domain models of armed conflict and off-load planning, diagnosed a design problem, indicating that the task quality of such traditional off-load plans is not as desired. The design problem identified, as undesired, aspects of both plan content and plan scope. Concerning plan content, there was a failure to achieve 100% availability by the specified deadlines and the desired rates of lift in terms of men/hour. Concerning plan scope, there were too many errors, related to time and object values.

Figure 3 Re-designed interface for amphibious off-load planning

Again, using the domain models of armed conflict and off-load planning, Colbert and Long proposed a re-designed interface, constituting the prescription of a design solution (shown in Figure 3). The interface not only shows the off-load plan to date, but also: next load pending; next load options; and next load assessments. The next load option exploits the domain model of off-load planning, for example, contents. The next load assessment exploits the domain model of armed conflict, for example, power, safety, cohesion and fatigue and the domain model of off-load planning, for example, lift. The redesign was intended to provide improved support for plan solution selection, solution construction and problem elaboration and so to achieve desired content deadlines, and lift and a reduction in scope errors.

Colbert and Long conducted an evaluation of the re-designed off-load planning interface. The results were as follows. Content deadlines were not met as desired: 91.5%, not 100%, of content was made available by the deadline. The planned rate of lift, however, was as desired: the 267 men/hour achieved fell within the desired range of 255-275 men/hour. Likewise, scope errors of 0.4 fell below the desired criterion of 1.8. We can conclude on this basis that off-load plan quality was as desired, except as concerns content availability by the specified deadline. Further redesign would be required to meet this performance criterion.
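The comparison of actual with desired task quality reported above can be sketched as a simple check. This is an illustrative sketch only and not part of Colbert and Long's evaluation: the `Criterion` class, its field names and the direction of each bound are assumptions made here; only the numeric values are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Criterion:
    """One task-quality criterion (hypothetical structure, for illustration)."""
    name: str
    actual: float
    lower: Optional[float] = None  # smallest acceptable value, if any
    upper: Optional[float] = None  # largest acceptable value, if any

    def as_desired(self) -> bool:
        # Actual performance is "as desired" when it lies within the bounds.
        if self.lower is not None and self.actual < self.lower:
            return False
        if self.upper is not None and self.actual > self.upper:
            return False
        return True


# Numeric values as reported in the text; bound directions are assumed.
criteria = [
    Criterion("content available by deadline (%)", actual=91.5, lower=100.0),
    Criterion("planned rate of lift (men/hour)", actual=267, lower=255, upper=275),
    Criterion("scope errors", actual=0.4, upper=1.8),
]

results = {c.name: c.as_desired() for c in criteria}
for name, ok in results.items():
    print(f"{name}: {'as desired' if ok else 'not as desired'}")
```

Under these assumptions, only the content availability criterion is reported as not as desired, which matches the conclusion drawn in the text.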

However, what this illustration is intended to show is not the success of the redesign itself, but that the domain approach goes beyond usability and that explicit domain models have the potential to support design. First, off-load task quality was expressed as desired content availability and rate of lift, together with scope errors. Elsewhere, Colbert and Long evaluated usability/workload in addition to task quality. Users’ rating of workload was an average 3.0, which was less than the acceptable criterion of 3.3, and so as desired. Thus, the redesign affected task quality and user costs differentially, so illustrating how the domain approach goes beyond usability. Both are required to express work-system performance and so to support design. Second, both domain models of armed conflict and of off-load planning contributed to the design by suggesting object/attribute/values, such as plan content and scope. Such aspects undoubtedly also contributed to the effects on the performance of task quality and user costs/workload.

3.2.2 Domain of Emergency Management

As in many countries, the UK has a system for the co-ordination of the emergency services in response to disasters, such as explosions, airplane crashes, etc. This system is called the Emergency Management Combined Response System (EMCRS). This system manages, that is, plans and controls, agencies, such as Fire and Police, when they respond to disasters. The UK EMCRS was set up to support better coordination between agencies.

The EMCRS is a command and control system with a three-level structure – operational, tactical and strategic. The system has objectives (embodied in plans), common to all agencies. These objectives are (in descending order of priority): to save life; to prevent escalation of the disaster; to relieve suffering; to safeguard the environment; to protect property; to facilitate criminal investigation and judicial, public, technical or other enquiries; and to restore normality as soon as possible (Home Office, 1994). The individual agencies relate their own individual plans by means of the shared EMCRS plans to interact effectively. Each agency plan specifies a set of functions, for example: Fire Service – rescuing trapped casualties; preventing escalation of the disaster, etc.

An explicit domain model of EMCR (following Hill and Long, 1996 and 2001) is shown in Figure 4. The model comprises physical and abstract domain objects, having attributes and values. For example, the lives object has a survivor/evacuee status attribute, having the values of: rescued and not rescued. The disaster scene object has a site attribute, having the values of preserved and unpreserved. And the disaster character object has a fire-type status attribute, having the values of controlled and uncontrolled. The transformation of the object/attribute/values by the EMCRS determines the values of the stability and normality attributes of the disaster object. These values constitute EMCR task quality.
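The object/attribute/value structure described above can be sketched as a small data structure. This is a minimal sketch under stated assumptions: the class names `Attribute` and `DomainObject`, the `add` and `transform` methods and the initial values are hypothetical and are not taken from Hill and Long; only the object, attribute and value names come from the text. The rule by which these values determine stability and normality is not given in the text and so is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Attribute:
    """An attribute of a domain object, with its permitted values."""
    name: str
    values: Tuple[str, ...]
    value: str  # current value


@dataclass
class DomainObject:
    """A domain object holding named attributes (hypothetical structure)."""
    name: str
    attributes: Dict[str, Attribute] = field(default_factory=dict)

    def add(self, name: str, values: Tuple[str, ...], initial: str) -> None:
        self.attributes[name] = Attribute(name, tuple(values), initial)

    def transform(self, name: str, new_value: str) -> None:
        # Work is expressed as a change of an attribute value.
        attr = self.attributes[name]
        if new_value not in attr.values:
            raise ValueError(f"{new_value!r} is not a value of {name!r}")
        attr.value = new_value


# Objects, attributes and values named in the text; initial values assumed.
lives = DomainObject("lives")
lives.add("survivor/evacuee status", ("rescued", "not rescued"), "not rescued")

scene = DomainObject("disaster scene")
scene.add("site", ("preserved", "unpreserved"), "preserved")

character = DomainObject("disaster character")
character.add("fire-type status", ("controlled", "uncontrolled"), "uncontrolled")

# Two transformations the combined response is intended to achieve.
lives.transform("survivor/evacuee status", "rescued")
character.transform("fire-type status", "controlled")
```

Restricting each attribute to its listed values keeps the sketch close to the model, in which task quality is read off the attribute values reached.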

Hill and Long conducted an observational study, involving two stagings of an inter-agency combined response training scenario, which took place at the Home Office Emergency Planning College. The 60 trainees were members of the emergency services and local authority emergency planning officers. The training scenario concerned the derailment of an oil-tanker train on a bridge over a busy main road and a market. The data from the study were used to construct the explicit EMCR domain model. The data can also be used to diagnose a design problem and to prescribe a design solution to illustrate the contribution of the domain model to design.

Figure 4 Domain model of emergency management combined response system

One design problem derived from a behaviour conflict between the Police, Fire and Ambulance Services. The conflict concerned the relations between the Services’ trampling of the disaster site, and the preserved/unpreserved values of the site attribute of the disaster scene object (see earlier and Figure 4). In the training scenario, the Police declare the site a crime scene as vandalism is suspected to be the cause of the tanker-train derailment. As a result of the declaration, the Fire and Ambulance Services are required, by the EMCRS plan, not to trample the scene in the course of their functions (see earlier), as any evidence of a crime must be preserved “to facilitate criminal investigation” (see EMCRS objectives earlier). However, not trampling the site, that is, moving more carefully and so more slowly, delays the rescue of casualties by the Ambulance Service (a primary function) and hinders prevention of escalation of fires by the Fire Service (also a primary function). The Police Service behaviours of preserving the site of the disaster scene conflict with the Ambulance Service behaviours of casualty rescue, and the Fire Service behaviours of fire containment. In terms of EMCR task quality, there is an undesired survivor/evacuee attribute value of not rescued and an undesired fire-type attribute of uncontrolled. Task quality is not as desired and so constitutes a design problem.

The training data do not make clear whether this design problem is related to planning or control, or to the training or selection of service personnel, and so do not suggest a specific design solution prescription. However, it is plausible to conceive that, in any re-design of the EMCRS plans, the disaster scene object site attribute could take on values of high, medium and low preservation. In the training scenario, the train and the track might be adjudged high preservation, the rest of the bridge medium preservation and the main road low preservation value. The additional trampling gradation would be expected to reduce the occurrence of the survivor/evacuee attribute value of not rescued for the Ambulance Service and of the fire-type attribute value of uncontrolled for the Fire Service. The gradation, however, would also be expected to reduce the facilitation of criminal investigation. In other words, the original desired task quality would not be achieved by the re-design, but the task quality would be superior to the actual task quality, observed in the training scenario.

The additional gradation of trampling might make the EMCRS plan more or less usable, implicating more or less workload for the Police, Ambulance and Fire Services. Be that as it may, the illustration makes clear how the EMCR task quality of the domain approach, expressed as survivor/evacuee rescued and fire-type controlled, goes beyond usability and how EMCR systems go beyond web pages.

Figure 5. Domain model of reconstructed air traffic management

3.2.3 Domain of Air Traffic Management

Air traffic management (ATM) is here understood as the planning and control of air traffic. Operational ATM manages air traffic, for example, Manchester Ringway Control Centre in the UK. The Centre manages a terminal manoeuvring area with 9 beacons, more than 2 airways, 1 stack, and 2 exits. Its traffic can be characterised as: departing; arriving; overflying; “low and slow”; high-level bunching; and so forth. The management involves track and vertical separation rules (Dowell, 1993 and 1998). Planning is supported by “flight strips” and controlling by radar.

Dowell developed an explicit domain model of Manchester Ringway ATM and a simulation thereof – reconstructed air traffic management (RATM) (Dowell, 1993 and 1998). The model, following Long and Timmer (2001), appears in Figure 5. The model comprises physical and abstract objects. Airspace objects include beacons, for example, Alpha, Beta, etc. Aircraft objects include aircraft, for example, TAW, etc. The intersection of airspace objects and aircraft objects results in air traffic event objects with attributes of: position; altitude; speed; heading; and time. Transformation of the values of these attributes results in air traffic vector object attributes of safety, with values of flying time and vertical separation; and expedition, with values of flight progress; fuel use; manoeuvres; and exit variation. Safety and expedition express ATM task quality.

Timmer and Long (1996 and 1997) conducted an observational study, using the RATM simulation. An extract from their data, involving the aircraft TAW, is shown in Figure 6. The data can be used to diagnose design problems. For example, in the case of TAW, the data indicate that its initial state achieves desired task quality, the separation goal being false (that is, there is desired separation). However, following an operator intervention to improve fuel use, by changing TAW’s altitude (fuel use decreases with increases in altitude), TAW’s actual task quality becomes less than desired, the separation goal, desired as false, being actually 1830 (that is, less than desired separation). This difference between actual and desired RATM task quality constitutes the basis for an instance of a RATM design problem.

A further instance of a design problem, diagnosed by means of the domain model (in conjunction with a RATM worksystem model, proposed by Timmer and Long (2002)), appears in Figure 7. The RATM data have been collapsed to highlight the design problem diagnosis. The final state of the data shows actual performance to be less than desired performance. ZEN is safe, but unexpeditious (excessive progress and fuel use). The likely reason is that earlier, ZEN was expeditious, but unsafe (see domain model attribute values of Figure 5). The operator intervenes to speed up ZEN, so making it safe. However, the operator fails to update the flight progress strip, which continues to show ZEN at its cruising speed of 720. Exceeding the cruising speed results in excessive progress and fuel use. The operator’s earlier recognition of ZEN as an active, safe and unexpeditious (as concerns speed) aircraft, but subsequent failure to slow down ZEN, is assumed to be associated with the decay of the unexpeditious category in memory and the flight progress strip showing (incorrectly) ZEN to be flying at cruising speed.

Figure 6 Reconstructed air traffic management data showing
Domain model attribute states and RATM Worksystem behaviours

Although Timmer and Long do not prescribe a design solution in detail, they do speculate about RATM re-design options. For example, a procedure, requiring immediate flight progress strip update, might solve the design problem. Alternatively, or in addition, an explicit display by the RATM interface of speed, progress and fuel use, which link position, altitude, speed, heading and time attributes to safety and expedition (see Figure 5), might also solve the design problem. If so, RATM actual performance, as concerns task quality, would equal desired performance.

In both cases, the new flight strips procedure and the re-designed interface would have implications for the usability/workload of the RATM system/operator. However, any such implications are not co-extensive with RATM task quality as expedition and safety. The illustration, thus, makes clear how the domain approach, expressed as task quality, goes beyond usability and how the explicit domain model can support design.

Figure 7 Reconstructed Air Traffic Management design problem (collapsed data)

4. Discussion and Conclusions

Both implicit and explicit domain models constitute HCI design knowledge. Such knowledge is intended to support both design and research. HCI design, as shown by the earlier illustrations, can be conceived as the diagnosis of design problems and the prescription of design solutions (Long, 2001). HCI research can be conceived as the acquisition and validation of design knowledge (Long, 2001). Implicit and explicit models are now assessed for their potential to support design and research.

The illustrations from the domains of medical reception, military command and control and domestic energy management all suggest implicit domain models have potential to support design as both diagnosis and prescription. In contrast, these implicit models would appear to have little potential for informing research, other than as a point of departure. Validation of the design knowledge requires its conceptualisation, operationalisation, test and generalisation (Long, 2001). Since these models are implicit, they are not (explicitly) conceptualised. Hence, they cannot be operationalised, tested and generalised, and so validated.

The illustrations from the domains of off-load planning, emergency management and air traffic management all suggest explicit models have potential to support design as diagnosis and prescription. Note, however, that these explicit domain models express only task quality, that is, performance. They do not describe the worksystem behaviours, neither those of the user nor of the computer, determining performance. Additional models are required to represent the latter. The explicit models would also appear to have the potential for informing research.

As the domain concepts are explicit, the models meet the requirement of research for conceptualisation. Since conceptualised, the models offer the potential for operationalisation, test and generalisation, and so validation. It is concluded that both implicit and explicit domain models of work have the potential to support design practice as diagnosis and prescription. However, the nature of the support is different. Implicit models support (re-)design of the instance (that is, the particular work system, for example, the medical prescription system of Drs. I. and J. (see earlier)). In contrast, explicit domain models express the instance as a member of the class (that is, the domain model of air traffic management, of which Manchester Ringway is an example (see earlier)). Explicit model support for design is thus more general than implicit model support. However, explicit model support is also more limited, as it expresses only performance (as task quality) and not the worksystem itself. As concerns research, only explicit domain models have the potential to provide support, because implicit models cannot be validated, in the absence of conceptualisation. Implicit domain models, however, can be the starting point for explicit domain models, developed by research.

It is concluded, generally, that implicit domain models are needed in the shorter term, to inform design of individual work systems. Explicit domain modelling is in its infancy. However, explicit domain models offer the possibility of validation by research and so a better guarantee, in the longer term, of support for both design and research. The importance of explicit domain models needs to be underlined and their further development encouraged. However, both implicit and explicit domain models can be considered as part of the domain approach to HCI. This approach accepts that usability and Web pages are rightly both central to HCI at this time. However, task quality (how well the work is done, that is, performance), derived from domain models, goes beyond usability, conceived either as a property of the computer or as the property of the user, that is, workload. The domain approach also goes beyond the new technology of Web pages. The Web is essentially an information provider with design problems and solutions, concerning access, navigation, etc. It also currently offers, in addition, a range of services, for example, ecommerce, insurance, banking, holiday and flight booking, etc. However, work such as air traffic management, emergency management and amphibious off-load planning is well beyond its current capability. It is concluded, then, that the domain approach goes beyond the usability of Web pages – the thesis of this paper. Finally, the arguments and the illustrations suggest that the domain approach shows promise in its support for an engineering HCI discipline, whose aim is design for effectiveness and whose design knowledge offers a better guarantee of its application.

Acknowledgements:

For conceptual foundations – John Dowell. For the research – all first authors of illustrations. For paper realisation – Doris; Natalie; Rhunette; Beverley; Minna and Jacky.

5. References

BEKKER, M.M. & LONG, J. User Involvement in the Design of Human-Computer Interactions: Some Similarities and Differences between Design Approaches. In Proceedings of the British Computer Society Human-Computer Interaction Specialist Group, 2000.

CARD, S.K., MORAN, T.P. & NEWELL, A., The Psychology of Human-Computer Interaction, New Jersey, Erlbaum, 1983.

COLBERT, M. & LONG, J. Towards the Development of Classes of Interaction: Initial Illustration with reference to Off-Load Planning. In Behaviour & Information Technology, 15 (3), 149-181, 1995.

Home Office. Dealing with Disaster, 2nd ed. Home Office Publication, London, HMSO, 1994.

DOWELL, J. Cognitive engineering and the rationalisation of the flight strip. Unpublished PhD Thesis, University of London, 1993.

DOWELL, J. Formulating the design problem of air traffic management. International Journal of Human-Computer Studies, 49, 743-766, 1998.

DOWELL, J. & LONG, J. Target Paper: Conception of the Cognitive Engineering Design Problem. Ergonomics, 41 (2), 126-139, 1998.

DOWELL, J. & LONG, J.B. Towards a conception for an engineering discipline of human factors. Ergonomics, 32, 1513-35, 1989.

DRURY, M. The medical secretary’s and receptionist’s handbook, 4th ed. London, Bailliere Tindall, 1981.

EVANS, M.H.H. Amphibious operations. London, Brassey’s, 1990.

GREEN, W. S., & JORDAN, P. W. Pleasure with products: Beyond usability. London, Taylor & Francis, 2002.

HILL, B. & LONG, J. A Preliminary Model of the Planning and Control of the Combined Response to Disaster. In Proceedings of ECCE 8, 57-62, 1996.

HILL, B. & LONG, J. Performance Modelling in the Domain of Emergency Management. In M.A. Hanson, (Ed.). Contemporary Ergonomics 2001, 165 – 170. London, Taylor & Francis, 2001.

HILL, B., LONG, J., SMITH, W. & WHITEFIELD, A. A Model of Medical Reception – The Planning and Control of Multiple Task Work. Applied Cognitive Psychology, 9 (1), 81-114, 1995.

JEFFREYS, M. & SACHS, H. Rethinking General Practice: Dilemmas in Primary Health Care. London, Tavistock, 1983.

LIGHT, A. & WAKEMAN, J. Beyond the interface: users’ perceptions of interaction and audience on websites. Interacting with Computers, 13(3), 325-351, 2001.

LONG, J. Specifying Relations between Research and the Design of Human-Computer Interactions. International Journal of Human-Computer Studies, 44 (6), 875-920, 1996.

LONG, J. Cognitive Ergonomics: some lessons learned (some remaining). In M.A. Hanson, (Ed.), Contemporary Ergonomics 2001, 263 – 271. London, Taylor & Francis, 2001.

LONG, J. & TIMMER, P. Design problems for cognitive ergonomics research: What we can learn from ATM-like micro-worlds. Le Travail Humain, 64(3), 197-222, 2001.

LONG, J.B. Domain Approach for Decision Support for Planning and Control: a Case-Study of Amphibious Landing Off-Load Planning. In Proceedings of APCHI 2000, 2000.

NEWMAN, W. A preliminary analysis of the products of HCI research, using pro forma abstracts. In Proceedings of CHI ’94, 278-284, 1994.

STORK, A. & LONG, J.B. A Specific Planning and Control Design Problem in the Home: Rationale and a Case Study. In Proceedings of the International Working Conference on Home-Oriented Informatics, Telematics and Automation, 1994.

STORK, A., MIDDLEMASS, J. & LONG, J. Applying a Structured Method for Usability Engineering to Domestic Energy Management User Requirements: a Successful Case Study. In Proceedings of HCI’95, 367-385, 1995.

TIMMER, P. & LONG, J. Expressing the effectiveness of planning horizons. Le Travail Humain, 65(2), 103-126, 2002.

TIMMER, P. & LONG, J. Integrating Domain and Worksystem Models: An Illustration from Air Traffic Management. In Proceedings of the Conference on Domain Knowledge in Interactive System Design, 194-207, 1996.

TIMMER, P. & LONG, J. Separating User Knowledge of Domain and Device: A Framework. In Proceedings of HCI’97, 379-395, 1997.

Towards a Conception of HCI Engineering Design

John

June 10, 2011

S. Cummaford and J. Long

Ergonomics & HCI Unit, University College London, 26 Bedford Way,
London, WC1H 0AP

ABSTRACT

Current HCI design knowledge is generally not well specified and thus not validatable. There is a need for more formal design knowledge, which can be validated, such that guarantees may be developed. This need would be met by Engineering Design Principles (EDPs). EDPs support the specification then implementation of a class of design solution for a class of design problem within the scope of the EDP. A conception of the general EDP (GEDP) is proposed here, illustrated with reference to internet-based transaction systems. The GEDP is derived from the conception of the general design problem of an engineering discipline of HCI, and the general design solution, as conceptualised here. This conception of the GEDP, it is argued, is sufficiently formal to support the initial operationalisation of class-level design problems to support the development of class-level EDPs. A strategy for developing class-level design problems is proposed and illustrated with reference to transaction systems. This strategy appears promising for the development of class-level EDPs, supported by empirical guarantees.

Keywords: Engineering, design principles, conception, human-computer interaction, performance guarantees, internet transaction systems.

NEED FOR HCI ENGINEERING DESIGN PRINCIPLES

Current best practice in HCI design has produced many technologies that interact with the user to perform effective work. However, the knowledge applied in the design of these technologies is all too often not explicitly stated and so not formally conceptualised, although it may be successfully operationalised by designers. Reliance on such ‘craft’ skills militates against the identification, and so the validation, of successful design knowledge and, as a result, its take-up and re-use. The lack of validation and the consequent ineffective development of design knowledge thus lead to slow and inefficient HCI discipline progress (Long, 1996).

There is a need for more formal HCI design knowledge, that is, whose conception is coherent, complete and fit-for-purpose, such that guarantees may be developed and ascribed. HCI Engineering Design Principles (EDPs) would meet this need by establishing these guarantees on the basis of analytic and empirical testing, leading to their validation. EDPs are explicit, and so validatable, prescriptions of substantive and methodological design knowledge which, when applied to a design problem within the scope of the principle, would support the specification then implementation of a design solution with guarantee (Dowell & Long, 1989). The development of such knowledge would thus increment the knowledge of an engineering discipline of HCI and would be fit-for-purpose, by providing support with a better guarantee for the practices of solving HCI design problems.

The benefits of such EDPs would be numerous. By employing design knowledge, which has already been shown to support the development of successful design solutions to design problems of a similar type, the need for iterative system development would be reduced. The first iteration would benefit from previous solutions. The re-use of such design knowledge would thus reduce the development time for a technology for which a principle had been formulated. Consequently, the cost of iterative usability testing would be reduced. Furthermore, the structuring of EDPs, at varying levels of generality, would support the re-use of successful design knowledge in new design problems, providing these problems could be characterised similarly at a general level. EDPs would also facilitate design knowledge organisation, by offering a structure with which to taxonomise acquired design knowledge, relating to classes of design problem.

This paper seeks to inform EDP development by conceptualising design knowledge sufficient to prescribe a general design solution (GDS) for a general design problem (GDP). Conceptualisation of such design knowledge, once the general EDP (GEDP) has been established as applicable, is required by EDP development to make explicit the concepts, which need to be instantiated as class-level EDPs (CEDPs). The process of EDP validation comprises four stages: conceptualisation; operationalisation; test; and generalisation (Long, 1996). These four stages support the development of formal, and so validatable, HCI engineering design knowledge. Conceptualisation supports the identification of promising knowledge, which guides instantiation of the GEDP, to produce a CEDP. Instantiated CEDPs may then be operationalised, tested and generalised. These four stages of validation support the ascription of performance guarantees.

This paper begins with an expression of the GDP of an engineering discipline of HCI, as proposed by Dowell and Long (1989; 1998), which is then used to inform the conceptualisation of the GDS, the components of the GEDP and the relationships between the GDP, the GEDP and the GDS. GEDP concepts are illustrated with respect to internet-based transaction processing systems (transaction systems). These transaction systems are in widespread use in electronic commerce, e.g. the order collation and payment system in an internet bookshop. The second section addresses CEDP development issues. Two contrastive strategies are presented, the ‘initial instance’ strategy (Stork & Long, 1994) and the ‘initial class’ strategy, as proposed here. These strategies are assessed and the ‘initial class’ strategy is developed by consideration of class-level design problems (CDPs) and an approach to developing CDPs to support the development of CEDPs. CDPs are illustrated with respect to transaction systems.

PRACTICES SUPPORTED BY EDPS

Long and Dowell (1989) characterise disciplines as comprising: a general problem with a particular scope; practices which provide solutions to the general problem; and knowledge, which supports those practices. EDPs would be the knowledge of an engineering discipline of HCI. They argue that disciplines, and so knowledge, may be characterised by the completeness with which solutions are specified, supporting the practices of implement and test, if incomplete; or specify then implement, if complete. EDPs seek to support the practices of specify then implement by employing formal, and so validatable, design knowledge, which offers complete, coherent, prescriptive design support. The efficacy of prescriptive design knowledge may be seen in more mature engineering disciplines (such as electrical engineering), in which discipline knowledge supports the complete and coherent specification of design solutions prior to implementation.

CONCEPTION OF THE RELATIONS BETWEEN GDP, GEDP AND GDS

To support the development of such EDPs, Dowell & Long (1989) proposed a conception of the GDP of an engineering discipline of HCI in terms of: work; the interactive worksystem; and performance, as task quality and worksystem costs. This conception of the GDP is summarised below to inform the conception of the GDS and GEDP as proposed here. By conceptualising the relations between the GDP, the GEDP and the GDS, coherence and completeness may be assessed. The conception of the GDS is proposed here in terms of: work; the interactive worksystem; and performance, as task quality and worksystem costs. The concepts of the GDP are thus recruited into the conception of the GDS. The concepts of the GDP form criteria for the success of the GDS; use of the same concepts supports assessment of the success of the GDS in satisfying the criteria in the GDP. The conception of the GEDP of an engineering discipline of HCI is proposed here in terms of: scope, comprising a class of users, a class of computers and a class of achievable performances; substantive component; methodological component; and guarantees. These relationships between the concepts of the GEDP and their relationship to concepts contained in the GDP and GDS are discussed later. The discussion of these relationships supports the conceptualisation of complete and coherent relations between the conceptions of the GDP, GEDP and GDS of an engineering discipline of HCI.

GENERAL DESIGN PROBLEM

A design problem 1 expresses an inequality between actual performance (Pa) and desired performance (Pd) of some interactive worksystem (i.e. Pa >Pd with respect to some domain; a successful design solution specifies some interactive worksystem (hereafter worksystem) which achieves the desired performance (Pa = Pd) with respect to some domain. Worksystems comprise users and computers, both of which have structures supporting behaviours. The desired performance is expressed as work, achieved to a desired level of task quality, whilst incurring an acceptable level of costs to the worksystem. Work is expressed as transformations of the attribute values of objects in the domain of the worksystem. These domain transformations (Dowell & Long) inform the conception of the GDS and GEDP and are in italics on first exposition. Engineering Design Principles are achieved at some desired level of task quality (Tq), whilst incurring some acceptable level of costs to the user (Uc) and the computer (Cc). Attributes are features of domain objects, which afford transformation by the worksystem. The goals of the worksystem are defined as a product goal, which is a transformation of object attribute values. Realisation of a product goal may involve the transformation of many attributes and their values, these transformations being termed task goals. Thus, a product goal may be re-expressed as a task goal structure, which specifies the order and relations between a number of task goals, sufficient to achieve the product goal. As more than one task goal structure may be sufficient to achieve a product goal, it is necessary to distinguish between alternative task goal structures in terms of task quality. Task quality describes the difference between the product goal and the actual transformation specified by a task goal structure. This concept supports evaluation of alternative structures of this type (Dowell & Long, 1989). 
The worksystem comprises one or more users interacting with one or more computers, each of which is characterised by structures which support behaviours. Desired performance is thus effected by a particular class of user and computer structures, supporting behaviours, which achieve domain transformations, whilst incurring some acceptable level of costs. Worksystem structures are necessary to support behaviour, e.g. knowledge of financial transactions is necessary to support transacting behaviours. Worksystem behaviours involve the transformation of object attributes and their values, e.g. transferring ownership of goods from the vendor to the customer in transaction processing may be expressed as transforming the attribute ‘owner’ from value ‘vendor’ to value ‘customer’ for domain object ‘book x’.
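
The ownership example above can be sketched in code. The following is a minimal illustration only, not part of the Dowell & Long conception: the class and function names, the second attribute ('location') and the way task quality is scored (the fraction of product goal attribute values actually achieved) are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class DomainObject:
    """A domain object whose attributes afford transformation by the worksystem."""
    name: str
    attributes: dict = field(default_factory=dict)


def transform(obj, attribute, new_value):
    """Effect one task goal: change a single attribute value of a domain object."""
    if attribute not in obj.attributes:
        raise KeyError(f"{obj.name} has no attribute {attribute!r}")
    obj.attributes[attribute] = new_value


def task_quality(obj, product_goal):
    """Hypothetical Tq measure: fraction of product goal values achieved."""
    achieved = sum(
        1 for attr, value in product_goal.items()
        if obj.attributes.get(attr) == value
    )
    return achieved / len(product_goal)


# 'owner' comes from the text; 'location' is an invented second attribute.
book = DomainObject("book x", {"owner": "vendor", "location": "warehouse"})
product_goal = {"owner": "customer", "location": "customer address"}

# One task goal of the task goal structure: transfer ownership.
transform(book, "owner", "customer")
quality_after_one_task_goal = task_quality(book, product_goal)
```

With only the ownership task goal effected, half of the assumed product goal is achieved; a task goal structure that also transformed 'location' would score higher under this illustrative measure.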

SUBSTANTIVE AND METHODOLOGICAL EDPS

Dowell and Long assert that EDPs may be either substantive or methodological. They state that substantive EDPs “prescribe the features and properties of artefacts, or systems that will constitute an optimal design solution to a general design problem.” Methodological EDPs “prescribe the methods for solving a general design problem optimally” (Dowell & Long, 1989). In the conception of EDPs proposed here, substantive and methodological knowledge is assumed to be unitary. The issue of whether optimal solutions are commensurate with EDPs is not addressed, as the guarantees ascribed here would be derived from empirical testing. Thus, these guarantees cannot be claimed to be optimal, only empirically established.

NEED FOR CONCEPTION OF GDS AND GEDP

The Dowell and Long conception supports the development of EDPs by offering a coherent and complete expression of the GDP to be addressed by the GEDP. However, the relationships between the concepts of the GDP and the concepts of the GEDP are not formally conceptualised. Furthermore, the conception of the GDS is implicit. Thus, the relationship between the GEDP and the GDS is not formally conceptualised. The GDS is conceptualised below, as required to inform the development of EDPs.

CONCEPTION OF GDS

A design solution contains the specification of a worksystem for which the actual performance equals the desired performance (i.e. Pa = Pd), as stated in the design problem. Worksystems comprising users and computers are conceptualised as structures, which support behaviours, which interact to perform work in a domain. Work is expressed as transformations of object attribute values to achieve task goals, which comprise a task goal structure. The quality with which the task goal structure achieves the product goal specified in the GDP is expressed as Tq and the costs incurred by the users and computers are expressed as Uc and Cc.

CONCEPTION OF GEDP

A conception of the GEDP may be considered complete, if its expression is sufficient to identify the applicability of the GEDP, in terms of the GDP. It may be considered coherent, if its expression is sufficient to prescribe design knowledge for specifying the GDS. The ascription of performance guarantees must also be explicitly conceptualised for coherence. The conception of EDPs proposed here includes the concepts of: scope, comprising a class of users, a class of computers and a class of achievable performances; substantive component; methodological component; and performance guarantees. This conception of the GEDP, it is argued, is sufficiently coherent and complete to support the initial operationalisation, test and generalisation of EDPs, and so is potentially fit-for-purpose. The concepts of the GEDP are formally conceptualised at a level commensurate with the conception of the GDP. This conception of the GEDP supports carry-forward of coherent and complete design knowledge by supporting the expression of design knowledge, at the appropriate level of generality. This knowledge supports the operationalisation of CEDPs, and its success determines whether it is fit-for-purpose. CEDPs formally specify the relationship between a CDP and a corresponding CDS. The concept of classes supports the representation of design knowledge at different levels of generality. These classes are identified by reference to the scope of CEDPs – class hierarchies are not intended to constitute a taxonomy of all possible CDPs, but rather only of those CDPs for which a CDS exists. The ultimate success of a CEDP is measured by the performance achieved by the specific technologies supported by its application. The associated guarantees are based on the empirical testing of a series of instances of CEDP application. The coherent and complete conceptualisation which guides this operationalisation may thus be assessed for fitness-for-purpose. The concepts of the GEDP are informed by the concepts of the GDP and GDS, as the purpose of the GEDP is to identify its applicability to the GDP and prescription of the GDS, if applicable. The GEDP supports the prescription of a GDS, which achieves the desired performance stated in the GDP, if identified as applicable.

Note: Concepts from Dowell & Long which have been recruited into the conception of the GDS and the GEDP are in bold italics on first exposition. Concepts which are novel to this paper are in bold on first exposition.

Scope of the GEDP

Specifying criteria for identifying design problems, to which an EDP may be applied, ensures that the knowledge is applied only to those design problems for which it supports the specification of a design solution. Design problems contain one or more users, interacting with one or more computers, and some desired performance. The scope of the GEDP thus comprises a class of users (U-class), a class of computers (C-class) and a class of achievable performances (P-class). If the user and computer in the design problem are members of U-class and C-class respectively, and the desired performance stated in the design problem is a member of P-class, then a design solution would be produced and the actual performance of the solution would equal the desired performance stated in the design problem. The relationship between U-class, C-class and P-class is developed by empirical testing of the implemented design solutions, produced by CEDP operationalisation.
If the user, computer or the desired performance are outside the scope of the principle, then there is no guarantee that the design solution may be specified then implemented. For transaction systems, the criteria for establishing U-class membership would establish the minimum structures and behaviours required for some user, in conjunction with some member of C-class, to achieve a performance which is a member of P-class. Such structures might include knowledge of financial transactions with card-based payment technologies. Supported behaviours might include matching goods descriptions to their shopping goals. The criteria for C-class membership might include structures such as a Virtual Shopping Cart, and supported behaviours, such as real-time processing of payments via the internet. P-class would specify the product goal, e.g. support the exchange of resources for currency, which could be achieved by members of U-class and C-class, to a desired level of task quality, whilst incurring an acceptable level of costs.
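
The scope test described above can be sketched as a membership check. This is a minimal sketch under stated assumptions: the `Scope` and `DesignProblem` types and the `in_scope` function are hypothetical names, and class membership is modelled simply as a design problem's user and computer possessing at least the minimum structures and behaviours listed for the class. The example values are taken from the transaction system illustration in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Scope of an EDP: membership criteria for U-class, C-class and P-class."""
    u_class: frozenset  # minimum user structures and behaviours
    c_class: frozenset  # minimum computer structures and behaviours
    p_class: frozenset  # product goals achievable by U-class with C-class


@dataclass(frozen=True)
class DesignProblem:
    user: frozenset           # structures and behaviours of the user
    computer: frozenset       # structures and behaviours of the computer
    desired_performance: str  # product goal stated in the design problem


def in_scope(problem: DesignProblem, scope: Scope) -> bool:
    """True if the EDP may be applied to the design problem (assumed criterion)."""
    return (
        scope.u_class <= problem.user
        and scope.c_class <= problem.computer
        and problem.desired_performance in scope.p_class
    )


transaction_scope = Scope(
    u_class=frozenset({
        "knowledge of card-based financial transactions",
        "matching goods descriptions to shopping goals",
    }),
    c_class=frozenset({
        "virtual shopping cart",
        "real-time processing of payments via the internet",
    }),
    p_class=frozenset({"support the exchange of resources for currency"}),
)

bookshop_problem = DesignProblem(
    user=transaction_scope.u_class,
    computer=transaction_scope.c_class,
    desired_performance="support the exchange of resources for currency",
)

applicable = in_scope(bookshop_problem, transaction_scope)
```

A design problem whose user lacks one of the U-class structures would fail the check, which corresponds to the case where no guarantee is offered.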

Substantive and Methodological Components of GEDP

EDPs contain substantive and methodological design knowledge which may be applied to any design problem within the scope of the EDP. The substantive component is characterised by the conceptualisation of user and computer structures and behaviours, comprising the worksystem, which are present in some instance of the class of users (U-class) or class of computers (C-class) respectively. The methodological component supports the conceptualisation of a task goal structure, comprising task goals, to be effected by the worksystem, which achieves the product goal stated in P-class. The product goal specifies the work to be effected in the domain by the worksystem, in terms of object attribute value transformations. The structures and behaviours specified in the substantive component are sufficient to achieve the task goal structure specified in the methodological component to an acceptable level of task quality, whilst incurring some acceptable level of costs; where task quality and worksystem costs are members of P-class. This sufficiency is supported by empirical testing of a CEDP, which indicates whether the GEDP is fit-for-purpose. Support for the conceptualisation of user and computer structures and behaviours may take the form of models of interaction between user and computer, expressed as structures supporting behaviours. In the case of transaction systems, a mercantile model of the stages of a transaction to support the specification of behaviours, sufficient to achieve the required domain transformations, would constitute candidate substantive knowledge. One such model (Kalakota & Whinston, 1996) characterises a transaction as: prepurchase determination (information seeking); purchase (agreement of a contract for exchange); and postpurchase interaction (exchange, and evaluation of the product). This model might indicate that information seeking behaviours are necessary to complete a transaction, these behaviours being supported by structures, e.g., purchasing goals, hands.

Summary of GEDP Conception

This paper proposes a conception of EDPs within which guarantees may be developed for formal HCI engineering design knowledge. The conception proposed thus far comprises the following concepts and relationships:

For any design problem {user, computer, Pd} and any EDP {U-class, C-class, P-class, substantive component, methodological component}:
If user is a member of U-class and computer is a member of C-class, then user structures and behaviours, and computer structures and behaviours, stated in the substantive component are present.
If user structures and behaviours and computer structures and behaviours specified by the substantive component are present, then the task goal structure specified by the methodological component is achievable.
If the task goal structure specified in the methodological component is effected by a worksystem comprising the structures and behaviours specified in the substantive component, then the product goal will be achieved, task quality will be x, user costs will be y and computer costs will be z.
If task quality x, user costs y and computer costs z are achieved, then Pa = Pd.
Therefore, Pd is a member of P-class for a worksystem comprising instances of U-class and C-class.

This conception may be said to be coherent, as it is based on two relationships: the relationship between the task goal structure and Tq for some product goal, and the relationship between the worksystem structures and behaviours, sufficient to achieve this task goal structure, and Uc and Cc. These relationships may be said to be coherent, as performance is a function of the efficacy with which some task goal structures are achieved by some worksystem structures and behaviours.
The GEDP conception may be considered complete, as the concepts of the conception of the engineering discipline of HCI, which inform its development, appear within it. The issue of fitness-for-purpose will be addressed via operationalisation of the conception of the GEDP to inform the development of CEDPs, which may then be tested and generalised.

Validation and Ascription of Guarantees to EDPs

Operationalisation of the GEDP as CEDPs supports empirical testing of the class-level design solutions prescribed. This testing establishes whether the GEDP is fit-for-purpose, that is, it supports the specification then implementation of a design solution which achieves the desired level of performance stated in the design problem. The fourth stage of validation, generalisation, involves establishing the generality of the CEDP. These four stages of validation support the ascription of a guarantee that a worksystem, which performs the task goal structure specified in the methodological component of the EDP, achieves a level of Tq within the P-class stated in the EDP. A second guarantee, that the substantive component supports the specification of a worksystem, which exhibits the structures and behaviours sufficient to achieve the task goal structure, specified in the methodological component, whilst incurring a level of costs within the P-class stated in the EDP, may then be ascribed. A third guarantee, that correct application of the EDP to a design problem within its scope supports the specification then implementation of a design solution which achieves Pd, is then ascribed on the basis of the former guarantees and further empirical testing. EDPs thus support the specification then implementation of a design solution which achieves the desired performance, if the design problem is within the scope of the EDP.

STRATEGY FOR CEDP DEVELOPMENT

Stork & Long (1994) have applied the conception of HCI to establish a basis for developing EDPs. They have operationalised the general HCI design problem by metricating the concepts, of which it is comprised, to express a specific design problem (SDP) in the domain of domestic energy management. Metrication provides observable and measurable criteria against which to assess performance. The design solutions (SDSs) of such specific design problems and the abstraction of prescriptive knowledge would constitute an EDP – the goal of Stork and Long. However, the operationalisation of an SDP per se does not ensure that a CDP, of which the SDP is an instance, will be found, other than by the assumption of a single domain. This strategy may be termed the ‘initial instance’ strategy, as it seeks to develop CEDPs by specifying design knowledge for SDPs, by means of SDSs, and then generalising across instances. This approach may be contrasted with an alternative ‘initial class’ strategy for CEDP development, as proposed here. The ‘initial class’ strategy supports CEDP development by constructing solutions to CDPs and then construing relevant design knowledge. Because this knowledge is construed at the class level, it is promising for CEDP development. The development of CDPs prior to operationalisation is therefore desirable, as this development constrains the DPs operationalised to those which offer promise in supporting the identification of class-level knowledge. A class may be considered promising for development, if an SDS exists and there are SDPs which share features of the solved SDP. Once such an initial class hypothesis has been formulated, the viability of the class may be assessed by examination of the work performed and the worksystem structures and behaviours sufficient to achieve Pd. If the performance achieved by the worksystem (Pa), specified in the SDS, is similar to the Pd of other SDPs (i.e. Pa = Pd), then the SDPs show promise for CDP development. 

Method for CDP development

The first stage of the ‘initial class’ strategy for CDP development involves identification of an SDP and a corresponding SDS. The second stage involves identifying further SDPs which require a similar Pd, which supports specification of P-class. The user(s) and computer(s) which are to achieve P-class are then assessed for similarities. They may be considered similar, if the user(s) and computer(s), specified in each SDP, comprise sufficient structures supporting sufficient behaviours to achieve P-class. If this sufficiency holds, these user(s) and computer(s) form U-class and C-class of the CDP respectively. In practice, once P-class has been specified, developing CDPs involves identifying U-class and testing instances (members) of this class interacting with instances of C-class. These instances of U-class and C-class are then used to inform the development of a CDS, which achieves P-class. The level of generality should be considered prior to development. Classes which are low in the hierarchy and contain very few instances hold design knowledge which is very specific. The costs of developing a class at a given level of generality should be balanced against the number of instances to which it may be applied successfully.

Candidate CDPs: Internet-Based Transaction Systems

A class of transaction system design problems has been identified and is presented to illustrate the concept of CDPs. This parent class has three instances (subclasses), each of which is also a class. Each subclass is characterised by P-class, to be achieved by U-class interacting with C-class with respect to some domain. The general characteristics of each of these CDPs are inherited from the parent class. The subclasses of transaction system CDP are: (homogeneous) physical goods (e.g. books); information (e.g. online newspapers); and banking and finance (e.g. loans). These subclasses are abstractions over SDPs, e.g. a design problem concerning a transaction system to support the effective exchange of books for currency in an internet bookshop is an instance of the class of (homogeneous) physical goods. Subclasses may be distinguished by P-class. Differences in the product goal required for each of these subclasses have been identified for physical goods and information sub-classes (Hallam-Baker, 1997). In addition to these two classes, a banking CDP has been developed. The classes differ in the nature of the resources exchanged, the immediacy of the exchange, the possibility for reversing the transaction and the potential loss to the vendor. These differences in product goal indicate that the CDSs for these subclasses will specify classes of worksystem with different task goal structures, as the respective worksystems perform different work. It should be noted that candidate CDPs are identified on the basis of differences in the domain objects transformed by the respective worksystems. Operationalisation of CDPs will support the assessment of whether such differences will result in different worksystem specifications. Each CDS specified is then assessed to establish the task goal structure, sufficient to achieve a level of Tq within P-class. This sufficiency informs the development of the methodological component of the corresponding CEDP. 
The worksystem structures and behaviours sufficient to achieve the task goal structure, whilst incurring a level of Uc and Cc within P-class, are then construed. These structures and behaviours inform the development of the substantive component of the CEDP. Validation and the ascription of guarantees are based on subsequent operationalisations of the conceptualised CEDPs.
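
The parent class and its three subclasses can be sketched as a small hierarchy in which general characteristics are inherited. This is an illustrative sketch only: the `ClassDesignProblem` type and its field names are hypothetical, and the characteristic values are limited to those mentioned in the text (the product goal of exchanging resources for currency, and the example resources for each subclass).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClassDesignProblem:
    """A class design problem (CDP); a CDP may itself be an instance of a parent CDP."""
    name: str
    characteristics: dict = field(default_factory=dict)
    parent: Optional["ClassDesignProblem"] = None

    def all_characteristics(self) -> dict:
        """Own characteristics merged over those inherited from the parent class."""
        inherited = self.parent.all_characteristics() if self.parent else {}
        return {**inherited, **self.characteristics}


transaction_systems = ClassDesignProblem(
    "transaction systems",
    {"product_goal": "support the exchange of resources for currency"},
)

subclasses = [
    ClassDesignProblem("physical goods", {"example": "books"}, transaction_systems),
    ClassDesignProblem("information", {"example": "online newspapers"}, transaction_systems),
    ClassDesignProblem("banking and finance", {"example": "loans"}, transaction_systems),
]

physical_goods = subclasses[0].all_characteristics()
```

Each subclass adds only what distinguishes it; further distinguishing characteristics named in the text (immediacy of exchange, reversibility, potential loss to the vendor) could be added as extra keys once their values have been established.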

FUTURE RESEARCH

The ‘initial class’ strategy will be operationalised, resulting in CDPs and CDSs for the three transaction system subclasses identified. These CDPs and CDSs will be used to inform CEDP development. The resulting CEDPs will be operationalised and tested to inform the development of guarantees. If these stages of validation are successful, the CEDPs will be generalised and the CEDPs may be considered valid. Abstraction of these subclass CEDPs will be used to produce the parent CEDP, which will then be operationalised, tested and generalised.

ACKNOWLEDGEMENTS

The research associated with this paper was carried out under an EPSRC studentship.

REFERENCES

Dowell, J. and Long, J. B. (1989) Towards a conception for an engineering discipline of human factors. Ergonomics 32, 1513-1536.

Dowell, J. and Long, J. B. (1998) Conception of the cognitive engineering design problem. Ergonomics 41, 2, 126-139.

Hallam-Baker, P. M. (1997) User Interface Requirements for Sale of Goods. Available from: http://www.w3.org/ECommerce/interface.html

Kalakota, R. and Whinston, A. B. (1996) Frontiers of Electronic Commerce. Reading, Mass: Addison-Wesley.

Long, J. B. (1996) Specifying relations between research and the design of human-computer interactions. International Journal of Human Computer Studies, 44, 6, 875-920.

Long, J. B. and Dowell, J. (1989) Conceptions of the discipline of HCI: craft, applied science, and engineering. In Sutcliffe, A. and Macaulay, L. (Eds.) Proceedings of the Fifth Conference of the BCS HCI SG. Cambridge: Cambridge University Press.

Stork, A. and Long, J. B. (1994) A specific planning and control design problem in the home: rationale and a case study. In Proceedings of the International Working Conference on Home-Oriented Informatics, Telematics and Automation. University of Copenhagen, Denmark. 419-428.

Validating Effective Design Knowledge for Re-Use: HCI Engineering Design Principles

John

June 10, 2011

Stephen Cummaford

Ergonomics & HCI Unit, University College London, 26 Bedford Way, London, WC1H 0AP

ABSTRACT

There is a need for more formal HCI design knowledge, such that effective design knowledge may be specified in a format which facilitates re-use. A conception of Engineering Design Principles (EDPs) is presented, as a framework within which to systematically relate design knowledge to performance. It is argued that the specification of these relations supports validation, leading to a higher likelihood that application of an EDP to an appropriate design problem will result in a satisfactory design solution. A hierarchy of classes of design problem is presented, and discussed in context of the ongoing research project.

Keywords: Cognitive engineering, design, knowledge re-use, performance, principles, validation.

INTRODUCTION

Human-computer interaction practitioners have designed many effective technologies. However, the knowledge applied during development of a successful design solution is not often stated explicitly. Such ‘craft’ skills are difficult to represent explicitly, and as such, cannot easily be tested, and so validated [4]. Such under-specification limits the possibility of successful communication, and so re-use, of effective design knowledge. Engineering Design Principles (EDPs) benefit designers by facilitating more complete communication of designers’ knowledge. Design knowledge applied to produce a successful solution to a design problem can be carried forward to solve similar design problems. Validated design knowledge allows designers to generate new solutions with less prototyping and testing, thus reducing system development costs.

ENGINEERING DESIGN PRINCIPLES

The concept of specifying a complete, class-level design solution was presented in the Engineering Conception of HCI (HCIe), which was developed within the Ergonomics & HCI Unit, UCL [3]. The HCIe conception characterises the worksystem, tasks, and interaction structure in terms which support the expression of performance as task quality achieved and costs incurred whilst performing the work. The focus of my thesis is to develop coherent representations of the relationships between worksystem, tasks, interaction structure, and performance. Such representations support validation of the relationships between a design solution and performance. This conception of principles, as relating to specific classes of design problem, allows design knowledge to be represented at varying levels of generality, as classes may contain instances which are themselves classes.

EDP components

The EDP conception relates the elements of the HCIe conception, by specifying design representations which allow the relationship between the design solution and performance to be measured. An EDP consists of related representations which support coherent specification of a class-level effective design solution. Examples for each concept are provided, from a recent e-store transaction systems case-study, in { } brackets.

Scope: identifies the class of design problems for which this EDP has been validated {physical goods transaction systems supporting order collation and payment, for use by general public without specific training}.
Product goal: the work which is to be done {online order collation and payment}.
Domain model: contains objects which are transformed by the worksystem. Objects have attributes {book, has an owner} and the product goal is expressed in terms of object-attribute value transformations {change book owner from ‘vendor’ to ‘customer’}.
Task-goal structure: specifies the interactive behaviours to be performed by the worksystem as a hierarchical box diagram, which decomposes the work to be done into behaviour primitives. These behaviours achieve the product goal.
User and computer models: specify the user and computer structures and behaviours which are sufficient to perform the task-goal structure {knowledge of credit card use, shipping address; realtime credit card transaction processing}. These models are used to determine costs incurred by the worksystem, an aspect of performance.
Achievable task quality: statement of how well the work was achieved by previous artefacts designed with this EDP, an aspect of performance {in empirical test, 100% of users completed transaction, no set-up or training required for users with 6+ months internet experience}.
Cost matrix: an expression of the costs incurred by the worksystem whilst performing the task-goal structure. It is constructed by listing the elements of the user model on the y axis {read, calculate, working memory} and the subtasks of the product goal on the x axis {order item, enter payment details}. The cost matrix is used to record the number of times each user model element is activated during each task. This is assessed empirically, by observation of user trials, and so measures actual user costs, rather than ideal performance. The cost matrix has proved useful in systematic comparison of competing design solutions [2].
Validation: The EDP conception supports expression of design knowledge by specifying an effective design solution at some level of generality. Performance is measured empirically, by testing with implemented design solutions specified by an EDP. Establishing performance supports assessment of whether the design solution is actually effective, rather than the extent of its functionality [5]. The concepts contained in the EDP conception, it is argued [1], support validation by evaluation of the relations between performance and worksystem components performing a task-goal structure.
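
The cost matrix component described above can be sketched as a table of activation counts. This is a minimal sketch, not the published cost matrix method [2]: the function names are hypothetical, the observations are invented purely for illustration, and summing all cells into a single total is an assumed, unweighted way of comparing designs.

```python
from collections import Counter

# Rows (y axis) and columns (x axis) use the examples given in the text.
USER_MODEL_ELEMENTS = ["read", "calculate", "working memory"]
SUBTASKS = ["order item", "enter payment details"]


def build_cost_matrix(observations):
    """Count how often each user model element is activated during each subtask.

    observations: iterable of (subtask, user_model_element) pairs, one pair per
    activation observed in a user trial.
    """
    counts = Counter(observations)
    return {
        element: {subtask: counts[(subtask, element)] for subtask in SUBTASKS}
        for element in USER_MODEL_ELEMENTS
    }


def total_cost(matrix):
    """Assumed summary figure: unweighted sum of all activations."""
    return sum(sum(row.values()) for row in matrix.values())


# Invented observations from a single hypothetical user trial.
trial = [
    ("order item", "read"),
    ("order item", "read"),
    ("order item", "working memory"),
    ("enter payment details", "read"),
    ("enter payment details", "calculate"),
]

matrix = build_cost_matrix(trial)
overall = total_cost(matrix)
```

Building one matrix per competing design from comparable trials would allow the cells, or the totals, to be compared side by side.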

CLASSES OF DESIGN PROBLEM

A class of transaction system design problems has been identified to inform the development of EDPs. A transaction system is the order collation and payment component of an internet store. The parent class has three instances (subclasses), each of which is also a class. The subclasses of transaction system design problem are: physical goods (e.g. books); online services (e.g. video on demand); and financial (e.g. loans). A case study, addressing the design of physical goods transaction systems, has been conducted. This case study involved specification of a class-level design problem, by abstraction over the requirements for several e-stores, and specification of a design solution, which was then implemented and tested. There were particularly high costs associated with finding the total price for goods, including shipping, in the existing system. This was because the user had to enter address and credit card details before the price was calculated. The re-designed system featured two popup menus to select a country and a shipping option, after which the total price could be displayed throughout the interaction, thus reducing user costs. Further details from this case study are shown on the accompanying poster.

FUTURE RESEARCH

The next stage of the research project is to develop class-based knowledge for the subclass of online services transaction systems. The class hierarchy hypothesised earlier will then be evaluated. The hypothesised class hierarchy will be partially validated, if commonalities are found between the first and second EDPs, such that design knowledge may be abstracted and represented at the general class level. The abstracted, class-level knowledge will then be validated by application to further design problems which are instances of the third subclass. The final stage of the project will be to address the usability of the design representations, such that the benefits of using the EDP framework are realised, whilst incurring an acceptable cost of use to designers.

ACKNOWLEDGEMENTS

I would like to thank my supervisor, Professor John Long, for his enthusiastic and insightful input throughout this project. This work was funded by the UK Engineering and Physical Sciences Research Council.

REFERENCES

[1] Cummaford, S. and Long, J. B. (1998) Towards a conception of HCI engineering design principles, in T.R.G. Green, L. Bannon, C.P. Warren & J. Buckley (eds.) Proceedings of ECCE-9, the Ninth European Conference on Cognitive Ergonomics. EACE, pp. 79-84.

[2] Cummaford, S. and Long, J. B. (1999) Costs Matrix: systematic comparison of competing design solutions, in S. Brewster, A. Cawsey & G. Cockton (eds.), Proceedings of Human-Computer Interaction INTERACT ’99 Volume 2, IOS Press, pp.25-26.

[3] Dowell, J. and Long, J. B. (1989) Towards a conception for an engineering discipline of human factors. Ergonomics 32, 1513-1536.

[4] Long, J. B. (1996) Specifying relations between research and the design of human-computer interactions. International Journal of Human Computer Studies, 44, 6, 875-920.

[5] Newman, W. M. (1997) Better or Just Different? On the benefits of designing interactive systems in terms of critical parameters, in G. C. van der Veer, A. Henderson & S. Coles (eds.), Proceedings of the Symposium on Designing Interactive Systems (DIS ’97), ACM Press, pp.239-245.

Solving class design problems: towards developing Engineering Design Principles

John

June 10, 2011

Steve Cummaford and John Long

Ergonomics & HCI Unit, University College London, 26 Bedford Way, London WC1H 0AP

ABSTRACT

Current HCI design knowledge is rarely specified adequately to support its validation. However, such validation is important, since it offers promise for the re-use of design knowledge, to solve design problems, similar to previously solved problems. This paper proposes a strategy for the development of HCI design knowledge, leading to validation. This knowledge is to be expressed ultimately as HCI Engineering Design Principles, which comprise validated knowledge, supported by performance guarantees. The strategy initially requires the specification of class design problems, and their solutions. A partial operationalisation of the strategy, for e-commerce transaction systems, is reported. The operationalisation resulted in the specification of a class design problem and solution, which are exemplified in this paper. The specification will be developed in future research, to construct Engineering Design Principles.

KEYWORDS

HCI, Engineering, Design knowledge, Validation, Engineering Design Principles

RESEARCH NEED

Validated Human-Computer Interaction (HCI) design knowledge offers the benefit of reducing system development costs for the solution of design problems, similar to previously solved problems. However, the validation of HCI design knowledge has been limited, if compared with related disciplines, e.g. Software Engineering (Sutcliffe, 2002). It has been argued that this deficit is due to the lack of (explicit) conceptualisation of design knowledge, to solve design problems. This under-specification limits testing, and so validation (Long 1996).

This paper describes the initial and partial operationalisation of a research strategy, intended more adequately to specify HCI design knowledge, to support validation. Dowell & Long (1989) state that “established engineering disciplines possess formal knowledge: a corpus of operationalised, tested and generalised principles. Those principles are prescriptive, enabling the complete specification of [classes of] design solutions before those designs are implemented.” To support validation, design knowledge must be expressed explicitly (conceptualised), to enable its application to design (operationalised), so its relationship with interactive system performance can be established (tested). On the basis of this relationship, the class of design problems for which this performance is achieved can be specified (generalised) (Dowell & Long 1989).

The conception of Engineering Design Principles (EDPs), proposed in Cummaford & Long (1998), specifies the knowledge representations, sufficient to support development and validation of EDPs. This conception is shown in Figure 1.

Figure 1: EDP conception

The conception comprises models, required by HCI as an engineering discipline (HCIe), following Dowell & Long (1989). The conception expresses the HCIe general design problem in terms of a domain model; a user model and a computer model, which together comprise the interactive worksystem (IWS) model; and a statement of inequality between actual and desired performance (Pa ≠ Pd). The domain model comprises objects, which have attributes having particular values. The work to be performed is expressed as domain object attribute value transformations, collectively termed the product goal. The user and computer are characterised by structures and behaviours. These structures are physical, and abstract, and they support behaviours, which are physical, and abstract, and effect transformations in the domain. Performance is expressed as task quality (Tq), (i.e. how well the work is effected) and IWS costs (i.e. the workload incurred in effecting the work). IWS costs comprise user costs (Uc) and computer costs (Cc), and can be physical and mental.

The EDP conception specifies relations between its representations, which can be validated. First, the user and computer models contain sufficient structures to support behaviours, which carry out the task-goal structure (TGS), whilst incurring IWS costs. Second, the TGS achieves the product goal, for some Tq. The EDP conception thus relates statements of performance to the models, supporting operationalisation, and so, validation. EDPs comprise conceptualised, operationalised, tested and generalised design knowledge, which supports the diagnosis and prescription of a class of design solutions (CDS), which satisfies a class of design problems (CDP). Thus, the specification of design problems and design solutions, at the class level, is a pre-requisite, and so a necessary process, of EDP development. To support the ultimate specification of EDPs, there is a need iteratively to identify CDPs and their CDSs, and to extract commonalities, both between them and between the commonalities themselves. The latter constitute an EDP, which applies to all design problems within its scope. EDPs offer the possibility to specify, then implement, design solutions to ‘hard’ (determinate) design problems.

EDPs may be contrasted with HCI design patterns (Borchers 2001), which focus on a part of a design solution, rather than the complete solution. A complete solution offers more promise for performance guarantees, as the application scope is more constrained.

In this paper, the process of CDS development is reported, to illustrate the application of an EDP development strategy. The result, comprising class representations of the user, computer, domain and TGS for the CDP and CDS, is exemplified (due to space limitations) by the single task ‘find total price’.

DEVELOPMENT STRATEGY

Two strategies for EDP development were described in Cummaford & Long (1998). The ‘initial instance’ strategy (Stork & Long, 1994) generates specific design solutions to specific design problems, then abstracts commonalities, between a problem and its solution, then between these commonalities. This approach is ‘bottom up’. This strategy was contrasted with the ‘initial class’ strategy, proposed by Cummaford & Long, in which ‘promising’ candidate CDPs are identified, by abstraction of commonalities from (constructed) design problems, and corresponding CDSs constructed (where possible). This strategy thus constrains EDP development effort by focusing on domains, which offer ‘promise’ for class design knowledge (although there is no guarantee). This approach is ‘top down’.

SELECTION OF PROMISING CLASS

This research conducted a review of four e-shops, from a wider range, to evaluate similarities in product goal, users and devices, comprising the IWS, and Pa and Pd. All shops support a similar product goal, namely, the exchange of goods for funds, when certain criteria apply. The user groups appear also to be similar, as members of the general public, having knowledge of using websites, and making payments with credit cards (or similar payment technologies). The e-shops sell different goods (e.g. books, CDs, tea, etc.), but comprise similar structures, e.g. ‘virtual shopping cart’, which support similar behaviours, e.g. ‘display subtotal’. Pd is also assumed to be similar for the shops, i.e. to enable all users (fulfilling the criteria), to complete a purchase (high Tq), with minimum workload (low Uc). The similarities between the product goal, IWS, and Pd, for the e-shops reviewed, are considered sufficient to offer promise for the development of a CDP and CDS (and so ultimately an EDP) for e-shop transaction systems.

PROCESS

The following process specified the CDP and CDS. The stages are shown in Figure 2. Of the 6 SDPs represented, two e-shops were reviewed and specified as SDPs; the remaining two e-shops were only reviewed (i.e. as potential SDPs); and other e-shops were available for review more generally (i.e. 5 …n). CDP and CDS specification involves first specifying SDPs [1], by testing existing systems and identifying instances for which Pa ≠ Pd. The CDP is then specified, by abstracting commonalities from them [2]. The resulting model is evaluated, by checking that the class user and computer models can be operationalised to complete the TGS for each SDP [3]. Existing HCI design methods are applied to the CDP, to specify a CDS [4]. To evaluate Pa of the CDS, it is decomposed into SDSs, to enable testing by the target user group [5]. Pa is abstracted from the SDSs’ performances [6]. A complete CDP and CDS were specified by the present research, using this process. The complete CDP and CDS models are not presented here, due to space limitations. However, complete models, relating to a particular goal of the TGS (‘find total price’), are presented, to exemplify the process completely. The exemplification demonstrates how the models are related to performance, as required by the CDP and the CDS. In what follows, for each stage of the process, the requirement of the EDP conception is stated first, followed by its instantiation in the partial operationalisation of the EDP strategy.

Figure 2: CDP & CDS specification process

1: Specify SDPs

Requirement

First, a promising class of systems is identified, on the basis of (informal) similarities in work performed. Second, examples of such systems are selected for testing. If these systems achieve a desired level of performance, they constitute specific design solutions (SDSs). If Pa ≠ Pd, they constitute specific design problems (SDPs). The SDP representations specify the domain objects, sufficient to characterise the work performed by the IWS, to achieve the product goal, by operationalisation of the TGS. The IWS, comprising user and computer models, specifies the structures, sufficient to support behaviours, to achieve the TGS. Performance, expressed as Tq and Uc and Cc, is established empirically.

Instantiation

In this research, of the four e-shops reviewed, two (Amazon.com and Barnes & Noble.com) were selected for testing. The e-shops achieved similar product goals, but exhibited different behaviours. The testing required the users to attempt to complete a range of tasks. The latter ensured that relevant aspects of the transaction were evaluated, e.g. removing items from the order, as well as adding them. Actual Tq was lower than desired, as not all users completed the tasks. Uc were inferred from user behaviours[1] and verbal protocols, and enumerated in a Costs Matrix (Cummaford & Long 1999). The SDPs are not included here, due to space limitations, but differences between the SDPs and the CDP are reported later.

2: Specify CDP

Requirement

Following SDP specification, commonalities are abstracted to construct the CDP. This abstraction comprises common aspects of the SDPs, to provide an initial CDP expression. The domain model is also abstracted, to express the product goal in terms of domain transformations. The TGS is then similarly abstracted. The IWS model is likewise abstracted from the SDPs. As class users cannot be tested as such, Pa for the CDP is derived from the SDPs tested.

Instantiation

Here, the CDP domain model was abstracted from the SDP domain models tested. The domain transformations to achieve the product goal were then specified. The product goal is presented here as text, with an example of its domain transformations. The TGS, to achieve the product goal was then specified, by abstraction from the SDP TGSs. The U-class and C-class, comprising the IWS, were then specified, to contain the minimal set of behaviours to enable the domain transformations to effect the TGS. Performance was then derived by calculating the mean performances of the SDPs tested.

Domain model

Only those components of the domain model, sufficient to specify the total price of the goods ordered, are presented here (see earlier). Domain objects (e.g. ‘user’) have affordant attributes (e.g. total price), the values of which are transformed by the IWS. Dispositional attributes comprise information, which is needed, in order to complete the product goal (e.g. user location), but their values are not transformed.

The domain model was the same for the two tested SDPs, and so was synthesised as the CDP. However, the reviewed e-shops sold different goods (e.g. books, CDs, tea, etc.). The goods were therefore specified generally as ‘homogeneous physical goods, of which multiple units of multiple items are typically ordered’. This specification constrains instances of the CDP to transaction systems, which support collation of an order, containing multiple items, and which incur shipping (i.e. delivery) price, as part of the total transaction price.

Figure 3: Partial domain model for transaction processing

Product goal

The product goal comprises tasks, which must be supported, under specific conditions. Here, the design solution must likewise support the exchange of goods for payment in real-time. In addition to enabling transactions, the system must support the user in gathering pricing information to inform later purchasing decisions. The product goal for the class of physical goods transaction systems is specified as:

Transfer ownership rights of goods from vendor to customer
Transfer goods & shipping price from customer account to vendor account
Transport goods from current location to customer’s address

when the following conditions hold:

User desires to complete a transaction in response to offer from vendor
User has sufficient funds in appropriate format
User address is in legal domain of contract for vendor

In the CDP, the product goal is expressed in terms of transformations of domain object attribute values, which are presented using the syntax ‘domain_object/attribute [value]’. For example, the first statement in the product goal is expressed as:

Transform item/owner from [vendor] to [customer] for all products for which product/ordered is [true].

The relationship between domain transformations, and IWS behaviours to effect these transformations, is represented in the TGS.
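The ‘domain_object/attribute [value]’ syntax can be illustrated with a minimal sketch. The class `DomainObject`, the function `transform`, and the three example items are hypothetical, introduced here only for illustration; they are not part of the CDP models. As a simplification, ‘item’ and ‘product’ are treated as one object carrying both an `owner` and an `ordered` attribute.

```python
from dataclasses import dataclass, field


@dataclass
class DomainObject:
    """A domain object whose named attributes each hold one value."""
    name: str
    attributes: dict = field(default_factory=dict)


def transform(obj, attribute, old, new):
    """Apply 'Transform obj/attribute from [old] to [new]'.

    Returns True if the attribute held the expected old value and was changed.
    """
    if obj.attributes.get(attribute) != old:
        return False
    obj.attributes[attribute] = new
    return True


# Hypothetical order: two items are ordered, one is not.
items = [
    DomainObject("item_1", {"owner": "vendor", "ordered": True}),
    DomainObject("item_2", {"owner": "vendor", "ordered": True}),
    DomainObject("item_3", {"owner": "vendor", "ordered": False}),
]

# Transform item/owner from [vendor] to [customer]
# for all products for which product/ordered is [true].
transferred = [
    item.name
    for item in items
    if item.attributes["ordered"]
    and transform(item, "owner", "vendor", "customer")
]
owners = {item.name: item.attributes["owner"] for item in items}
```

Only the items with `ordered` set to true change owner; the unordered item remains with the vendor, which mirrors the condition attached to the first product goal statement.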

Task-goal structure

Here, the TGS was abstracted from the TGSs of the two SDPs tested. A range of shopping tasks was specified, to support comparison of costs between the CDP and CDS. These tasks included ordering single and multiple units of goods, deleting goods from an order, finding total price, and completing the transaction. Only Task 4 in the TGS is shown, as the exemplification of the CDP models, establishing performance. Task 4 requires the domain transformation: ‘Transform user/total price from [unknown] to [amount]’.

Figure 4: Task-goal structure: Task 4

User model

Here, the user model (U-class) was abstracted from the user models in the two SDPs tested. The mental behaviours were specified by positing the minimal set of behaviours, sufficient to effect the TGS, to achieve the product goal. User behaviours were specified analytically, and then evaluated, using verbal protocols from the SDP tests. Here, only user behaviours, sufficient to achieve Task 4 of the TGS, are specified.

Figure 5: Partial user model

Computer model

The computer model was abstracted from the computer models of the two SDPs tested, as for the user model. The computer model contains the minimal set of behaviours, to achieve the TGS.

Figure 6: Partial computer model

Performance

Performance is abstracted from testing the SDPs, instances of the CDP. Tq is expressed as the percentage of successful attempts to complete the product goal. IWS costs are expressed as the number of behaviours performed. Costs are measured, using the Costs Matrix. Cc are not reported here, due to space limitations.

Task quality

80% of tasks were successfully completed during SDP testing. Tq is less than desired, that is, 100%.

User costs

Uc are incurred, whilst learning to achieve the product goal (‘set-up costs’), or during execution of the TGS behaviours (‘runtime costs’). Acceptable set-up costs are essentially zero, as target users are able to complete a transaction with no previous training (they are specified as having used the internet previously, and having experience of using credit-card-style payment technologies). Pd is for lower runtime Uc than the SDPs. An increase in Cc is considered acceptable, if this supports a reduction in Uc, or an increase in Tq (as required earlier).

Runtime costs were calculated using the Costs Matrix, one of which was constructed for each user interacting with each SDP. The matrix was completed by giving a score of 1 for every observed or inferred user behaviour. The mean values for users were then calculated (shown in Table 1). Costs support diagnosis of ineffectiveness in the SDPs, and enable comparison with potential design solutions. The tasks in Table 1 constitute the TGS to achieve the product goal. Similar tasks were used during evaluation of the CDS, to support direct comparison of performance. Tasks 1-3 involved ordering goods; Tasks 4 & 6 were ‘find total price’; Task 5 was ‘delete items’; and Task 7 was ‘complete transaction’.

Task:	1	2	3	4	5	6	7	Total
Encode	3.5	4	6.5	4	2.5	3	1	
Execute	10	12.5	11	4	6	4	7	
Total abstract behaviour cost	13.5	16.5	17.5	8	8.5	7	8	79
Search screen	2.5	3.5	3.5	3.5	3.5	3	1	
Click	3	3.5	5	13.5	1.5	5.5	1	
Keystroke		4.5	1	122.5	1	12		
Total physical behaviour cost	5.5	11.5	9.5	139.5	6	20.5	2	194.5

Table 1: CDP Costs Matrix
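The totals in Table 1 can be checked by summing the behaviour rows for each task. The sketch below does so under two assumptions that are not stated in the paper: blank cells are treated as zero, and the five Keystroke values belong to Tasks 2 to 6, which is the only alignment consistent with the printed column totals. The variable and function names are illustrative only.

```python
# Mean user costs per task (Tasks 1-7), copied from Table 1.
cdp_costs = {
    "Encode": [3.5, 4, 6.5, 4, 2.5, 3, 1],
    "Execute": [10, 12.5, 11, 4, 6, 4, 7],
    "Search screen": [2.5, 3.5, 3.5, 3.5, 3.5, 3, 1],
    "Click": [3, 3.5, 5, 13.5, 1.5, 5.5, 1],
    # Assumption: blank cells (Tasks 1 and 7) count as zero.
    "Keystroke": [0, 4.5, 1, 122.5, 1, 12, 0],
}

ABSTRACT_ROWS = ("Encode", "Execute")
PHYSICAL_ROWS = ("Search screen", "Click", "Keystroke")


def per_task_totals(rows):
    """Sum the named behaviour rows column by column (one column per task)."""
    return [sum(column) for column in zip(*(cdp_costs[row] for row in rows))]


abstract_per_task = per_task_totals(ABSTRACT_ROWS)
physical_per_task = per_task_totals(PHYSICAL_ROWS)
abstract_total = sum(abstract_per_task)
physical_total = sum(physical_per_task)
```

Under these assumptions the computed per-task and overall totals agree with the ‘Total abstract behaviour cost’ and ‘Total physical behaviour cost’ rows of Table 1. The keystroke cost is concentrated in Task 4, the first ‘find total price’ task.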

The CDP, here, comprises all the representations, specified in the EDP conception, and as such, is considered acceptable.

3: Evaluate CDP

Requirement

To ensure that the CDP is sufficient to characterise SDP behaviours, it is evaluated analytically, with respect to the TGS for each SDP. The IWS behaviours of the CDP are operationalised, to achieve the TGS, for each SDP. If the CDP IWS achieves the TGS, then the CDP is retained. If there are insufficient behaviours, then SDP / IWS differences must be re-synthesised. If they are too dissimilar to support synthesis, then a CDP cannot be abstracted.
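The analytic check described above can be sketched as a set comparison: for each SDP, the behaviours required by its TGS are compared with the behaviours available in the class IWS model. This is a minimal illustration, not the evaluation procedure used in the research. The behaviour names are borrowed from the Costs Matrix rows, and the SDP labels and their requirements are hypothetical.

```python
def missing_behaviours(class_behaviours, sdp_requirements):
    """For each SDP, list the behaviours its TGS needs that the class model lacks."""
    available = set(class_behaviours)
    return {
        sdp: sorted(set(required) - available)
        for sdp, required in sdp_requirements.items()
    }


def cdp_is_retained(gaps):
    """The CDP is retained only if no SDP has an unmet behaviour."""
    return all(len(missing) == 0 for missing in gaps.values())


# Behaviours in the class IWS model (names taken from the Costs Matrix rows).
CLASS_BEHAVIOURS = {"encode", "execute", "search screen", "click", "keystroke"}

# Hypothetical behaviours required by the TGS of each SDP.
SDP_REQUIREMENTS = {
    "sdp_a": ["encode", "execute", "search screen", "click"],
    "sdp_b": ["encode", "execute", "search screen", "click", "keystroke"],
}

gaps = missing_behaviours(CLASS_BEHAVIOURS, SDP_REQUIREMENTS)
retained = cdp_is_retained(gaps)
```

If any SDP reported a non-empty list of missing behaviours, the differences between that SDP and the class IWS model would need to be re-synthesised before the CDP could be retained.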

Instantiation

The evaluated e-shops exhibited different behaviours, e.g. present subtotal on all screens, or only on the ‘virtual shopping cart’ screen. These differences resulted in a similar product goal, but with different performances. However, the aspects of the work, resulting in high user workload, were similar. They were included in the CDP. The CDP user model and domain model were operationalised analytically, to check that the TGS in each SDP could be achieved. The user model contained sufficient behaviours to achieve the TGS in each of the SDPs tested. The CDP was therefore retained.

4: Specify CDS

Requirement

Existing HCI design knowledge and methods are used to specify the CDS. The design stage specifies a TGS, which is achievable by the IWS, whilst incurring acceptable IWS costs and attaining desired Tq. Whilst performance can only be established by testing SDSs, instantiated from the CDS, analytic methods can be used to establish the likely performance of the CDS, prior to testing.

Instantiation

In this case, MUSE, a method for usability engineering (Lim & Long, 1994) was generally used, to design a CDS, which achieved Pd. Substantive HCI design knowledge was also recruited in the design. In particular, the TGS, required to effect the product goal, was informed by mercantile models, which specify transactions as having distinct phases, described as Negotiation, Agreement and Exchange (Kalakota & Whinston 1996). The outcome of the redesign was a re-engineered TGS. An example for the task ‘find total price’, is shown in Figure 7. During CDP testing, calculation of total price, during order collation, was found to be a high-cost behaviour. The CDS TGS differed from the CDP TGS (Figure 4), in that shipping options may be selected at any point in the interaction, hence supporting calculation of shipping costs by the computer, during the Negotiation Phase of the transaction.

Figure 7: Task 4 of the CDS Task-goal structure

The CDP TGS required credit card details to be entered by the user, before the computer displayed the total price, including shipping. The CDP testing indicated that this interaction reduced Tq, as some users were unwilling to enter this information, prior to knowing the total price. The CDS TGS presents the total price throughout the Negotiation Phase, enabling the user to complete a transaction, based on the total price, including shipping price (and so to achieve Tq). IWS costs, incurred whilst effecting the TGS, and the Tq achieved, were measured by testing SDSs, instantiated from the CDS. This testing is described in the next stage of the process.

5: Specify SDSs

Requirement

To evaluate the CDS, it is necessary to re-express it as specific design solutions (SDSs). As there are no class users per se, SDSs must be designed, to enable CDS testing. It is necessary to instantiate more than one SDS, to abstract CDS commonalities in performance.

Instantiation

Here, to evaluate the CDS, it was instantiated as two SDSs, to enable testing with specific users. The two SDS computer models were similar to the SDP computer models, but featured controls to allow the shipping options to be selected at any point during the Negotiation Phase. Instantiating two SDS prototypes, rather than one, allowed performance to be abstracted at the class level.

6: Evaluate CDS

Requirement

The CDS performance is abstracted from the performances attained by each of the SDSs. If Pa = Pd, then the CDS is acceptable for the CDP.

Instantiation

Here, ten users were tested on each of the SDS prototypes. Analysis by means of the Costs Matrix indicated that user workload was acceptable in both prototypes, and all users could complete the product goal (the specified criteria having been met). The mean user costs are presented in Table 2. The costs were lower than those of users interacting with the CDP systems. Desired Tq was achieved. The CDS was therefore an acceptable design solution, for the CDP.

Task:	1	2	3	4	5	6	7	Total
Encode	3.5	4	5.5	1.5	2.5	1	4.5
Execute	13	9.5	11	2	5	2	6
Total abstract behaviour cost	16.5	13.5	16.5	3.5	7.5	3	10.5	71
Search screen	2.5	3.5	3.5	2	3.5	2	3.5
Click	5	3.5	5	2	1.5		12.5
Keystroke		4.5	1		1		120
Total physical behaviour cost	7.5	11.5	9.5	4	6	2	136	176.5

Table 2: CDS Costs Matrix
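The comparison between the two Costs Matrices can be made explicit with a short sketch. The per-task totals are copied from Table 1 and Table 2. The function `is_acceptable` is one hedged reading of the stated Pd (Tq of 100% and runtime Uc lower than the SDPs), not a definition given in the paper; the CDS Tq of 100% is inferred from the statement that all users completed the product goal.

```python
# Per-task totals (Tasks 1-7) copied from Table 1 (CDP) and Table 2 (CDS).
CDP = {
    "abstract": [13.5, 16.5, 17.5, 8, 8.5, 7, 8],
    "physical": [5.5, 11.5, 9.5, 139.5, 6, 20.5, 2],
}
CDS = {
    "abstract": [16.5, 13.5, 16.5, 3.5, 7.5, 3, 10.5],
    "physical": [7.5, 11.5, 9.5, 4, 6, 2, 136],
}


def per_task_saving(kind):
    """CDP cost minus CDS cost for each task; negative means the CDS costs more."""
    return [before - after for before, after in zip(CDP[kind], CDS[kind])]


def total_saving(kind):
    """Overall reduction in cost from CDP to CDS for one kind of behaviour."""
    return sum(CDP[kind]) - sum(CDS[kind])


def is_acceptable(tq_actual, uc_actual, tq_desired, uc_baseline):
    """Assumed reading of Pd: reach the desired Tq with Uc below the baseline."""
    return tq_actual >= tq_desired and uc_actual < uc_baseline


physical_saving = per_task_saving("physical")
abstract_total_saving = total_saving("abstract")
physical_total_saving = total_saving("physical")

cdp_uc = sum(CDP["abstract"]) + sum(CDP["physical"])
cds_uc = sum(CDS["abstract"]) + sum(CDS["physical"])
cds_acceptable = is_acceptable(100, cds_uc, 100, cdp_uc)
cdp_acceptable = is_acceptable(80, cdp_uc, 100, cdp_uc)
```

The per-task differences show that the large physical cost moves from Task 4 (‘find total price’) in the CDP to Task 7 (‘complete transaction’) in the CDS, which is consistent with payment details no longer being required before the total price is displayed.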

Review

The process presented above was instantiated here, to specify a CDP and CDS for e-commerce transaction systems, as a pre-requisite for EDP development. The specification of a CDP was considered promising for the development of a CDS. This CDS was then specified, and evaluated by testing SDSs, instantiated therefrom. The results indicated that the CDS achieved Pd, and thus was an acceptable solution to the CDP.

DISCUSSION

This research raises both general and specific issues. Specific issues will be addressed first and general issues second. The specific issues are ordered, following the process stages, shown in Figure 2.

Specifying SDPs

Four e-shops were reviewed, to inform selection of systems for SDP specification. These e-shops exhibited sufficient similarities to show promise for class development. However, the number of systems tested, to inform CDP development, was somewhat limited (i.e. two). Testing more systems would give greater confidence in abstracted similarities between instances. However, the work described here is intended to demonstrate the possibility of specifying CDPs and their CDSs, as an initial and partial operationalisation of the ‘initial class’ strategy of EDP development. Once the process of developing CDPs and their CDSs has been shown to be feasible, such additional development efforts may then be justified.

The tasks, selected for inclusion in the TGS, included a range of typical shopping activities. The tasks are sufficient to characterise any transaction performed on e-commerce sites, although the number of times each task is completed will vary for each transaction. It was, however, considered desirable to standardise the tasks here, to support systematic comparison of the SDP systems, both with respect to each other, and to the SDS prototypes. Varying sub-task frequency, however, should be considered in future work.

SDP testing involved users entering payment authorisation details, supplied by the researchers. Whilst actual payment was thus simulated, users reported that they would be uncomfortable entering their own payment authorisation, before knowing the total price payable. Tq was thus based on a simulation, with its inevitable differences from actual transactions, and so might be questioned. However, such ineffectiveness has also been shown to occur for e-shops generally (Spool 2002).

User behaviours were assumed to be equivalent in terms of workload (Uc). Whilst these behaviours undoubtedly involve user workload, the relative workload, of each, may well not be equal. Empirical evidence for differential workload between behaviours could be integrated into the Costs Matrix. Physical costs were enumerated by counting directly observable user behaviours, but abstract costs were inferred, from tasks performed by the user, and from verbal protocols. A standardised set of encoding criteria informed identification of user abstract costs, which assumed the cost of encoding each page of information to be equivalent to the cost of mental (structure) activation. The criteria, and the equivalence of costs, incurred during behaviours and activation of associated structures, could usefully be explored further in future work.

Specifying the CDP

Pd for the CDP was specified relative to the Pa of the SDPs tested (i.e. increase Tq, reduce Uc). Whilst it is possible, in principle, to state Pd as absolute values, such absolute values were not employed here. Future EDP development should address this issue.

Evaluating the CDP-SDP relations

Analytic evaluation of the CDP involved checking that sufficient structures and behaviours were present in the user and computer models, to perform the TGS of each SDP. This assessment supported evaluation of the CDP user and computer models. In EDPs, the CDP is used to establish its applicability, and so this process was considered necessary, in order the better to inform the development of EDP models.

Specifying the CDS

MUSE was selected, from a range of existing HCI design methods, as it supports explicit specification of user and computer behaviours, as a system and user task model, achieving domain transformations for some level of Tq, whilst incurring Uc and Cc. These representations of IWS, work, and performance allowed the resulting MUSE models to support specification of the CDS models, with minimal re-expression. Substantive knowledge, recruited during this stage of the process, was taken from the existing literature.

Specifying the SDSs

The SDS prototypes exhibited features, which were ported from the SDP systems tested. For example, the two SDP systems exhibited different types of ‘virtual shopping cart’, aspects of which were included in the SDS prototypes. This inclusion ensured that differences in performance between the SDPs and SDSs were due to differences in the CDP and CDS, rather than to specific features of the e-shops and SDS prototypes.

Evaluating the CDS

The EDP conception, upon which this work is based, includes explicit representations of user and computer structures, which support behaviours (following the HCIe conception – see earlier). User and computer structures are not reported in this paper, due to lack of space, but structures were specified for both the user and computer models. That behaviours may be enumerated as an expression of costs, without reference to structures, suggests that structures are not a necessary component of the models, for the purpose of measuring workload, when CDP and CDS structures remain unchanged. However, structures offer a more complete characterisation of the IWS, and are necessary when CDP and CDS structures change (that is, when structure ‘set-up’ costs are not zero).

Mapping the CDP to CDS

Whilst the CDP and CDS cannot be tested directly, the differences in performances of their instantiations indicate that the CDS achieved more effective performance than the CDP. Subjective reports of workload (Likert scales of task difficulty completed by users after testing) indicate that the SDS prototypes incurred lower costs in general (as perceived by users) than the SDP systems.

General issues

This initial and partial operationalisation enabled evaluation of the research process followed. The evaluation, however, suggests the need for better specification of that process, such that subsequently developed CDPs and CDSs are likely to be at the same level of generality, and are appropriate to construct EDPs. Model granularity was not an issue here, as the models were specified by the same researchers. However, better specification of the process is needed, to support different researchers in specifying CDPs and CDSs at a commensurate and appropriate level of generality.

Whilst the CDP class identified appears promising, there is no guarantee that development will result in the construction of EDPs. Class problems may not have class solutions, and this relationship cannot be known in advance, that is, when the CDP is initially specified. In addition, to construct EDPs, it is necessary to have more than one CDP and CDS to abstract commonalities between these CDP/CDS pairs. In order to achieve this abstraction, further CDPs, and their CDSs, will need to be specified. The initial operationalisation of the ‘initial class’ strategy will therefore continue, with the aim of constructing EDPs.

The initial and partial operationalisation of the ‘initial class’ strategy presented here, thus, offers promise for the eventual development of EDPs, as validated HCI design knowledge, supported by performance guarantees. Whilst this goal has yet to be achieved, this initial and partial operationalisation indicates that the specification of class design problems and solutions is possible.

ACKNOWLEDGMENTS

This work was partially supported by an EPSRC research studentship.

REFERENCES

Borchers, J. (2001). A pattern approach to interaction design. John Wiley, Chichester, UK.

Cummaford, S.J.O. and Long, J.B. (1998). Towards a conception of HCI engineering design principles. Proc. ECCE-9, ed. Green et al; Limerick, Eire.

Cummaford, S.J.O. and Long, J.B. (1999). Costs matrix: Systematic comparison of competing design solutions. In Proc INTERACT 99, ed. Brewster et al; Edinburgh UK.

Dowell, J. and Long, J. (1989). Towards a conception for an engineering discipline of human factors. Ergonomics, 32, 1513-1536.

Kalakota, R. and Whinston, A.B. (1996). Frontiers of electronic commerce. Addison Wesley, Reading Mass. USA.

Lim, K.Y. and Long, J.B. (1994). The Muse method for usability engineering. Cambridge University Press, Cambridge, UK.

Stork, A. and Long, J.B. (1994). A specific planning and control design problem in the home: rationale and a case study. In Proceedings of the International Working Conference on Home-Oriented Informatics, Telematics and Automation. University of Copenhagen, Denmark. 419-428.

Spool, J. (2002). The customer sieve. Online report, available at world.std.com/~uieweb/Articles/customer_sieve.htm

Sutcliffe, A. (2002). On effective use and re-use of HCI knowledge. In Human-Computer Interaction in the new millennium. J.M. Carroll (Ed); Addison Wesley.

[1] Although the research specified the structures, supporting behaviours, as required by the HCIe conception (see earlier), only behaviours are reported here, due to space limitations.


John Long Comments on Festschrift Published Papers

John

May 23, 2012

In my own contribution to the Festschrift (Some Celebratory HCI Reflections on a Celebratory HCI Festschrift), I celebrated ‘the Festschrift papers themselves (both accepted and rejected), their authors and their reviewers’. I also went on to write: ‘My natural instinct is to peer review the papers. Space and my honoured status forbid such a review. However, I hope to do this elsewhere (I owe it to the authors and myself)’. The elsewhere has arrived and is now here.

However, on reflection, to peer review the Festschrift papers now seems inappropriate. They were peer reviewed before publication and are unlikely to be re-published, at least in their present form. Nevertheless, I would still like to respond to the papers (‘what I owe myself’) and to contribute to the ideas, expressed in them (‘what I owe the authors’). My ‘response’ and ‘contribution’, then, take the form of a commentary: a set of comments on the ideas put forward by the authors. The comments range widely, from simple clarifications to complex suggestions as to how the ideas might be developed further. The comments are intended to be constructive, even when critical. They are my way of expressing my thanks to the authors for their contributions to the Festschrift. I hope they find my comments both interesting and useful.


‘1979/80’ John Long

John

November 18, 2012

Date of ‘MSc’: ‘1979/1980’.

MSc Project Title:

None.

Pre-‘MSc’ Background:

BA Modern Languages, Cambridge (1959); manager Shell Oil International, Africa and Vietnam (9 years); BSc Psychology, Hull (1970); PhD, Cambridge (1974), then researcher, MRC/Applied Psychology Unit (9 years); Reader then Professor UCL (21 years).

Pre-‘MSc’ View of HCI/Cognitive Ergonomics:

I studied for my PhD and worked at the MRC/APU Cambridge under Donald Broadbent. I began research into HCI with colleagues in the early 1970s in collaboration with IBM. Unsurprisingly, I viewed HCI essentially as Applied Psychology. Psychology theories of HCI can be developed and then applied to solving design problems. We researchers developed the theories; practitioners were intended to solve the problems with the knowledge we acquired.

Post-‘MSc’ View of HCI/Cognitive Ergonomics:

No real change from that of before ‘MSc’. HCI is still viewed as Applied Psychology. If design problems include a physical aspect (for example fatigue or posture) Physiology and Biomechanics theories can be applied in addition to those of Psychology. The ‘new’ Cognitive Ergonomics, then, complements the ‘old’ Physical Ergonomics.

Subsequent-to-‘MSc’ View of HCI/Cognitive Ergonomics:

My view has changed radically, during this period. I employed a number of (outstanding) engineers at this time to join me in the conduct of HCI research – John Dowell; Kee Yong Lim; Ian Salter; and Adam Stork, most of whom were MSc then PhD students. Separately and together, they forced me to make the following changes to my Applied Science view of HCI: 1. HCI Science is the understanding, that is, explanation and prediction, of HCI natural phenomena. 2. HCI Engineering is the design for performance, that is design problem diagnosis and prescription, of artefacts (see Hill (2010) and Salter (2010)). 3. Scientific theories, such as models and methods, can contribute to Engineering practice; but they need to be (re-)formulated for design purposes and validated by Engineering practice (see Denley and Long, 2010). Since the middle 80s, I have been teaching, researching and practising HCI, using an engineering approach, including the search for HCI Engineering Principles.

Additional Reflections

Directing the EU was the best job I ever had (and I have had some very good jobs). It was a great place to work (hard) and to play (hard); but very much in that order. The EU attracted some very brilliant students and staff. I sometimes wondered who was supervising whom. I still claim that I never did any of them (permanent) mental damage. I remain grateful to them and their contributions to a very special place to work. ‘Task Quality’ was very high and so as ‘Desired’, although it has to be said that ‘User Costs’ were even higher and only just about ‘Acceptable’. I still count many as my friends and we continue to meet up yearly to remember the ‘good old EU’ and to celebrate our luck and good fortune to have been part of it.

Making our own fun at an EU party…..

JL ‘rapping’ – Ho! Ho! Ho!

Couldn’t (w)rap a cod and chips, ready to go.

1979/80: Tony Rubin

John

December 15, 2012

Date of MSc: 1979/80

MSc Project Title:

Restricted Headroom and its Effects on Manual Exertion

Pre-MSc Background:

Science A Levels, Physiology Degree

Pre-MSc View of HCI/Cognitive Ergonomics:

Non-existent – I came to the MSc course with a strong predisposition towards the more physical and physiological aspects of Ergonomics.

Post-MSc View of HCI/Cognitive Ergonomics:

I came away from the course recognising that interaction with the ‘man-machine’ interface encompassed the cognitive as well as the physical. An appreciation of the importance of psychology in any workplace design also came about from considering some rudimentary display technologies. The impact of cognitive design considerations in control rooms, control panels and control systems was made clear from case studies of incidents such as the (then) recent 3-Mile Island nuclear accident, the Papa India air-crash and the Torrey Canyon oil spill disaster. All of these had some element of faulty or counter-intuitive user interface design that was at least partially responsible for the ensuing disaster. Since widespread personal computing was still in its infancy and the internet was still some way off, these exemplars were used as currency in discussing poor cognitive design.

Subsequent-to-MSc View of HCI/Cognitive Ergonomics:

I ended up working in the Human Factors research department at the Post Office Research Centre, which subsequently became the BT Research Labs. I then spent several years working on a variety of cognitive aspects of HCI – mainly on email and remote group working and interaction. This work culminated in a book:

‘User Interface Design for Computer Systems’, (Ellis-Horwood, 1988).

Additional Reflections:

I decided to apply for the MSc Ergonomics course quite late in the day. I had just finished my Physiology degree at Chelsea College where Rainer Goldsmith was the professor and I had been looking to get funding to study for a physiology-based PhD. As various PhD opportunities came and went during the summer of 1979, I decided that my interest in applied physiology that had been aroused at Chelsea could be usefully stretched to encompass Ergonomics so I applied for the MSc almost as a back up whilst I continued to seek an interesting and funded PhD.

Once all my PhD options had dried up I was more than happy to fully embrace the MSc Ergonomics course, which by then had offered me a place. It was John Long’s inaugural year as Director of Studies; undoubtedly our decision to ‘join’ the MSc, albeit in very different capacities, was to profoundly affect the future paths of our careers.

The following reflections are, at the time of writing, being generated some 33 years ex post facto and therefore subject to all the normal frailties of human memory and so come with the expected health warnings regarding accuracy.

All of the 10-12 students on the course were male (and Rachel in particular will testify that this set up an unusually and highly competitive course dynamic) and from a wide variety of backgrounds. True to its interdisciplinary roots the course had attracted representatives from psychology, engineering and physiology; though I think psychologists formed the largest constituency.

The course itself offered a rich selection of guest lecturers mainly, but not exclusively, from colleges within the University of London. The breadth of topics covered was in hindsight quite staggering, and for a physiologist with a negligible grounding in engineering or psychology, quite daunting. This poses an intellectual and practical question for the multi-disciplinary course designer: How much course time should be devoted to bringing all the students up to an acceptable level of knowledge in the academic disciplines that are new to them?

The variety was a real strength of the course making each week very different and it kept us students constantly on our toes not only mentally but also physically since we were forever dashing from Bedford Way to the Royal Free at Hampstead or Birkbeck or the RAE at Farnborough – there was never a dull moment.

Because personal computing was still in its infancy (there were a few Apple IIs and Commodore Pets around some university departments) the application of Ergonomics to user interface design and other computer-based tasks was brand new and not a core component of the MSc in those far-off days. The focus of the course was perhaps more centred on the physical and gross engineering design aspects of the workplace and its environment. This is no surprise but it makes one appreciate how much, in the last 30 or so years, first world countries have become knowledge-based economies with vast numbers of workers desk-bound and screen-dependent. It is only a slight exaggeration to claim that for ergonomists in 1980 Etienne Grandjean and popliteal height ruled!

Because there were so few of us on the course, there were ample opportunities for group discussions. Because John Long was new too, he attended many of the visits with us, and I recall a number of entertaining debates en route, which John led or sometimes refereed.

Visits to coal mines, control rooms and military research establishments were popular features of the course and these punctuated the regular diet of guest lectures at frequent intervals. The visits were always eye-opening – most of the students were young and therefore relatively unfamiliar with the world of work and the many design challenges to be analysed and surmounted in every aspect of the work environment. Our naivety meant that every visit and every new workplace caused us to rethink and recalculate the size and scope of the opportunity for improvement based on the well-rounded multi-disciplinary approach that ergonomics offered. Most of us had yet to encounter the twin perils of corporate budget constraints and corporate inertia, which made (and probably still make) design changes to anything but the most health and safety threatening of issues a political encounter first and foremost.

In hindsight this naivety could have been anticipated and we might have benefited from an appreciation not only of the different battles that the ‘Corporate Ergonomist’ would need to fight but also the different theatres of war that we would need to be equipped to fight in. From ‘death by committee’ to corporate apathy, the working ergonomist needs a good arsenal at his disposal just to earn the right to apply the design principles and theory that were diligently acquired on the MSc. I for one started my life as a working ergonomist (or to be precise a human factors specialist) at the then Post Office Research Centre with an expectation that there was no question that any recommendations for improvements that I came up with would be hastily adopted.

It would be churlish to be overly critical of a course that was so instrumental in determining my future but away from the academic aspects of the course, I recall that in a sense we did not ‘belong’ in the same way that undergraduates ‘belong’ within the university and collegiate environment. At the time I just thought that this was how post-graduate study was supposed to be; but I see now that this was partially as a result of the multi-disciplinary and peripatetic nature of the course itself. As a body of students we did not really mix with our contemporaries on either a social or intellectual basis. Some of this was because we were completing an MSc in a single academic year and there was little time to fraternise with the wider college community, but there was also little opportunity. Maybe this was because our small designated area in Bedford Way was on the periphery of the main campus and we did not share many common areas with students of other disciplines. Whatever the reason I do not think at that time we felt part of a wider college community.

Besides the academic credentials that accrued after completing the MSc, I for one, left with an appreciation of the stimulating challenges and opportunities that a working ergonomist was going to encounter. I think that I left the course pretty well equipped to apply sound design principles and more importantly to develop both research methodologies and practical techniques that allowed me to fulfil a valuable role as an ergonomist.

Old Papers Never Die – They Only Fade Away……

John

May 20, 2013

Interacting with the Computer: a Framework

John Long Comment 1

The title remains an appropriate one. However, given its subsequent references to: ‘domains’; ‘applications’; ‘application domains’; ‘tasks’ etc, it must be assumed that the interaction is: ‘to do something’; ‘to perform tasks’; ‘to achieve aims or goals’; or some such. Further modeling of such domains/applications, beyond that of text processing, would be required for any re-publication of the paper and in the light of advances in computing technology – see earlier. The issue is pervasive – see also Comments 6, 35, 37, 40 and 41.

Comment 2

‘A Framework’ is also considered to be appropriate. Better than ‘a conception’, which promises greater completeness, coherence and fitness-for-purpose (unless, of course, these criteria are explicitly taken on-board). However, the Framework must explicitly declare and own its purpose, as later set out in the paper and referenced in Figure 1. See also Comments 15, 19, 27, and 42.

J. Morton, P. Barnard, N. Hammond* and J.B. Long

M.R.C. Applied Psychology Unit, Cambridge, England *also IBM Scientific Centre, Peterlee, England

Recent technological advances in the development of information processing systems will inevitably lead to a change in the nature of human-computer interaction.

Comment 3

‘Recent technological advances’ in 1979 centred around the personal, as opposed to the main-frame, computer. To-day there is a plethora of advances in computing technology – see Commentary Introduction earlier for a list of examples. A re-publication of the paper would require a major up-date to address these new applications, as well as their associated users. Any up-date would need to include additional models and tools for such address, as well as an assessment of the continued suitability of the models and tools, proposed in the ’79 paper.

Direct interactions with systems will no longer be the sole province of the sophisticated data processing professional or the skilled terminal user. In consequence, assumptions underlying human-system communication will have to be re-evaluated for a broad range of applications and users. The central issue of the present paper concerns the way in which this re-evaluation should occur.

First of all, then, we will present a characterisation of the effective model which the computer industry has of the interactive process.

Comment 4

We contrasted our ’79 models/theories with a single computer industry’s model. To-day, there are many types of HCI model/theory. A recent book on the subject listed 9 types of ‘Modern Theories’ and 6 types of ‘Contemporary Theories’ (Rogers, 2012). The ‘industry model’ has, of course, itself evolved and now takes many forms (Harper et al., 2008). Any re-publication of the ’79 paper would have to specify both with which HCI models/theories it wished to be contrasted and with what current industry models.

The shortcoming of the model is that it fails to take proper account of the nature of the user and as such can not integrate, interpret, anticipate or palliate the kinds of errors which the new user will resent making. For remember that the new user will avoid error by adopting other means of gaining his ends, which can lead either to non-use or to monstrously inefficient use. We will document some user problems in support of this contention and indicate the kinds of alternative models which we are developing in an attempt to meet this need.

The Industry’s Model (IM)

The problem we see with the industry’s model of the human-computer interaction is that it is computer-centric. In some cases, as we shall see, it will have designer-centric aspects as well.

Comment 5

In 1979, all design was carried out by software engineers. Since then, many other professionals have become involved – initially psychologists, then HCI-trained practitioners, graphic designers, ethnomethodologists, technocratic artists etc. However, most design (as opposed to user requirements gathering or evaluation) is still performed by software engineers. Any re-publication of this paper would have to identify the different sorts of design activity, to assess their relative contribution to computer- and designer-centricity respectively and the form of support appropriate to each, which it might offer – see also Comments 14, 15, 21 (iv) and (v), 27 and 41 (iv).

To start off with, consider a system designed to operate in a particular domain of activity.

Comment 6

Any re-published paper would have to develop further the concept of ‘domain’ (see Comment 1). The development would need to address: 1. The computer’s version of the domain and its display thereof. There is no necessary one-to-one relationship (consider the pilot alarm systems in the domain of air traffic management). Software engineer designers might specify the former and HCI designers the latter; and 2. To what extent the domain is an ‘image of the world and its resources’. See Comments 1, 35, 37, 40 and 41.

In the archetypal I.M. the database is neutralised in much the same kind of way that a statistician will ritually neutralise the data on which he operates, stripping his manipulation of any meaning other than the purely numerical one his equations impose upon the underlying reality. This arises because the only version of the domain which exists at the interface is that one which is expressed in the computer. This version, perhaps created by an expert systems analyst on the best logical grounds and the most efficient, perhaps, for the computations which have to be performed, becomes the one to which the user must conform. This singular and logical version of the domain will, at best, be neutral from the point of view of the user. More often it will be an alien creature, isolating the user and mocking him with its image of the world and its resources to which he must haplessly conform.

Florid language? But listen to the user talking.

Comment 7

The ’79 user data are now quite out of date, both in terms of their content, means of acquisition and associated technology, compared with more recent data. However, current user-experience continues to have much in common with that of the past. Up-dated data are required to confirm this continuity.

“We come into contact with computer people, a great many of whom talk a very alien language, and you have constant difficulty in trying to sort out this kind of mid-Atlantic jargon.”

“We were slung towards what in my opinion is a pretty inadequate manual and told to get on with it”

“We found we were getting messages back through the terminal saying there’s not sufficient space on the machine. Now how in Hell’s name are we supposed to know whether there’s sufficient space on the machine?”

In addition the industry’s model does not really include the learning process; nor does it always take adequate note of individuals’ abilities and experience:

“Documentation alone is not sufficient; there needs to be the personal touch as well.”

“Social work being much more of an art than a science then we are talking about people who are basically not very numerate beginning to use a machine which seems to be essentially numerate.”

Even if training is included in the final package it is never in the design model. Is there anyone here, who, faced with a design choice asked the questions “Which option will be the easiest to describe to the naive user? Which option will be easiest to understand? Which option will be easiest to learn and remember?”

Comment 8

Naive users, of course, continue to exist to-day. However, there are many more types of users than naive and professional of interest to current HCI researchers. Differences exist between users of associated technologies (robotic versus ambient); from different demographics (old versus young); at different stages of development (nursery versus teenage children); from different cultures (developed versus less developed) etc. These different types of user would need some consideration in any re-publication.

Let us note again the discrepancy between the I.M. view of error and ours. For us errors are an indication of something wrong with the system or an indication of the way in which training should proceed. In the I.M. errors are an integral part of the interaction. For the onlooker the most impressive part of a D.P. interaction is not that it is error free but that the error recovery procedures are so well practised that it is difficult to recognise them for what they are.

Comment 9

As well as this important distinction, concerning errors, they need to be related to ‘domains’, ‘applications’ and ‘effectiveness’ or ‘performance’ and not just user (or indeed computer) behaviour. See Comment 6 earlier and Comments 35, 36, 37 and 38 later.

Errorless performance may not be acceptable (consider air traffic expedition). Errorful behaviour may be acceptable (consider some e-mail errors). A re-published ’79 paper would have to take an analytic/technical (that is, Framework grounded) view of error and not just a simple adoption of users’ (lay-language) expression. This problem is ubiquitous in HCI, both past and present.

We would not want it thought that we felt the industry was totally arbitrary. There are a number of natural guiding principles which most designers would adhere to. See also Comment 16.

Comment 10

We contrast here two types of principle, which designers might adhere to: 1. IM principles, as ‘intuitive, non-systematic, not totally arbitrary’; and 2. our proposed principles, as ‘systematic’. In the light of this contrast, we need to set out clearly: 1. What are our principles and how are they ‘systematic’? and 2. How does this systematicity guarantee/support better design?

Note that in Figure 1 later, there is an ‘output to system designers’. Is this output expressed in (systematic) principles? If not, what would be its form of expression? Any form of expression would raise the same issues raised earlier for ‘systematic principles’.

We do not anticipate meeting a system in which the command DESTROY has the effect of preserving the information currently displayed while PRESERVE has the effect of erasing the operating system. However, the principles employed are intuitive and non-systematic. Above all they make the error of embodying the belief that just as there can only be one appropriate representation of the domain, so there is only one kind of human mind.

A nice example of a partial use of language constraints is provided by a statistical package called GENSTAT. This package permits users to have permanent userfiles and also temporary storage in a workfile. The set of commands associated with these facilities is:

PUT – copies from core to workfile

GET – copies from workfile to core

FILE – defines a userfile

SAVE – copies from workfile to userfile

FETCH – copies from userfile to workfile

The commands certainly have the merit that they have the expected directionality with respect to the user. However to what extent do, for example, FETCH and GET relate naturally to the functions they have been assigned? No doubt the designers have strong intuitions about these assignments. So do users and they do not concur. We asked 40 people here at the A.P.U. which way round they thought the assignments should go: 19 of these agreed with the system designers, 21 went the other way. The confidence levels of rationalisations were very convincing on both sides!
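The command set and the survey split can be restated as a short Python sketch. The command table and the 19/21 split come from the paper; the exact binomial test against an even split is an added illustration, not an analysis the authors report, and the function names are hypothetical.

```python
from math import comb

# GENSTAT file commands as listed in the paper: name -> (source, destination).
# FILE only defines a userfile, so it has no transfer direction and is omitted.
GENSTAT_COMMANDS = {
    "PUT": ("core", "workfile"),
    "GET": ("workfile", "core"),
    "SAVE": ("workfile", "userfile"),
    "FETCH": ("userfile", "workfile"),
}


def agreement_share(agree: int, total: int) -> float:
    """Fraction of respondents whose intuition matched the designers' assignment."""
    return agree / total


def two_sided_binomial_p(k: int, n: int) -> float:
    """Exact two-sided p-value against a 50/50 split (illustrative assumption).

    Sums the probability of every outcome that is no more likely than k.
    """
    pmf = [comb(n, i) / 2**n for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-12)  # small tolerance for float comparison
    return sum(p for p in pmf if p <= threshold)


share = agreement_share(19, 40)        # 19 of 40 agreed with the designers
p_value = two_sided_binomial_p(19, 40)
print(f"agreement: {share:.1%}, two-sided p against chance: {p_value:.3f}")
```

On this reading, the respondents' intuitions are indistinguishable from a coin toss, which is the point the paragraph makes informally.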

The problem, then, is not just that systems tend to be designer-centric but that the designers have the wrong model either of the learning process or of the non-D.P. users’ attitude toward error. A part-time user is going to be susceptible to memory failure and, in particular, to interference from outside the computer system. du Boulay and O’Shea [1] note that naive users can use THEN in the sense of ‘next’ rather than as ‘implies’. This is inconceivable to the IM for THEN is almost certainly a full homonym for most D.P. and the appropriate meaning thoroughly context-determined.

Comment 11

The GENSTAT example was so good for our purposes, that it has taken considerable reflection to wonder if there really is a natural language solution, which would avoid memory failure and/or interference. It is certainly not obvious.

The alternative would be to add information to a menu or somesuch (rather like in our example). But this is just the sort of solution IM software engineers might propose. Where would that leave any ‘systematic’ principles? – see Comment 10 earlier.

An Alternative to the Industry Model

The central assumption for the system of the future will be ‘systems match people’ rather than ‘people match systems’. Not entirely so, as we shall elaborate, for in principle, the capacity and perspectives of the user with respect to a task domain could well change through interaction with a computer system.

Comment 12

In general, the alternative aims to those of the IM promise well. The mismatch, however, seems to be expressed at a more abstract level than that of the ‘task domain’ – the ‘alien creature, isolating the user and mocking him with its image of the world and its resources to which he must haplessly conform’ – see earlier in the paper. Suppose the mismatch is at this specific level, where does this leave, for example, the natural language mismatch? Of course, we could characterise domain-specific mismatches, for example, the contrasting references to ambient environment in air- and sea-traffic management, although for professional, not for naive users. Such mismatches would require a form of domain model absent from the original paper. However, the same issue arises in the domains of letter writing and planning by means of ‘to do’ lists. Either way, the application domain mismatch needs to be addressed, along with that of natural language.

But the capacity to change is more limited than the variety available in the system.

Comment 13

The contrast ‘personal versus mainframe computer’ and the parallel contrast ‘occasional/naive versus professional user’ served us very well in ’79. But the explosion of new computing technology (see Comment 3 earlier) and associated users requires a more refined set of contrasts. There are, of course, still occasional naive users; but these are mainly in the older population and constitute a modest percentage of current users. However, with demographic changes and a longer-living older population, it would not be an uninteresting sub-set of all present users. A re-publication, which wanted to restrict its range in the manner of the ’79 paper, might address ‘older users’ and domestic/personal computing technology. An interesting associated domain might be ‘computer-supported co-operative social and health care’. We could be our own ‘subjects, targets, researchers, and designers’, as per Figure 1 later.

Our task, then, is to characterise the mismatch between man and computer in such a way that permits us to direct the designer’s effort.

Comment 14

Directing the designer’s efforts are strong words and need to be linked to the notion and guarantee of principles – see Comment 10 and Figure 1 ‘output to designers’. Such direction of design needs to be aligned with scientific/applied scientific or engineering aims (see Comments 15 and 18).

In doing this we are developing two kinds of tool, conceptual and empirical. These interrelate within an overall scheme for researching human-computer interaction as shown in Figure 1.

Comment 15

Figure 1 raises many issues:

1. Empirical studies require their own form of conceptualisation, for example: ‘problems’; ‘variables’; ‘tasks’ etc. These concepts would need specification before they could be conceptualised in the form of multiple models and operationalised for system designers.

2. What is the relationship between ‘hypothesis’ and the theories/knowledge of Psychology? Would the latter inform the former? If so, how exactly? This remains an endemic problem for applied science (see Kuhn, 1970).

3. Are ‘models’, as represented here, in some (or any) sense Psychological theories or knowledge? The point needs to be clarified – see also Comment 15 (1) earlier.

4. What might be the ‘output to system designers’ – guidelines; principles; systematic heuristics; hints and tips; novel design solutions; methods; education/training etc? See also Comment 14.

5. How is the ‘output to system designers’ to be validated? There is no arrow back to either ‘models’ or ‘working hypotheses’. At the very least, validation requires: conceptualisation; operationalisation; test; and generalisation. But with respect to what – hypotheses for understanding phenomena or with respect to designing artefacts?

Relating Conceptual and Empirical Tools

Comment 16

The relationship between conceptual and analytic tools and their illustration reads like engineering. In ’79, I thought that we were doing ‘applied science’ (following in the footsteps of Donald Broadbent, the MRC/APU’s director in 1979). The distinction between engineering and applied science needs clarification in any republished version of the original paper.

Interestingly enough, Card, Moran and Newell (1983) claimed to be doing ‘engineering’. Their primary models were the Human Information Processing (HIP) Model and the Goals, Operators, Methods and Strategies (GOMS) Model. There is some interesting overlap with some of our multiple models; but also important differences. One option for a republished paper would be to keep to the ’79 multiple models. An alternative option would be to augment the HIP and GOMS with the ’79 multiple models (or vice versa), to offer a (more) complete expression of either approach taken separately.

The conceptual tools involve the development of a set of analytic frameworks appropriate to human computer interaction. The empirical tools involve the development of valid test procedures both for the introduction of new systems and the proving of the analytic tools. The two kinds of tool are viewed as fulfilling functions comparable to the role of analytic and empirical tools in the development of technology. They may be compared with the analytic role of physics, metallurgy and aerodynamics in the development of aircraft on the one hand and the empirical role of a wind tunnel in simulating flight on the other hand.

Empirical Tools

The first class of empirical tool we have employed is the observational field study, with which we aim to identify some of the variables underlying both the occasional user’s perceptions of the problems he encounters in the use of a computer system, and the behaviour of the user at the terminal itself.

Comment 17

Observational field studies have undergone considerable development since ’79. Many have become ethnomethodological studies, to understand the context of use; others have become front-ends to user-centred design methodologies, intended to be conducted in parallel to those of software engineering. Neither sort of development is addressed by our original paper. Both raise numerous issues, including: the mutation of lay-language into technical language; the relationship between user opinions/attitudes and behaviour; the relationship between the simulation of domains of application and experimental studies; the integration of multiple variables into design; etc.

The opinions cited above were obtained in a study of occasional users discussing the introduction and use of a system in a local government centre [2]. The discussions were collected using a technique which is particularly free from observer influence [3].

In a second field study we obtained performance protocols by monitoring users while they solved a predefined set of problems using a data base manipulation language [4]. We recorded both terminal performance and a running commentary which we asked the user to make, and wedded these to the state of the machine to give a total picture of the interaction. The protocols have proved to be a rich source of classes of user problem from which hypotheses concerning the causes of particular types of mismatch can be generated.
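One way to picture the ‘wedding’ of the three records is as a merge of time-stamped event streams. The following Python sketch is illustrative only: the `Event` structure, the stream names and the sample entries are invented for the example and are not taken from the original study.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    """One time-stamped entry in a protocol (hypothetical structure)."""
    t: float      # seconds from the start of the session
    stream: str   # "terminal", "commentary" or "machine"
    text: str


def merge_protocol(*streams: Iterable[Event]) -> List[Event]:
    """Interleave streams that are each already sorted by time."""
    return list(heapq.merge(*streams, key=lambda e: e.t))


# Invented sample data, for illustration only.
terminal = [Event(2.0, "terminal", "user types a query"),
            Event(9.5, "terminal", "user retypes the query")]
commentary = [Event(4.1, "commentary", "I expected a list of names"),
              Event(8.0, "commentary", "perhaps the table name is wrong")]
machine = [Event(2.3, "machine", "error message displayed")]

protocol = merge_protocol(terminal, commentary, machine)
for event in protocol:
    print(f"{event.t:5.1f}  {event.stream:<10}  {event.text}")
```

Reading the merged record in time order places each remark next to the command and system state that prompted it, which is what makes classes of user problem visible.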

Comment 18

HCI has never given the concept of ‘classes of user problem’ the attention that it deserves. Clearly, HCI has a need for generality (see Comment 10, concerning (systematic) principles with their implications of generalisation). Of course, generalising over user problems is critical; but so more comprehensively is generalising over ‘design problems’. The latter might express the ineffectiveness of users interacting with computers to perform tasks (or somesuch). The original paper does not really say much about generalisation – its conceptualisation; operationalisation; test; and – taken together – validation. Any republication would have to rise to this challenge.

Comment 19

The concept of ‘cause’ here is redolent of science, for example, as in Psychology. See also Comment 18, as concerns phenomena and Comment 15 for a contrast with engineering. Science and engineering are very different disciplines. Any re-publication would have to address this difference and to locate the multiple models and their application with respect to it.

There is thus a close interplay between these field studies, the generation of working hypotheses and the development of the conceptual frameworks. We give some extracts from this study in a later section.

Comment 20

This claim would hold for both a scientific (or applied scientific) and an engineering endeavour. See also Comments 15 and 18 earlier. However, both would be required to align themselves with Figures 1 and 2 of the original paper.

A third type of empirical tool is used to test specific predictions of the working hypothesis.

Comment 21

The testing of predictions (which, in conjunction with the explanation of phenomena, constitutes understanding) suggests the notion of science (see Comments 18 and 19). Science can be contrasted with engineering, that is, the prescription of design solutions (which, in conjunction with the diagnosis of design problems, constitutes the design of artefacts) (see Comment 15). The difference concerning the purpose of multiple models needs clarification.

The tool is a multi-level interactive system which enables the experimenter to simulate a variety of user interfaces, and is capable of modeling and testing a wide range of variables [5]. It is based on a code-breaking task in which users perform a variety of string-manipulation and editing functions on coded messages.

It allows the systematic evaluation of notational, semantic and syntactic variables. Among the results to be extensively reported elsewhere is that if there is a common argument in a set of commands, each of which takes two arguments, then the common argument must come first for greatest ease of use. Consistency of argument order is not enough: when the common argument consistently comes second no advantage is obtained relative to inconsistent ordering of arguments [6].

Comment 22

The 2-argument example is persuasive on the face of it; but is it a ‘principle’ (see Comment 10) and might it appear in the ‘output to designers’ (Figure 1 and Comment 15(4))? If so, how is its domain independence established? This point raises again the issue of generalisation – see also Comment 17.

Conceptual Tools

Since we conceive the problem as a cognitive one, the tools are from the cognitive sciences.

Comment 23

The claim is in no way controversial. However, it raises the question of whether the interplay between these cognitive tools and the working hypotheses (see Figure 1) also contributes to Cognitive Science (that is, Psychology). See also Comment 15(3). Such a contribution would be in addition to the ‘output to designers’ of Figure 1.

Also we define the problem as one with those users who would be considered intellectually and motivationally qualified by any normal standards. Thus we do not admit as a potential solution that of finding “better” personnel, or simply paying them more, even if such a solution were practicable.

Comment 24

If ‘design problem’ replaced ‘user problem’ (see also Comment 18), then better personnel and/or better pay might indeed contribute to the design (solution) of the design problem. The two types of problem, that is, design problem and user problem, need to be related and grounded in the Framework. The latter, for example, might be conceptualised as a sub-set of the former. Either way, additional conceptualisation of the Framework is required. See also Comment 18.

The cognitive incompatibility we describe is qualitative not quantitative and the mismatch we are looking for is one between the user’s concept of the system structure and the real structure: between the way the data base is organised in the machine and the way it is organised in the head of the user: the way in which system details are usually encountered by the user and his preferred mode of learning.

The interaction of human and computer in a problem-solving environment is a complex matter and we cannot find sufficient theory in the psychological literature to support our intuitive needs. We have found it necessary to produce our own theories, drawing mainly on the spirit rather than the substance of established work.

Comment 25

It sounds like our ‘own’ theories are indeed psychological theories (or would be if constructed). See also Comments 21 and 23.

Further than this, it is apparent that the problem is too complex for us to be able to use a single theoretical representation.

Comment 26

Decomposition (as in multiple models) is a well-tried and trusted solution to complexity. However, re-integration will be necessary at some stage and for some purpose. Understanding (Psychology) and design of artefacts (HCI) would be two such (different) purposes. They need to be distinguished. See also Comment 15(5).

The model should not only be appropriate for design, it should also give a means of characterising errors – so as to understand their origins and enable corrective measures to be taken.

Comment 27

What characterises a ‘model appropriate for design’? (See also Comment 15(4) and (5).) Design would have to be conceptualised for this purpose. Features might be derived from field studies of designer practice (see Figure 1); but a conceptualisation would not be ‘given’. It would have to be constructed (in the manner of the models). This construction would be a non-trivial undertaking. But how else could models be assured to be fit-for-(design)purpose? See also Comment 14.

Take the following protocol.

The user is asked to find the average age of entries in the block called PEOPLE.

“I’ll have a go and see what happens” types: *T <-AVG(AGE,PEOPLE)

machine response: AGE – UNSET BLOCK

“Yes, wrong, we have an unset block. So it’s reading AGE as a block, so if we try AGE and PEOPLE the other way round maybe that’ll work.”

This is very easy to diagnose and correct. The natural language way of talking about the target of the operation is mapped straight into the argument order. The cure would be to reverse the argument order for the function AVG to make it compatible.
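The diagnosis can be illustrated with a minimal Python sketch. It is an illustration only: the dictionary `BLOCKS`, the sample ages and both function names are assumptions made here and are not part of the system studied. Only the command form and the error message are taken from the protocol, and the block-first order of the machine's AVG is inferred from the user's own reading of that message.

```python
# Hypothetical stand-in for the data base: one block, PEOPLE, with an AGE attribute.
# The ages are invented sample values.
BLOCKS = {"PEOPLE": {"AGE": [30, 40, 50]}}


def avg_machine_order(block, attribute):
    """Assumed machine order: the block comes first, then the attribute."""
    if block not in BLOCKS:
        # Mirrors the machine response quoted in the protocol.
        return f"{block} – UNSET BLOCK"
    values = BLOCKS[block][attribute]
    return sum(values) / len(values)


def avg_natural_order(attribute, block):
    """Proposed cure: attribute first, as in 'the average AGE of PEOPLE'."""
    return avg_machine_order(block, attribute)


# The user's attempt, typed in natural-language order, fails on the machine order...
print(avg_machine_order("AGE", "PEOPLE"))
# ...but succeeds once the argument order is reversed to match it.
print(avg_natural_order("AGE", "PEOPLE"))
```

The same two words succeed or fail depending only on which order the function expects, which is the sense in which the reversed order is 'compatible'.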

Comment 28

Natural language here is used both to diagnose ‘user problems’ and to propose solutions to those problems. Natural language, however, does not appear in the paper as a model, as such. Its extensive nature in psychology/linguistics would prohibit such inclusion. Further, there are many theories of natural language and no agreement as to their state of validation (or rejection). However, the model appears as a block in the BIM (see Figure 2). The model/representation, of course, might be intuitive, in the form and practice of lay-language, which we all possess. However, such intuitions would also be available to software engineers and would not distinguish systematic from non-systematic principles (see Comment 10). The issue would need to be addressed in any re-publication of the ’79 paper.

The next protocol is more obscure. The task is the same as in the preceding one.

“We can ask it (the computer) to bring to the terminal the average value of this attribute.”

types: *T <-AVG(AGE)

machine response: AVG(AGE) – ILLEGAL NAME

“And it’s still illegal ... (…) I’ve got to specify the block as well as the attribute name.”

Well of course you have to specify the block. How else is the machine going to know what you’re talking about? A very natural I.M. response. How can we be responsible for feeble memories like this?

However, a more careful diagnosis reveals that the block PEOPLE is the topic of the ‘conversation’ in any case.

Comment 29

Is ‘topic of conversation’, as used here, an intuition, derived from lay-language, or a sub-set of some natural language theory, derived from Psychology/Linguistics? This is a good example of the issue raised by Comment 28. The same question could be asked of the use of ‘natural language conventions’, which follows next.

The block has just been used and the natural language conventions are quite clear on the point.

We have similar evidence for the importance of human-machine discourse structures from the experiment using the code-breaking task described above. Command strings seem to be more ‘cognitively compatible’ when the subject of discourse (the common argument) is placed before the variable argument. This is perhaps analogous to the predisposition in sentence expression for stating information which is known or assumed before information which is new [7]. We are currently investigating this influence of natural language on command string compatibility in more detail.

Comment 30

These natural language interpretations and the associated argumentation remain both attractive and plausible. However, command languages in general (with the exception of programmers) have fallen out of favour. Given the concept of the domain of application/tasks and the requirements of the Goal Structure Model, some addition to the natural language model would likely be required for any re-publication of the ’79 paper. Some relevance-related, plan-based speech act theory might commend itself in this case.

The Block Interaction Model

Comment 31

The BIM remains a very interesting and challenging model and was (and remains) ahead of its time. For example, the very inclusion of the concept of domain (as a hospital; jobs in an employment agency etc); but, in addition, the associated representations of the user, the computer and the workbase. Thirty-four years later, HCI researchers are still ‘trying to pick the bits/blocks out of that’ in complex domains such as air traffic and emergency services management. Further development of the BIM in the form of more completely modeled exemplars would be required by any republished paper.

Systematic evidence from empirical studies, together with experience of our own, has led us to develop a conceptual analysis of the information in the head of the user (see figure 2). Our aim with one form of analysis is to identify as many separable kinds of knowledge as possible and chart their actual or potential interactions with one another. Our convention here is to use a block diagram with arrows indicating potential forms of interference. This diagram enables us to classify and thus group examples of interference so that they could be counteracted in a coordinated fashion rather than piecemeal. It also enables us to establish a framework within which to recognise the origin of problems which we haven’t seen before. Figure 2 is a simplified form of this model. The blocks with double boundaries, connected by double lines, indicate the blocks of information used by the ideal user. The other lines indicate prime classes of interference.

[Figure 2: block labels recoverable from the original include ‘Representation of domain’, ‘Representation of work-base version of domain’ and ‘Representation of problem’.]

The terminology we have used is fairly straightforward:

Domain – the range of the specific application of a system. This could be a hospital, a city’s buildings, a set of knowledge such as jobs in an employment agency.

Objects – the elements in the particular data base. They could be a relational table, patients’ records.

Operations – the computer routines which manipulate the objects.

Labels – the letter sequences which activate operators which, together with arguments and syntax, constitute the commands.

Work base – in general, people using computer systems for problem solving have had experience of working in a non-computerised work environment either preceding the computerisation or at least in parallel with the computer system. The representation of this experience we call the work-base version. There will be overlap between this and the user’s representation of the computer’s version of the domain; but there will be differences as well, and these differences we would count as potential sources of interference. There may be differences in the underlying structure of the data in the two cases, for example, and there will certainly be differences in the objects used. Thus a user found to be indulging in complex checking procedures after using the command FILE turned out to be perplexed that the material filed was still present on the screen. With pieces of paper, things which are filed actually go there rather than being copied.

Here are some examples of interference from one of our empirical studies [4]:

Interference on the syntax from other languages. Subject inserts necessary blanks to keep the strings a fixed length.

“Now that’s Matthewson, that’s 4,7, 10 letters, so I want 4 blanks”

types: A+<:S:NAME = ‘MATTHEWSON ‘:>PEOPLE

Generalised interference

“Having learned how reasonably well to manipulate one system, I was presented with a totally different thing which takes months to learn again.”

Interference of other machine characteristics on machine view

“I’m thinking that the bottom line is the line I’m actually going to input. So I couldn’t understand why it wasn’t lit up at the bottom there, because when you’re doing it on (another system) it’s always the bottom line.”
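The idea of classifying examples by the arrow of the block diagram on which they lie can be sketched as follows. This is a hedged illustration, not the authors' notation: the tuple layout, the function name and the short paraphrases of the protocols are assumptions introduced here, and the block names are simply taken from the wording of the examples above.

```python
from collections import defaultdict

# Each observation: (source of interference, affected block, paraphrased protocol).
# The paraphrases are shortened from the examples in the text.
observations = [
    ("other languages", "syntax",
     "pads MATTHEWSON with blanks to keep strings a fixed length"),
    ("other machine characteristics", "machine view",
     "expects the input line to be the bottom line of the screen"),
    ("work-base version of domain", "computer version of domain",
     "expects filed material to disappear from the screen after FILE"),
]


def group_by_interference(obs):
    """Group protocol examples by the (source, affected block) pair they illustrate."""
    groups = defaultdict(list)
    for source, target, note in obs:
        groups[(source, target)].append(note)
    return dict(groups)


for (source, target), notes in group_by_interference(observations).items():
    print(f"{source} -> {target}: {len(notes)} example(s)")
```

Grouping in this way is what would allow examples on the same arrow to be counteracted together rather than piecemeal.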

Comment 32

These examples do not do justice to the BIM – see Comment 31. More complete and complex illustrations are required.

The B.I.M. can be used in two ways. We have illustrated its utility in pinpointing the kinds of interference which can occur from inappropriate kinds of information. We could look at the interactions in just the opposite way and seek ways of maximising the benefits of overlap. This is, of course, the essence of ‘cognitive compatibility’ which we have already mentioned. Trivially, the closer the computer version of the domain maps onto the user’s own version of the domain the better. What is less obvious is that any deviations should be systematic where possible.

Comment 33

In complex domains (see Comment 31), the user’s own model is almost always implicit. Modeling that representation is itself non-trivial. A re-published paper would have to make at least a good stab at it.

In the same way, it is pointless to design half the commands so that they are compatible with the natural language equivalents and use this as a training point if the other half, for no clear reason, deviate from the principle. If there are deviations then they should form a natural sub-class or the compatibility of the other commands will be wasted.

Information Structures

In the block interaction model we leave the blocks ill-defined as far as their content is concerned. Note that we have used individual examples for user protocols as well as general principles in justifying and expanding upon the distinctions we find necessary. What we fail to do in the B.I.M. is to characterise the sum of knowledge which an individual user carries around with him or brings to bear upon the interaction. We have a clear idea of cognitive compatibility at the level of an individual. If this idea is to pay then these structures must be more detailed.

There is no single way of talking about information structures. At one extreme there is the picture of the user’s knowledge as it apparently reveals itself in the interaction; the view, as it were, that the terminal has of its interlocutor. From this point of view the motivation for any key press is irrelevant. This is clearly a gross oversimplification.

The next stage can be achieved by means of a protocol. In it we would wish to separate out those actions which spring from the user’s concept of the machine and those actions which were a result of him being forced to do something to keep the interaction going. This we call ‘heuristic behaviour’. This can take the form of guessing that the piece of information which is missing will be consistent with some other system or machine. “If in doubt, assume that it is Fortran” would be a good example of this. The user can also attempt to generalise from aspects of the current system he knows about. One example from our study was where the machine apparently failed to provide what the user expected. In fact it had, but the information was not what he had expected. The system was ready for another command but the user thought it was in some kind of a pending state, waiting with the information he wanted. In certain other stages – in particular where a command has produced a result which fills up the screen – he had to press the ENTER key – in this case to clear the screen. The user then over-generalised from this to the new situation and pressed the ENTER key again, remarking

“Try pressing ENTER again and see what happens.”

We would not want to count the user’s behaviour in this sequence as representing his knowledge of the system – either correct knowledge or incorrect knowledge. He had to do something and couldn’t think of anything else. When the heuristic behaviour is eliminated we are left with a set of information relevant to the interaction. Compared with the full, ideal set of such information, this set will be deficient at the points at which the user had to trust to heuristic behaviour.

Comment 34

The concept of ‘heuristic behaviour’ has never received the attention that it deserves in HCI research, although it must be recognised that much user interactive behaviour is of this kind. The proliferation of new interactive technologies (see Comment 3) is likely to increase this type of behaviour by users attempting to generalise across technologies. A re-published paper would have to relate better the dimension of heuristic to that of correctness, both with respect to user knowledge and user behaviour.

Note that it will also contain incorrect information as well as correct information; all of it would be categorised by the user as what he knew, if not all with complete confidence, certainly with more confidence than his heuristic behaviour. The thing which is missing from B.I.M. and I.S. is any notion of the dynamics of the interaction. We find we need three additional notations at the moment to do this. One of these describes the planning activity of the user, one charts the changes in state of user and machine and one looks at the general cognitive processes which are mobilised.

Comment 35

The list of models required, in addition to the B.I.M. and the I.S. is comprehensive – planning, user-machine state changes, and cognitive processes. However, it might be argued that yet another model is required – one which maps the changes of the domain as a function of the user-computer interactive behaviours. The domain can be modeled as object-attribute-state (or value) changes, resulting from user-computer behaviours, supported respectively by user-computer structures. Such models currently exist and could be exploited by any re-published paper.

Goal Structure Model

The user does some preparatory work before he presses a key. He must formulate some kind of plan, however rudimentary. This plan can be represented, at least partially, as a hierarchical organisation. At the top might be goals such as “Solve problem p” and at the bottom “Get the computer to display Table T”. The Goal Structure model will show the relationships among the goals.
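A hierarchical plan of this kind can be sketched as a small tree. The sketch below is an assumption-laden illustration rather than the authors' representation: the `Goal` class, the helper functions and the intermediate goal are hypothetical, and only the top and bottom goals are quoted from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Goal:
    """One node of a goal hierarchy: a name plus zero or more subgoals."""
    name: str
    subgoals: List["Goal"] = field(default_factory=list)


def leaves(goal):
    """Return the names of the bottom-level goals, left to right."""
    if not goal.subgoals:
        return [goal.name]
    result = []
    for sub in goal.subgoals:
        result.extend(leaves(sub))
    return result


def outline(goal, depth=0):
    """Render the hierarchy as indented lines, two spaces per level."""
    lines = ["  " * depth + goal.name]
    for sub in goal.subgoals:
        lines.extend(outline(sub, depth + 1))
    return lines


# Top and bottom goals are from the text; the middle goal is invented for illustration.
plan = Goal("Solve problem p", [
    Goal("Find the relevant data (hypothetical intermediate goal)", [
        Goal("Get the computer to display Table T"),
    ]),
])

print("\n".join(outline(plan)))
```

A second tree built in the same way for the structure imposed by the computer would allow the two hierarchies to be compared node by node.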

Comment 36

The G.S.M. is a requirement for designing human-computer interactions. However, it needs to be related in turn to the domain model (see Comments 31, 32 and 33). In the example, the document in the G.S.M. is transformed by the interactive user-computer behaviours from ‘unedited’ to ‘edited’. Any hierarchy in the G.S.M. must take account of any other type of hierarchy, for example, ‘natural’, represented in the domain model (see also Comment 35). The whole issue of so-called situated plans a la Suchman would have to be addressed and seriously re-assessed (see also Comment 37).

This can be compared with the way of structuring the task imposed by the computer. For example, a user’s concept of editing might lead to the goal structure:

Comment 37

HCI research has never recovered from losing the baby with the bath-water, following Suchman’s proposals concerning so-called ‘situated actions’. Using the G.S.M., a republished paper could bring some much needed order to the concepts of planning. Even the simple examples provided here make clear that such ordering is possible.

Two problems would arise here. Firstly the new file has to be opened at an ‘unnatural’ place. Secondly the acceptance of the edited text changes from being a part of the editing process to being a part of the filing process.

The goal structure model, then, gives us a way of describing such structural aspects of the user’s performance and the machine’s requirements. Note that such goals might be created in advance or at the time a node is evaluated. Thus the relationship of the GSM to real time is not simple.

The technique for determining the goal structure may be as simple as asking the user “What are you trying to do right now and why?” This may be sufficient to reveal procedures which are inappropriate for the program being used.

Comment 38

Complex domain models, for example, of air traffic management and control would require more sophisticated elicitation procedures than simple user questioning. User knowledge, supporting highly skilled and complex tasks is notoriously difficult to pin down, given its implicit nature. So-called ‘domain experts’ would be a possible substitute; but that approach raises problems of its own (for example, when experts disagree). A re-published paper would at least have to recognise this problem.

State Transition Model

In the course of an interaction with a system a number of changes take place in the state of the machine. At the same time the user’s perception of the machine state is changing. It will happen that the user misjudges the effect of one command and thereafter enters others which from an outside point of view seem almost random. Our point is, as before, that the interaction can only be understood from the point of view of the user.

Comment 39

The S.T.M. needs in turn to be related to the domain model (See Comments 31 and 35). These required linkings raise the whole issue of multiple-model re-integration (see also Comment 26).

This brings us to the third of the dynamic aspects of the interaction: the progress of the user as he learns about the system.

Comment 40

As with the case of ‘heuristic behaviour’, HCI research has never treated seriously enough the issue of ‘user learning’. Most experiments record only initial engagement with an application or at least limited exposure. Observational studies sometimes do better. We are right to claim that users learn (and attempt to generalise). Designers, of course, are doing the same thing, which results in (at least) two moving targets. Given our emphasis on ‘cognitive mismatch’ and the associated concept of ‘interference’, we need to be able to address the issue of user learning in a convincing manner, at least for the purposes in hand.

Let us explore some ways of representing such changes. Take first of all the state of the computer. This change is a result of user actions and can thus be represented as a sequence of Machine States (M.S.) consequent on user action.

If the interaction is error free, changes in the representations would follow changes in the machine states in a homologous manner. Errors will occur if the actual machine state does not match its representation.

Comment 41

At some stage and for some purpose, the S.T.M. surely needs to be related to the G.S.M. (and/or the domain model). Such a relationship would raise a number of issues, for example, ‘errors’ (see Comment 9) and the need to integrate multiple-models (see also Comments 26 and 39).

We will now look at errors made by a user of an interactive data enquiry system. We will see errors which reveal inadequate knowledge either of the particular machine state or of the actions governing transitions between states. The relevant components of the machine are the information on the terminal display and the state of a flag shown at the bottom right hand corner of the display which informs the user of some aspects of the machine state (ENTER … or OUTPUT …). In addition there is a prompt, “?”, which indicates that the keyboard is free to be used, and there is a key labelled ENTER. In the particular example the user wishes to list the blocks of data he has in his workspace. The required sequence of machine states and actions is:

The machine echoes the command and waits with OUTPUT flag showing.

User: “Nothing happening. We’ve got an OUTPUT there in the corner I don’t know what that means.”

The user had no knowledge of MS2: we can hypothesise his representation of the transition to be:

This is the result of an overgeneralisation. Commands are obeyed immediately if the result is short, unless the result is block data of any size. The point of this is that the data may otherwise wipe everything from the screen. With block data the controlling program has no lookahead to check the size and must itself simply demand the block, putting itself in the hands of some other controlling program. We see here then a case where the user needs to have some fairly detailed and otherwise irrelevant information about the workings of the system in order to make sense of (as opposed to learn by rote) a particular restriction.

The user was told how to proceed, types ENTER, and the list of blocks is displayed together with the next prompt. However, further difficulties arise because the list of blocks includes only one name and the user was expecting a longer listing. Consequently he misconstrues the state of the machine. (continuing from previous example)

User types ENTER

Machine replies with block list and prompt.

Flag set to ENTER …

“Ah, good, so we must have got it right then.

A question mark: (the prompt). It doesn’t give me a listing. Try pressing ENTER again and see what happens.”

User types ENTER

“No? Ah, I see. Is that one absolute block, is that the only blocks there are in the workspace?”

This interaction indicates that the user has derived a general rule for the interaction:

“If in doubt press ENTER”

After this the user realises that there was only one name in the list. Unfortunately his second press of the ENTER key has put the machine into Edit mode and the user thinks he is in command mode. As would be expected the results are strange.

At this stage we can show the machine state transitions and the user’s representation together in a single diagram, figure 3.

This might not be elegant but it captures a lot of features of the interaction which might otherwise be missed.
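The pairing of machine states with the user's representation can also be sketched in Python. This is a rough sketch under stated assumptions: the state names, the label for the listing command and the `believed` sequence are chosen here for illustration. The belief at the first step is a hypothesis (that the user expected the result immediately); the other two beliefs follow the text, which reports that the user thought the machine was in a pending state and later thought he was in command mode.

```python
# Machine transitions as described in the text; the state names are labels chosen here.
MACHINE = {
    ("COMMAND", "list-blocks command"): "OUTPUT_PENDING",  # command echoed, OUTPUT flag shown
    ("OUTPUT_PENDING", "ENTER"): "COMMAND",                # block list and prompt displayed
    ("COMMAND", "ENTER"): "EDIT",                          # second ENTER puts machine in Edit mode
}


def run_machine(start, actions):
    """Return the machine state reached after each user action."""
    states, state = [], start
    for action in actions:
        state = MACHINE[(state, action)]
        states.append(state)
    return states


def mismatches(actions, machine_states, believed_states):
    """List the steps at which the user's representation differs from the machine state."""
    return [
        (action, actual, belief)
        for action, actual, belief in zip(actions, machine_states, believed_states)
        if actual != belief
    ]


actions = ["list-blocks command", "ENTER", "ENTER"]
believed = ["COMMAND", "OUTPUT_PENDING", "COMMAND"]  # user's representation (partly hypothesised)
actual = run_machine("COMMAND", actions)

for action, machine_state, belief in mismatches(actions, actual, believed):
    print(f"after {action!r}: machine is {machine_state}, user believes {belief}")
```

On these assumptions the representation diverges from the machine state at every step, which is consistent with the 'strange' results reported once the user is unknowingly in Edit mode.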

Comment 42

The S.T.M. includes ‘machine states’ and the user’s representation thereof. Differences between the two are likely to identify both errors and cognitive mismatches. However, the consequences – effective or ineffective interactions and domain transformations – are not represented; but need to be related to the G.S.M. (and the domain model). This raises, yet again, the issue of the relations between multiple-models required in the design process (see Figure 1 and Comments 26 and 39).

The final model we use calls upon models currently available in cognitive psychology which deal with the dynamics of word recognition and production, language analysis and information storage and retrieval. The use of this model is too complex for us to attempt a summary here.

Comment 43

Address of the I.P.M. is noticeable only by its intended absence. This may have been an appropriate move at the time. However, any re-published paper would have to take the matter further. In so doing, at least the following issues would need to be addressed:

1. The selection of appropriate Psychology/Language I.P.M.s, of which there are very many, all in different states of development and validation (or rejection). (Note Card et al’s synthesis and simplification of such a model in the form of the HIP – see Comment 15).

2. The relation of the I.P.M. to all other models (see Comments 26, 35, 41 and 42).

3. The need to tailor any I.P.M. to the particular domain of concern to any application, for example, air traffic management (see Comments 6 and 39).

4. The level of description of the I.P.M. See also 1. above.

5. The use of any I.P.M. by designers (see Figure 1).

6. The ‘guarantee’ that Psychology brings to such models in the case of their use in design and the nature of its validation.

Conclusion

We have stressed the shortcomings of what we have called the Industrial Model and have indicated that the new user will deviate considerably from this model. In its place we have suggested an alternative approach involving both empirical evaluations of system use and the systematic development of conceptual analyses appropriate to the domain of person-system interaction. There are, of course, aspects of the I.M. which we have no reason to disagree with, for example, the idea that the computer can beneficially transform the users view of the problems with which he is occupied. However, we would appreciate it if someone would take the trouble to support this point with clear documentation. So far as we can see it is simply asserted.

Comment 44

In the 34 years, following publication of our original paper, numerous industry practitioners, trained in HCI models and methods, would claim to have produced ‘clear documentation’, showing that the ‘computer can beneficially transform the user’s view of the problems with which he is occupied’. This raises the whole (and vexed) question of how HCI has moved on since 1979, both in terms of the number and effectiveness of trained/educated HCI practitioners. HCI community progress, clear to everyone, needs to be contrasted with HCI discipline progress, unclear to some.

Finally we would like to stress that nothing we have said is meant to be a solution – other than the methods. We do not take sides for example, on the debate as to whether or not interactions should be in natural language – for we think the question itself is a gross oversimplification. What we do know is that natural language interferes with the interaction and that we need to understand the nature of this interference and to discover principled ways of avoiding it.

Comment 45

Natural language understanding and interference smack of science. Principled ways of avoiding interference smack of engineering. What is the relationship between the two? What is the rationale for the relationship? What is the added-value to design (see also Comment 15)?

And what we know above all is that the new user is most emphatically not made in the image of the designer.

Comment 46

The original paper essentially conceptualises and illustrates the need for the proposed ‘Framework for HCI’. That was evil, sufficient unto the day thereof. However, what it lacks thirty-four years later is any exemplars, for example, following Kuhn’s requirement for knowledge development and validation. The exemplars would be needed for any re-publication of the paper and would require the complete, coherent and fit-for-purpose – operationalisation, test and generalisation of the Framework, as set out in Figure 1. A busy time for someone……

References

[1 ] du Boulay, B. and O’Shea, T. Seeing the works: a strategy of teaching interactive programming. Paper presented at Workshop on ‘Computing Skills and Adaptive Systems’, Liverpool, March 1978.

[2] Hammond, N.V., Long, J.B. and Clark, l.A. Introducing the interactive computer at work: the users’ views. Paper presented at Workshop on ‘Computing Skills and Adaptive Systems’, Liverpool, March 1978.

[3] Wilson. T. Choosing social factors which should determine telecommunications hardware design and implementation. Paper presented at Eighth International Symposium on Human Factors in Telecommunications, Cambridge, September 1977.

[4] Documenting human-computer mismatch with the occasional interactive user. APU/IBM project report no. 3, MRC Applied Psychology Unit, Cambridge, September 1978.

[5] Hammond, N.V. and Barnard, P.J. An interactive test vehicle for the investigation of man-computer interaction. Paper presented at BPS Mathematical and Statistical Section Meeting on ‘Laboratory Work Achievable only by Using a Computer’, London, September 1978.

[6] An interactive test vehicle for the study of man-computer interaction. APU/IBM project report no. 1, MRC Applied Psychology Unit, Cambridge, September 1978.

[7] Halliday, M.A.K. Notes on transitivity and theme in English. Part 1. Journal of Linguistics, 1967, 3, 199-244.

FIGURE 3: STATE TRANSITION EXAMPLE

Expressing the Effectiveness of Planning Horizons

P. TIMMER and J LONG

Ergonomics & HCI Unit, University College London, 26 Bedford Way, London, WC1H 0AP

ABSTRACT

IV. Model of a Planning Horizon

Figure 4. Domain model performance data for LOG

VI. REFERENCES

Planning for Multiple Task Work – an Analysis of a Medical Reception Worksystem

Becky Hill, John Long, Walter Smith and Andy Whitefield

HCI is more than the Usability of Web Pages: a Domain Approach

John Long

Abstract

1. Introduction

1.1 Usability and HCI

1.2 Web Pages and HCI

1.3 Usability, Web Pages and HCI

2. General Domain Approach to HCI

2.1 Limited View of HCI

2.2 Particular Domain Approach to HCI

3. Domain Approach Illustrations

3.1 Implicit Domain Model Illustrations

3.1.1 Domain of Medical Reception

3.1.2 Domain of Military Command and Control

3.1.3 Domain of Domestic Energy Management

3.2 Explicit Domain Model Illustrations

3.2.1 Domain of Amphibious Landing Off-load Planning

3.2.2 Domain of Emergency Management

3.2.3 Domain of Air Traffic Management

Acknowledgements:

4. References

Towards a Conception of HCI Engineering Design

S. Cummaford and J. Long

Validating Effective Design Knowledge for Re-Use: HCI Engineering Design Principles

Stephen Cummaford

INTRODUCTION

ENGINEERING DESIGN PRINCIPLES

EDP components

CLASSES OF DESIGN PROBLEM

FUTURE RESEARCH

ACKNOWLEDGEMENTS

REFERENCES

[1] Cummaford and Long (1998) Towards a conception of HCI engineering design principles, in T.R.G. Green, L. Bannon, C.P. Warren & J. Buckley (eds.) Proceedings of ECCE-9, the Ninth European Conference on Cognitive Ergonomics. EACE, pp. 79-84.

Solving class design problems: towards developing Engineering Design Principles

Steve Cummaford and John Long

Ergonomics & HCI Unit, University College London, 26 Bedford Way, London WC1H 0AP

ABSTRACT

KEYWORDS

RESEARCH NEED

DEVELOPMENT STRATEGY

SELECTION OF PROMISING CLASS

PROCESS

1: Specify SDPs

Requirement

Instantiation

Domain model

Product goal

Task-goal structure

User model

Computer model

Performance

Total physical

3: Evaluate CDP

Requirement

4: Specify CDS

Requirement

5: Specify SDSs

Requirement

Instantiation

6: Evaluate CDS

Requirement

Instantiation

Total physical

Review

DISCUSSION

Specifying SDPs

Specifying the CDP

Evaluating the CDP-SDP relations

Specifying the CDS