Annotated bibliography on the psychology of programming

Tim Mattson

Introduction

This bibliography holds references that pertain to the psychology of programming. Since the topic is closely related, a number of these references also refer to the psychological issues associated with program comprehension. To this end, I used (and freely borrowed from) the program comprehension bibliography [Upchurch].

A

[Adelson85] B. Adelson and E. Soloway, The role of domain experience in software design. IEEE Transactions on Software Engineering, 11(11), 1351-1360, 1985.

[Adelson85a] B. Adelson, D. Littman, K. Ehrlich, J. Black, and E. Soloway, "Novice-expert differences in software design", in B. Shackel (Ed.), Human-Computer Interaction-INTERACT '84, North-Holland, 1985.

Presents a four-component model of the software design process: (1) a "design meta-script", a high-level schematic representation that drives the design process by setting goals for processing the sketchy model; (2) the "sketchy model", the current solution state, which becomes progressively less sketchy (i.e. more concrete and elaborate) until it reaches the implementation-level representation; (3) the "current long-term memory set", consisting of all the known solutions appropriate to the aspects of the design currently being worked on; and (4) the "demons", which monitor the state of the sketchy model, activating "things to remember" concerning elements to elaborate and modify in the sketchy model as progress is made toward the final concrete design.

[Allwood86] C.M. Allwood, "Novices on the computer: a review of the literature", International Journal of Man-Machine Studies, vol. 25, pp. 633-658, 1986.

[Anderson87] J.R. Anderson, "Methodologies for studying Human Knowledge". Behavioral and Brain Sciences, vol. 10 #3, pp. 467-503, 1987.

B

[Bartlett32] F.C. Bartlett. Remembering: A study in Experimental and Social Psychology, Cambridge University Press, 1932.

[Bentley86] J. Bentley and D.E. Knuth, "Literate Programming", Communications of the ACM, vol. 29, #5, pp. 364-369, 1986. Also see the follow-on article by Bentley, Knuth and McIlroy in Comm. ACM vol. 29, #6, pp. 471-483 (June 1986).

A nice description of Knuth's WEB system for literate programming. The idea is to write programs as essay-like texts optimized for reading by humans. A single source file interleaves ordinary English text, marked up with TeX, and programming language statements (Pascal). Two tools process this file: WEAVE produces a typeset, human-readable program with commentary, and TANGLE produces source code that is unreadable to humans but suitable for a compiler.
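As a rough illustration of the TANGLE step, here is a minimal sketch in Python. It assumes a toy, noweb-like chunk notation ("<<name>>=" opens a code chunk, "@" closes it) rather than Knuth's actual WEB syntax, and the names parse_chunks, tangle and LITERATE_SOURCE are hypothetical, introduced only for this example. It shows the core idea: code chunks are written in the order that suits the explanation, then reassembled into the order the compiler needs.

```python
import re

# Hypothetical toy notation (noweb-like, NOT Knuth's WEB syntax):
#   <<name>>=   starts a code chunk
#   @           ends it; anything outside a chunk is prose
CHUNK_DEF = re.compile(r"^<<(.+)>>=$")
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>$")


def parse_chunks(source):
    """Collect code chunks by name, skipping the surrounding prose."""
    chunks = {}
    current = None
    for line in source.splitlines():
        match = CHUNK_DEF.match(line)
        if match:
            current = match.group(1)
            chunks.setdefault(current, [])
        elif line == "@":
            current = None
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand a chunk, recursively replacing references with their bodies."""
    lines = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            # Nested chunks inherit the indentation of the referencing line.
            lines.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            lines.append(indent + line)
    return lines


LITERATE_SOURCE = """\
The program prints a greeting. Its overall shape is:
<<main>>=
def main():
    <<say hello>>
main()
@
The greeting itself is a single call.
<<say hello>>=
print("hello")
@
"""

chunks = parse_chunks(LITERATE_SOURCE)
tangled = "\n".join(tangle(chunks, "main"))
```

A WEAVE-like step would do the opposite: keep the prose and typeset the chunks in the order written, which is the form intended for human readers.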

[Bertholf] C. F. Bertholf and J. Scholtz, "Program Comprehension of Literate Programs by Novice Programmers". In C. R. Cook, J. C. Scholtz, and J. C. Spohrer (eds.), Empirical Studies of Programmers: Fifth Workshop. Norwood, NJ: Ablex Publishing. p. 222, 1993.

Abstract: This study compares comprehension of Lit style literate programs with that of traditional modular programs with both internal and external documentation. Literate programming [Knuth84] enhances a computer program by incorporating program text into a comprehensive design document. Although not previously well defined, we believe Knuth's concept has great intuitive appeal, fits in well with a multi-disciplinary approach to automating portions of the software engineering process, and can be adapted easily to the incorporation of empirically derived principles of program comprehension. The Lit system developed by Chris Bertholf employs many of Knuth's principles for literate style programs as well as several others; the program text is incorporated into a comprehensive design document which uses typographic cues and a book style presentation paradigm. A program description and information about design history, the task domain, and implementation are included in the program document. The table of contents provides information about the overall structure of the program. In addition, algorithms are documented in pseudo-code and documentation of anticipated modifications is included. Extensive documentation of the usage of variables, procedures, and functions is also included. Does this increased amount of documentation and the unique presentation format hinder or facilitate program comprehension? This study compared the comprehension results of 20 novice programmers randomly divided into two groups and given either a traditional modular FORTRAN program or an equivalent Lit style literate program to modify. Subjects performed the task of completing an incomplete program; all program modifications were made on paper, thus syntax errors were expected. The elapsed time to produce a solution was recorded, and several measures of comprehension were collected and analyzed. 
Completed programs were judged as completely correct, functionally correct with syntax errors, or incorrect. The overall result was that subjects given the literate programs found a solution more often than did subjects using the traditional modular programs. None of the subjects given the modular programs were able to produce even functionally correct solutions. In addition, none of the subjects given Lit style literate programs modified sections of code that were unrelated to the modification specification while all of the subjects given traditional modular programs modified sections of code which were unrelated to the modification specification. Similar results have also been obtained with advanced programmers in another related study. Although this study did not attempt to isolate the factors which aided in comprehension, it did show that the Lit style programs are useful for program maintenance tasks. Future research in this area should concentrate on isolating the factors that produced such a marked distinction in performance between the Lit style literate program group and the traditional program group.

[Bisantz94] A. M. Bisantz, K. J. Vicente, "Making the abstraction hierarchy concrete", Int. J. Human-Computer Studies, vol. 40, pp. 83-117, 1994.

Abstract: The abstraction hierarchy (AH) is a multileveled representation framework, consisting of physical and functional system models, which has been proposed as a useful framework for developing representations of complex work environments. Despite the fact that the AH is well known and widely cited in the cognitive engineering community, there are surprisingly few examples of its application. Accordingly, the intent of this paper is to provide a concrete example of how the AH can be applied as a knowledge representation framework. A formal instantiation of the AH as the basis for a computer program is presented in the context of a thermal-hydraulic process. This model of the system is complemented by a relatively simple reasoning mechanism which is independent of the information contained in the knowledge representation. This reasoning mechanism uses the AH model along with qualitative user input about system states to generate reasoning trajectories for different types of events and problems. Simulation output showing how the AH model can provide an effective basis for reasoning under different classes of situations, including challenging faults of various types, is presented. These detailed examples illustrate the various benefits of adopting the AH as a knowledge representation framework, namely: providing sufficient representations to allow reasoning about unanticipated fault and control situations, allowing the use of reasoning mechanisms that are independent of domain information, and having psychological relevance.
NOTE: This paper shows how to use the AH approach to represent a physical control system. The AH is used to refine the system specification and define its user interface, not to guide software development.

[Brooks78] R. Brooks, "Using a Behavioral Theory of Program Comprehension in Software Engineering". Proc. 3rd Int. Conf. on Software Eng. New York: IEEE, pp. 196-201, 1978.

Abstract: A theory is presented of how a programmer goes about understanding a program. The theory is based on a representation of knowledge about programs as a succession of knowledge domains which bridge between the problem domain and the executing program. A hypothesis and verify process is used by programmers to reconstruct these domains when they seek to understand a program.
The theory is useful in several ways in software engineering: It makes accurate predictions about the effectiveness of documentation; it can be used to systematically evaluate and critique other claims about documentation, and it may even be a useful guideline to a programmer in actually constructing documentation.

[Brooks83] R. Brooks, "Towards a theory of the comprehension of computer programs", International Journal of Man-Machine Studies, vol. 18, pp. 543-554, 1983.

Abstract: A sufficiency theory is presented of the process by which a computer programmer attempts to comprehend a program. The theory is intended to explain four sources of variation in behavior on this task: the kind of computation the program performs, the intrinsic properties of the program text, such as language and documentation, the reason for which the documentation is needed, and differences among the individuals performing the task. The starting point for the theory is an analysis of the structure of the knowledge required when a program is comprehended which views the knowledge as being organized into distinct domains which bridge between the original problem and the final program. The program comprehension process is one of reconstructing knowledge about these domains and the relationship among them. This reconstruction process is theorized to be a top-down, hypothesis driven one in which an initially vague and general hypothesis is refined and elaborated based on information extracted from the program text and other documentation.
Comments D&N: This extended treatment of Brooks's ideas about program comprehension is intended to provide an adequate descriptive model of how programmers understand programs. According to this model, when one understands a program, one has constructed a mental model of successive knowledge domains bridging from the problem domain to the domain of the program in execution. Each of these domains consists of objects, properties, relations, and operations. The succession of domains may include the problem domain, the domain of a mathematical model of the problem, the algorithm domain, the programming language domain, etc. One must also understand the relationships that exist between adjacent domains. The process of reading a program to understand it is one of constructing a model of this sort or, depending upon one's reading objectives, constructing a part of one. According to Brooks, this process is largely top-down: the reader generates hypotheses about the program, which he then attempts to verify from the code and whatever other documentation is available. This verification task is aided by beacons, code features characteristic of recurring structures or operations. Hypotheses are generated hierarchically until hypotheses can be bound to particular code segments. In this process, hypotheses are frequently revised or replaced by more credible ones.
The paper concludes with a discussion of factors affecting comprehension, including the nature of the problem, available documentation, programmer knowledge (both of programming and of the problem domain), reading goals, and reading strategy.
Although not overtly based on empirical studies, Brooks's model seems both sensible and serviceable. The model is somewhat dogmatic about programs being read top-down, however; and although the author acknowledges bottom-up strategies, he dismisses them as less powerful and less important.
This is an important, well-written paper. It is particularly concerned with the process of comprehension, however, and the reader wishing to better understand what it means to understand a program should look at [Brooks78].

[Brooks90] R. Brooks, "Categories of programming knowledge and their application", International Journal of Man-Machine Studies, vol. 33, pp. 241-246, 1990.

A brief, high-level overview of the topic. This short paper is the introduction to a special issue of the journal dedicated to this topic.

C

[Carroll97] J. M. Carroll, "Human-computer interaction: psychology as a science of design", Int. J. Human-Computer Studies, vol. 46, pp. 501-522, 1997.

Abstract: Human-computer interaction (HCI) is the area of intersection between psychology and the social sciences, on the one hand, and computer science and technology, on the other. HCI researchers analyze and design specific user interface technologies (e.g. three dimensional pointing device, interactive video). They study and improve the processes of technology development (e.g. usability evaluation, design rationale). They develop and evaluate new applications of technology (e.g. computer conferencing, software design environments). Through the past two decades, HCI has progressively integrated its scientific concerns with the engineering goal of improving the usability of computer systems and applications, thus establishing a body of technical knowledge and methodology. HCI continues to provide a challenging test domain for applying and developing psychology and social science in the context of technology development and use.
NOTE: This paper reviews the history of HCI in terms of a series of steps toward a science of design. While it briefly mentions issues related to programming environments, the focus is on user interfaces. It takes a dim view of the cognitive modeling approach and views HCI as a sociological design exercise. In other words, the author refers to a "science of design" but, by downplaying the role of models, downplays the scientific aspects of HCI. The paper provides a useful historical perspective that differs from the analysis provided by cognitive psychologists. For that reason alone, this paper is worth reading.

[Chi81] M.T.H. Chi et al., "Categorization and representation of physics problems by experts and novices", Cognitive Science, vol. 5, pp. 121-152, 1981.

[Curtis86] B. Curtis, "By the way, did anyone study any real programmers?", Proceedings of the empirical studies of programmers conference, p. 256, 1986.

Abstract: The relevance of the current empirical research on programming to the pressing problems of software development is challenged. A review of research since 1980 shows a trend toward greater methodological rigor. However, at the same time, most studies concentrate on novice programming, and fail to offer guidance in developing advanced software development environments. Several crucial questions are posed for future empirical research on programming and two exploratory studies under way in the author's laboratory are described.
Note: this is a very important paper. It builds the case for working on real as opposed to student programmers. While not a major point of the paper, it also provides a useful, short summary of cognitive psychology applied to software development.

[Curtis89] B. Curtis, "Five paradigms in the psychology of programming", in M. Helander (Ed.), Handbook of Human-Computer Interaction, 1989.

Note: OSU does not have this book.

D

[Davis90] S.P. Davies, "The nature and development of programming plans", International Journal of Man-Machine Studies, vol. 32, #4, pp. 461-481, 1990.

Abstract: The notion of the programming plan as a description of one of the main types of strategy employed in the comprehension of programs is now widely accepted to form an adequate basis for an account of programming knowledge. Such plans are thought to be used universally in all programming languages by expert programmers. Recent work, however, has questioned the psychological reality of such plans and has suggested that they may be artifacts of the particular programming language used and the structure that it imposes on the programmer via the constraints of certain features of its notation. This paper considers the results of two experimental studies that suggest that the development and use of programming plans is strongly tied to the particular learning experience of the programmer. It is argued that programming plans cannot be considered solely to be natural strategies that evolve independently of teaching nor as mere artifacts of static properties of a particular programming language. Rather, such plans can be seen to be related to the expression of design-related skills. This has a number of important implications for our understanding of the nature and development of programming plans, and in particular, it appears that the notion of the programming plan provides too limited a view to adequately and straightforwardly explain the differences between novice and expert programmer performance.

[Davis90] S.P. Davies, "Plans, Goals and Selection Rules in the Comprehension of Computer Programs", Behavior and Information Technology, vol. 9, #3, pp. 201-214, 1990.

This paper suggests that plans are not sufficient to explain novice/expert differences. We also need to consider the selection rules used to choose plans and how plans are specialized to fit a particular problem.

[Davis91] S.P. Davies, "Characterizing the program design activity: neither strictly top-down nor globally opportunistic", Behavior and Information Technology, vol. 10, pp. 173-190, 1991.

[Davies93] S.P. Davies, "Models and theories of programming strategy", Int. J. Man-Machine Studies, vol. 39, pp. 237-267, 1993.

Abstract: Much of the literature concerned with understanding the nature of programming skill has focused explicitly upon the declarative aspects of programmers' knowledge. This literature has sought to describe the nature of stereotypical programming knowledge structures and their organization. However, one major limitation of many of these knowledge-based theories is that they often fail to consider the way in which knowledge is used or applied. Another strand of literature is less well represented. This literature deals with the strategic elements of programming skill and is directed towards an analysis of the strategies commonly employed by programmers in the generation and the comprehension of programs. In this paper an attempt is made to unify various analyses of programming strategy. This paper presents a review of the literature in this area, highlighting common themes and concerns, and proposes a model of strategy development which attempts to encompass the central findings of previous research in this area. It is suggested that many studies of programming strategy are descriptive and fail to explain why strategies take the form they do or to explain the typical strategy shifts which are observed during the transitions between different levels of skill. This paper suggests that what is needed is an explanation of programming skill that integrates ideas about knowledge representation with a strategic model, enabling one to make predictions about how changes in knowledge representation might give rise to particular strategies and to the strategy changes associated with developing expertise. This paper concludes by making a number of brief suggestions about the possible nature of this model and its implications for theories of programming expertise.
Note: the point of this paper is that the nature of programming expertise includes knowledge-based and strategic components. In other words, it's not just how the know

[Davies89] Davies, S. P. (1989) Skill Levels and Strategic Differences in Plan Comprehension and Implementation in Programming, in Proceedings of the HCI'89 Conference on People and Computers V, Cognitive Ergonomics, pp. 487-502.

Abstract: A number of authors have proposed that the 'programming plan' be regarded as the major characteristic of programming expertise. Such plans are thought to represent the programmer's knowledge of generic and stereotypic fragments of programs that correspond to specific task goals or sub-goals. A range of empirical studies have been undertaken in order to provide support for the notion of the programming plan and to establish the relationship between such plans and expertise. Most of these studies, however, have been concerned with what might be characterized as a theory of plans rather than with a theory of 'planning' in programming. Such studies have tended to examine only the static elements of plans -- attempting to show merely, for example, that plans are related in some way to expertise or to the notation of a particular language, rather than providing a means of looking at the way in which plans might be combined or refined. In addition they have neglected to examine the question of whether plans are implemented differently in different circumstances or with respect to different skill levels. This paper considers the results of two experimental studies which suggest that the relationship between programming plans and expertise is by no means straightforward. This work highlights the need to examine strategic differences in plan generation and comprehension that exist at different skill levels. In conclusion some tentative requirements are proposed for a theory of 'planning' in programming.

[Davies90a] S.P. Davies, "The Nature and Development of Programming Plans." International Journal of Man-Machine Studies, vol. 32, no. 4, pp. 461-481, 1990.

Abstract: The notion of the programming plan as a description of one of the main types of strategy employed in the comprehension of programs is now widely accepted to form an adequate basis for an account of programming knowledge. Such plans are thought to be used universally in all programming languages by expert programmers. Recent work, however, has questioned the psychological reality of such plans and has suggested that they may be artifacts of the particular programming language used and the structure that it imposes on the programmer via the constraints of certain features of its notation. This paper considers the results of two experimental studies that suggest that the development and use of programming plans is strongly tied to the particular learning experience of the programmer. It is argued that programming plans cannot be considered solely to be natural strategies that evolve independently of teaching nor as mere artifacts or static properties of a particular programming language. Rather, such plans can be seen to be related to the expression of design-related skills. This has a number of important implications for our understanding of the nature and development of programming plans, and in particular, it appears that the notion of the programming plan provides too limited a view to adequately and straightforwardly explain the differences between novice and the expert's programming performance.

[Davies90b] S.P. Davies, "Plans, Goals and Selection Rules in the Comprehension of Computer Programs." Behavior and Information Technology, vol. 9, no. 3, pp. 201-214, 1990.

Abstract: The notion of the programming plan has been proposed as a mechanism through which one can explain the nature of expertise in programming. Soloway and Ehrlich (1984) suggest that such expertise is characterized by the existence and use of programming plans. However, studies in other complex problem-solving domains, notably text editing, suggest that expertise is characterized not only by the possession of plan-related structures but also by the development of appropriate selection rules which govern the implementation of plans in appropriate situations (Card et al. 1980, Kay and Black 1984, 1986). This paper presents an experimental study which examines the role of programming plans in the context of skill development in programming. The results of this study suggest that plan-based structures cannot be used in isolation to explain novice/expert differences. Indeed, such structures appear to prevail at intermediate levels of skill. The major characteristic of expertise in programming would appear to be strongly related to the development of appropriate selection rules and to so-called program discourse rules. This in turn suggests that current views on the role of plan-based structures in expert programming performance are too limited in their conception to provide an adequate basis for a thorough analysis of the problem-solving activity in the programming domain.

[Davies93a] S.P. Davies, "Models and Theories of Programming Strategy." International Journal of Man-Machine Studies, vol. 39, no. 2, pp. 237-267, 1993.

Abstract: Much of the literature concerned with understanding the nature of programming skill has focused explicitly upon the declarative aspects of programmers' knowledge. This literature has sought to describe the nature of stereotypical programming knowledge structures and their organization. However, one major limitation of many of these knowledge-based theories is that they often fail to consider the way in which knowledge is used or applied. Another strand of literature is less well represented. This literature deals with the strategic elements of programming skill and is directed towards an analysis of the strategies commonly employed by programmers in the generation and the comprehension of programs. In this paper an attempt is made to unify various analyses of programming strategy. This paper presents a review of the literature in this area, highlighting common themes and concerns, and proposes a model of strategy development which attempts to encompass the central findings of previous research in this area. It is suggested that many studies of programming strategy are descriptive and fail to explain why strategies take the form they do or to explain the typical strategy shifts which are observed during the transitions between different levels of skill. This paper suggests that what is needed is an explanation of programming skill that integrates ideas about knowledge representation with a strategic model, enabling one to make predictions about how changes in knowledge representation might give rise to particular strategies and to the strategy changes associated with developing expertise. This paper concludes by making a number of brief suggestions about the possible nature of this model and its implications for theories of programming expertise.

[Davies93b] S.P. Davies, "Externalizing Information During Coding Activities: Effects of Expertise, Environment and Task." In C. R. Cook, J. C. Scholtz, & J. C. Spohrer (eds.), Empirical Studies of Programmers: Fifth Workshop. Norwood, NJ: Ablex Publishing. pp. 42-61, 1993.

Abstract: This paper presents empirical evidence for differences in the nature of problem solver's information externalization strategies. Two experiments concerned with programming behavior are reported which suggest that experts tend to rely much more upon the use of external memory sources in situations where the device they use to construct the program hinders the utilization of a display in the service of performance. Experts and novices also appear to externalize different kinds of information during problem solving. Hence, experts tend to externalize low level information, mainly to aid simulation, whereas novices develop higher level representations which might be characterized as transformations or re-representations of the problem state. Moreover in the case of experts, the nature of externalized information appears to depend upon whether they are generating a program as opposed to comprehending it. These results provide support for a display-based view of problem solving. Moreover these studies address strategic differences in the externalization of information, which until now have remained unexplored in accounts of display-based behavior. Finally, the paper suggests a number of implications for the design of tools intended to support the programming process and for systems aimed at teaching programming skills.

[Davis84] J.S. Davis, "Chunks: A Basis for Complexity Measurement." Information Processing and Management, vol. 20, nos. 1-2, pp. 119-127, 1984.

Abstract: The state of the art in psychological complexity measurement is currently at the same stage as weather forecasting was when early Europeans based their predictions on portents of change. Current direct measures of program characteristics such as operator and operand counts and control flow paths are not based on convincing indicators of complexity. This paper provides justification for using chunks as a basis for improved complexity measurement, describes approaches to identifying chunks, and proposes a chunk-based complexity measure.
Comments D&N: This paper focuses more on the uses of the abstraction operation of chunking as a means of measuring program complexity than on how to extract chunks from programs. The chunks Davis is concerned with are at the level of Letovsky's and Soloway's plans [Letovsky86b]. Some of the earliest studies of chunking examined chess players. Experiments show that master players can remember more than novices can from a quick scan of a chessboard if the chessboard represents a meaningful situation. If, however, the pieces are randomly arranged, non-master players do as well as chess masters. In the programming world, chunks can be thought of as patterns of statements that accomplish a particular task. For example, experienced programmers may recognize the familiar pattern of a sorting algorithm. Davis reports that the Raytheon Company found that about half the code in its inventory of COBOL programs was redundant, in the sense that similar code existed to perform essentially the same function.
The paper proposes two chunk-based complexity measurement models and reports on comprehension experiments aimed at validating the proposed metrics. Davis points out that programmers often maintain the same piece of code over a long period of time. Comprehension experiments that present subjects with unfamiliar programs may therefore be less relevant to the maintenance task than they might at first appear.

[Deimel90] Deimel and Naveda, "Reading Computer Programs: Instructor's Guide and Exercises", CMU technical report CMU/SEI-90-EM-3, 1990.

[Detienne90] F. Detienne, "Expert Programming Knowledge: A schema-based Approach", in [Hoc90], pp. 205-222, 1990.

E

[Ehrlich84] K. Ehrlich, E. Soloway. "An empirical investigation of the tacit plan knowledge in programming", In J.C. Thomas, and M.L. Schneider (Eds.), Human Factors in Computer Systems, 1984.

F

[Fix93] V. Fix, S. Wiedenbeck, J. Scholtz, "Mental Representations of Programs by Novices and Experts", INTERCHI'93 Conference Proceedings (Amsterdam, The Netherlands), ACM, pp. 74-79, 1993.

Abstract: This paper presents five abstract characteristics of the mental representation of computer programs: hierarchical structure, explicit mapping of code to goals, foundation on recognition of recurring patterns, connection of knowledge, and grounding in the program text. An experiment is reported in which expert and novice programmers studied a Pascal program for comprehension and then answered a series of questions about it designed to show these characteristics if they existed in the mental representations formed. Evidence for all of the abstract characteristics was found in the mental representations of expert programmers. Novices' representations generally lacked the characteristics, but there was evidence that they had the beginnings, although poorly developed, of such characteristics.
Note: Also see [Wiedenbeck93]

G

[Gilmore90] D.J. Gilmore, "Expert Programming Knowledge: A Strategic Approach", in [Hoc90], pp. 223-234, 1990.

Gilmore argues that it's the strategies employed by programmers - not just their plans - that distinguish an expert programmer from a novice. It's not that he disagrees with the plan theory; it's just that it's incomplete. Plan theory explains expert behavior in terms of knowledge acquisition; i.e. experts have more expertise because they've learned more plans. Gilmore thinks it goes well beyond knowledge. Experts also possess strategies that serve them in new situations.

[Green93] T.R.G. Green and M. Petre, "Cognitive dimensions as discussion tools for programming language design", Human-Computer Interaction, some-date-after 1993.

This is the primary reference for Green's cognitive dimensions.

[Green96] T.R.G. Green and M. Petre, "Usability Analysis of Visual Programming Environments: a 'cognitive dimensions' framework", J. Visual Languages and Computing, vol.7 pp. 131-174, 1996.

In this paper, Green's "cognitive dimensions" framework [Green?] is applied to a few visual programming environments. Since we aren't primarily concerned with visual programming languages, the major use of this paper is as an example of the application of the cognitive dimensions framework. This framework (still under development when this paper was written) consists of 12 relatively independent features that characterize programming languages. Some of these features are:
Green sees the value of this framework as a vehicle for discussion. He then goes on to describe tradeoffs between these features for visual programming languages.

[Green96b] T.R.G. Green, "The Visual Vision and Human Cognition", an invited talk at Visual Languages'96, available at http://www.ndirect.co.uk/~thomas.green/workStuff/VL96Talk/VLTalk.html, 1996.

Contains a long description of Green's cognitive dimensions.

[Green97] T.R.G. Green, "Cognitive Approaches to Software Comprehension: Results, Gaps, and Limitations", a talk at the workshop on Experimental Psychology in Software Comprehension Studies 97, University of Limerick, Ireland, www.ndirect.co.uk/~thomas.green/, 1997.

This is essentially an extended abstract. It very briefly reviews recent developments in program comprehension. His closing quote in this paper is important to keep in mind: "The way forward is not to make strong, simple claims about how cognitive processes work. The way forward is to study the details of how notations convey information."

[Gugerty86] L. Gugerty and G.M. Olson, Debugging by skilled and novice programmers, Proceedings CHI'86: Human Factors in Computing systems, ACM, 1986.

[Guindon90] R. Guindon, "Knowledge exploited by experts during software system design", Int. J. Man-machine Studies, vol. 33, pp. 279-304, 1990

Abstract: High-level software design is characterized by incompletely specified requirements, no predetermined solution path, and by the integration of multiple domains of knowledge at various levels of abstraction. The application of data-driven knowledge rules characterizes expertise. A verbal protocol study describes these domains of knowledge and how experts exploit their rich knowledge during design. It documents how designers heavily rely on problem domain scenario simulations throughout solution development. These simulations trigger the inferences of new requirements and complete the requirement specification. Designers recognize partial solutions at various levels of abstraction in the design decomposition through the specification of data-driven rules. Designers also rely heavily on simulations of their design solutions, but these are shallow, that is, limited to one level of abstraction in the solution. The findings also illustrate how designers capitalize on design methods, notations, and specialized software design schemas. Finally, the study describes how designers exploit powerful heuristics and personalized evaluation criteria to constrain the design process and select a satisfactory solution. Studies such as this one help map the road to understanding expertise in complex tasks.

H

[Hoc90] J.-M. Hoc, T.R.G. Green, R. Samurcay and D.J. Gilmore (eds.), Psychology of Programming, Academic Press Ltd., 1990.

An excellent review of psychology applied to computer programming.

[Holt87] R.W. Holt, D. A. Boehm-Davis, A.C. Shultz. "Mental representations of programs for student and professional programmers", Empirical Studies of Programmers, 2nd workshop, pp. 33-46, 1987.

Abstract: This research examined programmers' cognitive representations of software. In this study, student and professional programmers were asked to make either simple or complex modifications to programs that had been generated using each of three different design methodologies: in-line code, functional decomposition, and a form of object-oriented design. The programmers' mental models of the programs they had studied were elicited and then scored in several different ways. The results suggest that problem structure, problem type, and ease of modification may affect the mental models formed. Specifically, the data suggest that while the mental models of professional programmers were affected primarily by modification difficulties, the mental models of student programmers were primarily affected by the structure and content of the programs. Performance differences between the two groups of programmers were small because the experience variables which were most strongly related to performance were nearly equal in the two groups, and the experience variables which were very different between the two groups were not related to performance. Across the two groups, the primary aspect of the mental model which was correlated with performance variables was the width and breadth of the mental model structure. The implications of the results for the application of program design methodologies are discussed.

J

[Jeffries81] R. Jeffries et al., "The processes involved in designing software", in J.R. Anderson (Ed.), Cognitive Skills and Their Acquisition, pp. 255-283, 1981.

[Johnson-Laird83] P.N. Johnson-Laird, Mental Models. London, Cambridge University Press, 1983.

K

[Kann97] C. Kann, An Exploratory Empirical study of the Role of Program Generation, Strategies and Plans in Concurrent Programming, Ph.D. Dissertation, The George Washington University, Washington D.C., 1997.

[Kann98] C. Kann, "Teaching Concurrent Programming in Ada", The George Washington University, Washington D.C., 1998.

[Kao97] D. Kao, N. P. Archer, "Abstraction in conceptual model design", Int. J. Human-Computer Studies, vol. 46, pp. 125-150, 1997.

Abstract: Understanding and supporting conceptual model design is an important issue in model management system (MMS) research. While there is a broad range of aspects which affect conceptual design, this study focuses on the use and support of abstraction in the design process. We classify the use of abstraction in design into three categories: vertical, horizontal and general abstraction techniques. We then propose a theoretical framework which suggests the completeness of the design, the development of higher level concepts in design, and the design organization as the three dimensions of design output that can be enhanced by effective use of these abstraction techniques. The proposed framework was empirically tested on a design problem using non-domain experts, with a software prototype that provided abstraction aids. The findings indicated significant effects of abstraction aids on the three domains of design output. Specifically, training exercises with comprehensive examples of various design strategies significantly improved both the number of high level ideas generated and the design organization compared to unaided designs. The completeness of designs was enhanced by both the design environment structure and the examples and analogies provided during training. The implications of this study are: (a) it is possible to measure the impact of abstraction support on the conceptual design process, (b) the proposed measures can be used in the development and evaluation of design support systems, and (c) abstraction support can significantly improve the quality of design by non-domain experts.
Note: This paper explored ways to aid the process of model design, and in that sense it might help us should we move in that direction. The problem domain was well removed from software development, but the application to software development was clear. The authors showed empirically that tools that help designers build abstractions are useful. The designers in this study benefited from training tools that provided examples to help them reason by analogy.

[Kaplan86] S. Kaplan et al., "The Components of Expertise: A Cross-Disciplinary Review", Ann Arbor, MI: The University of Michigan, 1986.

[Knuth84] D.E. Knuth, "Literate Programming", Computer J. vol. 27, no. 2, 97-111, May 1984.

Comments D&N: Knuth describes his WEB system for programming and documentation. Anyone with a deep interest in the system should read this paper, but the reader who would prefer a brief, lucid description of this interesting system (and philosophy) should read Jon Bentley's piece [Bentley86] on the subject instead.

[Koenemann91] J. Koenemann and S. Robertson, "Expert Problem Solving Strategies for Program Comprehension", Proceedings of CHI'91, pp. 125-130, 1991.

Abstract: Program comprehension is a complex problem solving process. We report on an experiment that studies expert programmers' comprehension behavior in the context of modifying a complex PASCAL program. Our data suggest that program comprehension is best understood as a goal-oriented, hypotheses-driven problem-solving process. Programmers follow a pragmatic as-needed rather than a systematic strategy, they restrict their understanding to those parts of a program they find relevant for a given task, and they use bottom-up comprehension only for directly relevant code and in cases of missing, insufficient, or failing hypotheses. These findings have important consequences for the design of cognitively adequate computer-aided software engineering tools.

L

[Lee94] A. Lee, N. Pennington, "The effects of paradigm on cognitive activities in design", Int. J. Human-Computer Studies, vol. 40, pp. 577-601, 1994.

Abstract: This research examines differences in cognitive activities and final designs among expert designers using object-oriented and procedural design methodologies, and among expert and novice object-oriented designers, when novices have extensive procedural experience. We observed, as predicted by others, a closer alliance of domain and solution spaces in object-oriented design compared to procedural design. Procedural programmers spent a large proportion of their time analyzing the problem domain. In contrast, object-oriented designers defined objects and methods much more quickly and spent more time evaluating their designs through simulation processes. Novices resembled object-oriented experts in some ways and procedural experts in others. Their designs had the general shape of the object-oriented experts' designs, but retained some procedural features. Novices were very inefficient at defining objects, going through an extensive situation analysis first, in a manner similar to the procedural experts. Some suggestions for instruction are made on the basis of novice object-oriented designers' difficulties.

[Letovsky86a] S. Letovsky, "Cognitive Processes in Program Comprehension". In Empirical Studies of Programmers: Papers Presented at the First Workshop on Empirical Studies of Programmers, June 5-6, 1986, Washington, D.C., Elliot Soloway and Sitharama Iyengar, eds. Norwood, N.J.: Ablex, 1986, 58-79. Reprinted in J. Syst. and Software 7 (4), 325-339, Dec. 1987.

Abstract: This paper reports on an empirical study of the cognitive processes involved in program comprehension. Verbal protocols were gathered from professional programmers as they were engaged in a program understanding task. Based on analysis of these protocols, several types of interesting cognitive events were identified. These include asking questions and conjecturing facts about the code. We describe these event types, and use them to derive a computational model of the programmers' mental processes. Letovsky refers to a study involving the videotaping of six professional programmers as they enhanced a FORTRAN 77 program of about 250 lines. (The same study is also the basis for [Letovsky86b] and [Littman86].) Subjects were asked to think aloud as they worked. The author describes and analyzes what they said as they labored to understand the program to be modified. He presents a cognitive model of program understanding composed of the programmer's knowledge base, a mental model, the construction of which is the ultimate goal of program reading, and an assimilation process by which the programmer actually builds the mental model. Most of the paper is concerned with the assimilation process and the empirical data justifying the author's analysis of it.
Comments D&N: Although Letovsky's language often differs from that of Brooks, his cognitive model of program comprehension is basically consistent with and elaborates the model in [Brooks83]. Whereas Brooks emphasizes top-down approaches to reading programs, Letovsky offers convincing evidence that programmers work both top-down and bottom-up. Much of the paper is devoted to analysis of the questions, conjectures, and inquiries made by the programmers while reading the code.

[Letovsky86b] S. Letovsky, and E. Soloway. "Delocalized Plans and Program Comprehension". IEEE Software vol. 3, #3. 41-48, May 1986.

Comments D&N: The authors conclude, based on the same study as [Letovsky86a], that inadequately documented delocalized plans are sometimes responsible for misreading of programs on the part of maintainers. They analyze comprehension failures by their subjects and suggest techniques to prevent such misunderstandings when composing programs.
According to this paper, the task of understanding a program is one of uncovering the intention behind the code. Intentions are described as goals. Techniques for realizing goals in a particular implementation are called plans. Plans are a lot like algorithms, but they may involve non-contiguous elements and may be combined in ways we do not usually consider for algorithms. Two plans involving loops may be combined into a solution using a single loop implementing two distinct goals, for example.
The authors have observed that readers of programs tend to infer the goals of code fragments on the basis of locally available information. If the plan for a fragment is delocalized, that is, part of the plan is realized in non-contiguous code, the reader will often incorrectly perform this inference. The authors suggest various documentation techniques to mitigate reading problems resulting from delocalized plans, most of which require the programmer to be more explicit in comments about his intentions. The paper also includes a brief section on related work and tools for assisting program reading.
The comprehension difficulties discussed here are not surprising ones, yet the paper comes as something of a revelation to most of us who have never thought much about those difficulties or have never thought about them so clearly.
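The merged-loop case mentioned above can be made concrete with a small sketch. It is not taken from the paper: the function names and the choice of a running-total plan and a counter plan are assumptions made purely for illustration, and Python is used only for brevity. In the merged version the lines of each plan are interleaved with the other's, so a reader looking at any one line sees only part of a plan.

```python
def average_two_loops(values):
    """Two plans kept separate, one loop per goal (hypothetical example)."""
    total = 0                  # running-total plan: initialise
    for v in values:
        total += v             # running-total plan: update

    count = 0                  # counter plan: initialise
    for _ in values:
        count += 1             # counter plan: update

    # Guard against an empty input before dividing.
    return total / count if count else 0.0


def average_merged(values):
    """The same two plans merged into a single loop serving both goals."""
    total = 0                  # running-total plan: initialise
    count = 0                  # counter plan: initialise
    for v in values:
        total += v             # running-total plan: update
        count += 1             # counter plan: update
    # Each plan is now spread over non-contiguous lines; the comments
    # naming the plan are the kind of explicit documentation of
    # intention that the authors recommend.
    return total / count if count else 0.0
```

Both functions compute the same result; only the layout of the plans in the text differs, which is the property the paper is concerned with.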

M

[Minsky75] M. Minsky. "A framework for representing knowledge", in P.H. Winston (Ed.) The psychology of computer vision, McGraw Hill 1975.

O

[Ormerod90] T. Ormerod, "Human Cognition and Programming", in [Hoc90], pp. 63-82, 1990.

P

[Pennington87] N. Pennington, "Comprehension Strategies in Programming." In Empirical Studies of Programmers: Second Workshop, G. M. Olson, S. Sheppard, and E. Soloway, eds. Norwood, NJ: Ablex. p. 100-113. 1987.

Abstract: This report focuses on differences in comprehension strategies between programmers who attain high and low levels of program comprehension. Comprehension data and program summaries are presented for 40 professional programmers who studied and modified a moderate length program. Illustrations from detailed think-aloud protocol analyses are presented for selected subjects who displayed distinctive comprehension strategies. The results show that programmers attaining high levels of comprehension tend to think about both the program world and the domain world to which the program applies while studying the program. We call this a cross-referencing strategy and contrast it with strategies in which programmers focus on program objects and events or on domain objects and events, but not both.
Comments D&N: Based on her previous research, Pennington proposes that understanding of overall program flow control precedes the more detailed understanding of program functions. In particular, she suggests that program readers build at least two mental models of the program they are studying, a program model and a domain model. The program model is characterized by an abstract knowledge of the program's text structures. The domain model relates objects and functions in the problem domain to source-language entities.
The author carried out an experiment using a minimally documented 200-line FORTRAN program. Subjects were asked to study the program for 45 minutes in preparation for a modification task. Some of the subjects were asked to think aloud as they examined the program. After the study period, subjects wrote summaries explaining what the program did and answered 20 comprehension questions. They were given an additional 30 minutes to implement the requested change, after which a second summary was written and 20 more comprehension questions answered. Using her analysis of the data, Pennington asserts that the comprehension strategies of the subjects can be characterized as program-level, domain, or cross-referencing, the latter being a strategy that combines features of the other two. That is, the programmers concentrated either on the program, on the problem domain, or somehow effectively related the two. Not surprisingly, it was the cross-referencing readers who performed best.
Whether or not Pennington's results indicate that program readers create two distinct mental models in succession, they certainly support the layered abstractions proposed by Brooks [Brooks83] and Letovsky [Letovsky86a]. This is an insightful paper discussing the cognitive process of program comprehension. It is equally interesting on methodological grounds.

[Pennington87b] N. Pennington, "Stimulus structures and mental representations in expert comprehension of programs", Cognitive Psychology, Vol. 19, pp. 295-341, 1987.

In this important paper, Pennington proposes five types of information programmers extract as they comprehend a program. She proposes what is basically a bottom-up process starting with the control flow and culminating in forming the "situation model".

[Petre88] M. Petre and R.L. Winder, "Issues governing the suitability of programming languages for programming tasks", People and Computers IV: Proceedings of HCI-88, Cambridge University Press, 1988.

[Petre90] M. Petre, "Expert Programmers and Programming Languages", in [Hoc90], p. 103, 1990.

R

[Rasmussen85] J. Rasmussen, "The role of hierarchical knowledge representation in decision making and system management", IEEE Transactions on Systems, Man and Cybernetics, SMC-15, p. 243, 1985.

This is the key reference that introduces the concept of an Abstraction hierarchy. Also, take a look at [Bisantz94].

[Rathke94] C. Rathke and D. Redmiles, "Improving the Explanatory Power of Examples by a Multiple Perspectives Representation". Proceedings of the 1994 East-West Conference on Computer Technologies in Education (EW-ED '94). P. Brusilovsky, S. Dikareva, J. Greer and V. Petrushin. Crimea, Ukraine, pp. 195-200, 1994.

Abstract: We developed a software tool called EXPLAINER for helping programmers complete new tasks by exploring previously worked-out examples. The implementation is based on the principle of making examples accessible through multiple perspectives and, specifically, perspectives that emphasize the programming plans underlying an example. The initial version of EXPLAINER used a simple, semantic network to represent multiple perspectives. A frame-based knowledge representation language called FrameTalk provides a more structured means of representing examples in EXPLAINER. Moreover, FrameTalk provides mechanisms that avoid deficiencies that arise when concept taxonomies must serve the dual purpose of representing specialization and composition of attributes.

[Redmiles93] D. Redmiles, "Reducing the Variability of Programmers' Performance Through Explained Examples," Human Factors in Computing Systems, INTERCHI'93 Conference Proceedings (Amsterdam, The Netherlands), ACM, pp. 67-73, 1993.

Abstract: A software tool called EXPLAINER has been developed for helping programmers perform new tasks by exploring previously worked-out examples. EXPLAINER is based on cognitive principles of learning from examples and problem solving by analogy. The interface is based on the principle of making examples accessible through multiple presentation views and multiple representation perspectives. Empirical evaluation has shown that programmers using EXPLAINER exhibit less variability in their performance compared to programmers using a commercially available, searchable on-line manual. These results are related to other studies of programmers and to current methodologies in software engineering.

[Rist86] R.S. Rist, "Plans in programming: definition, demonstration and development", in E. Soloway and S. Iyengar (Eds.), Empirical Studies of Programmers, Norwood, NJ: Ablex, 1986.

ABSTRACT: Support for the use of plans in cognitive models of programs was provided by evidence from both novice and expert programmers. For novice programmers, an initial plan-based description of code segments was replaced by syntactic and control based groups as the programs became more complex. An increase in plan use with expertise was also evident. Experts used only plan groupings in their efforts to understand a program.
Cluster analysis of these code groupings showed the precise definition and order of appearance of the program plans, providing an experimental basis for identifying cognitive plans. Three main sources of plan emergence were identified in the novice data: goal based, object based and basic plans. Goal based plans allow the focal segment for a goal or program to be easily identified. The link between the program goal and the code that implements this goal provides the basic plan structure of the program. The rest of the program code supports this basic operation. The conceptual model of the program for experts is centered on this focal segment. This directs attention in the understanding and construction of programs. It gives a human solution to the problems of goal search and selection.
NOTES: This key paper reports on empirical investigations of the plan structure used by novice programmers (high school summer students at Yale) and expert programmers (third year graduate students). The study looked at comprehension of Pascal programs. The subjects were asked to logically group statements in a series of 12 short programs (22 to 42 lines of code). The way statements were grouped and organized into plans was analyzed. Experts reasoned in terms of plans and groups of plans. The overall program structure was deduced by plan based goal chaining. Novices and experts, when confronted with unfamiliar code, used syntactic or control-flow based descriptions of the code. As they better understood the code, it was mentally reorganized in terms of specific sub-goals and plans. In other words, plan use was correlated with expertise and inversely correlated with code difficulty.
Plan knowledge appeared from three main sources. (1) Global plans (or gplan) were the first plans taught and used by novices and experts similarly. (2) Focal plan descriptions appeared in both novice and expert plan trees. (3) Basic plans define the detailed structure of the program. The expert programmers more effectively narrowed their analysis to focal plans.
This study directly looked at how code was grouped into plans and therefore provides some insight into how plans are constructed. Four stages of code grouping were observed. Code was grouped together in terms of: (1) the syntactic categories to which the code could be assigned, (2) the program control structure, (3) basic plans to organize code, and (4) the role of the plans relative to the program's focus. This last stage resembles top-down analysis and it was not used by novices.

[Rist86b] R. S. Rist, "Focus and learning in program design", Proceedings of the 8th Cognitive Science Conference, Amherst, MA, 1986.

[Rist89] R. S. Rist, "Schema creation in programming", Cognitive Science, Vol. 13, pp. 389-414, 1989.

[Rist90] R.S. Rist, "Variability in program design: the interaction of process with knowledge", International Journal of Man-Machine Studies, Vol. 33, pp. 305-322, 1990.

[Robertson90] S. P. Robertson and C. Yu, "Common cognitive representations of program code across tasks and languages", Int. J. Man-Machine Studies, vol. 33, pp. 343-360, 1990.

Abstract: Plans are underlying cognitive structures used by programmers to represent code. In two studies, we examined the content of plan-based representations and sought to show that common representations are used for programs that instantiate the same plans, even when they perform different tasks and are written in different languages (Pascal or Fortran). Our results support plan-based models and show that the organizing structures for chunks of code are abstract programming goals. The same abstract structures are formed for programs that perform different tasks using the same plans and for programs written in different languages but using the same plans. While plans were the primary organizing structures for code representation, other task-related information also played a role, suggesting that programmers really utilize multiple representations. We advocate viewing code comprehension more like a plan recognition process and less like a text comprehension process.

[Robbins96b] J.E. Robbins, and D. F. Redmiles "Software Architecture Design From the Perspective of Human Cognitive Needs". 1996.

Abstract: Much attention in software engineering research today is focussed on the notion of software architectures. The major motivation is that software architectures provide the appropriate level of abstraction to support the design of complex systems. The research has quickly evolved to the degree that design environments have been implemented to support software architects in creating new designs by combining components within architectural styles. We follow the same motivation with a different focus. We report on a software architecture design environment called Argo. Argo differs from other approaches by paying attention to the human, cognitive needs software architects have during design as much as the representation and manipulation of the architecture itself. We emphasize the primary considerations by contrasting an analysis of the human, cognitive design process with a systems, software design process. The corresponding, key elements are illustrated through a design scenario with Argo. Human-centered features in Argo focus on the application of critics for providing design feedback, design processes for supporting critics, and multiple architectural perspectives for aiding human designers.
Comments UMaD: The paper discusses the authors' rationale in designing and building a design environment, Argo. They claim to consider cognitive design theory in the design of the environment. In particular, the authors contend they support reflection-in-action (Schön), opportunistic design (Guindon) and comprehension and problem solving (through the use of multiple representations). The mapping of these theories to the particular environment is useful. The notion of design critic in the system is supported through noticing what causes breakdowns (lack of domain knowledge, lack of solution knowledge, lack of good process understanding) and how designers handle breakdowns during the design process. With an active critic, information regarding rule violations or suggestions can be incorporated early in the process. Also, in support of opportunistic design, the system helps maintain not only the current representation of the system under construction but also a representation of the process the designer is using. By aiding the designer in managing the notion of process, it becomes easier for the individual to manage the volume of to-do activities.

S

[Schank77] R.C. Schank and R.P. Abelson, Scripts, Plans, Goals and Understanding, Lawrence Erlbaum Associates, 1977.

[Shneiderman79] B. Shneiderman and R. Mayer, "Syntactic/Semantic Interactions in Programmer Behavior: A Model and Experimental Results." Intl. J. Comp. and Info. Sciences vol. 8, #3, pp. 219-238, 1979.

Comments D&N: This paper presents a cognitive framework for describing behaviors involved in program composition, comprehension, debugging, modification, and the acquisition of new programming concepts, skills, and knowledge. An information processing model is presented which includes a long-term store of semantic and syntactic knowledge, and a working memory in which problem solutions are constructed. New experimental evidence is presented to support the model of syntactic/semantic interaction. The authors present their cognitive model of programmer behavior, the syntactic/semantic model. They suggest that this model is useful in explaining a variety of behaviors, including program reading and program writing. The authors hypothesize that programmers retain both semantic and syntactic knowledge in long-term memory, and that they use short-term and working memories in performance of various program-related tasks. Semantic knowledge and syntactic knowledge are largely independent in this model. Semantic knowledge is multilayered and substantially language-independent; syntactic knowledge applies to particular programming languages. Shneiderman and Mayer describe how their model applies to program reading, program writing, debugging, and learning programming languages. They conclude their paper with brief discussions of experiments that they offer as supporting evidence for their theory.
In program comprehension, according to this theory, the reader constructs a multileveled internal semantic structure to represent the program, a process of encoding from the program syntax, which is not memorized directly. The internal structure is built by recognizing the function of program components and fragments as chunks. These pieces are then aggregated until a description of the entire program is available.
This is a paper everyone should read. It presents a typical cognitive model in an approachable way, and shows how such models are used and verified. It also offers insight into programmer behavior. Yet, the structural complexity of the syntactic/semantic model makes the model seem less useful than it should be, primarily because a totally adequate model would be very much richer in processing details. The processes reified in this model are largely implicit in other comprehension models. Shneiderman's and Mayer's mental model of a program is quite similar to that of Brooks [Brooks78] and Letovsky [Letovsky86a]. Their description of the assimilation process, however, is strictly bottom-up.

[Shneiderman80] B. Shneiderman, Software Psychology: Human Factors in Computer and Information Systems. Cambridge, Mass.: Winthrop, 1980.

Comments D&N: Software Psychology is a handbook for the application of psychology to computer-related issues. Shneiderman provides a crash course on methods of psychological research and proceeds to discuss topics from program reading to team organization and the design of interactive systems. Although this volume was written a decade ago, it remains an invaluable reference on psychological factors related to the computer. The book contains an extensive bibliography.

[Soloway82] E. Soloway, K. Ehrlich, J. Bonar, J. Greenspan, "What do novices know about programming?" In A. Badre and B. Shneiderman (Eds.) Directions in Human-Computer Interaction, Ablex, 1982.

[Soloway84] E. Soloway, K. Ehrlich, "Empirical studies of programming knowledge", IEEE Transactions on Software Engineering, vol. 10, pp. 595-609, 1984.

U

[Upchurch] R. Upchurch, "Annotated Bibliography: Code Reading and Program Comprehension". Available on the World Wide Web at ...

V

[vonMayrhauser95] A. von Mayrhauser and A. M. Vans, "Program Understanding: Models and Experiments", in M. Yovits and M. Zelkowitz (eds.), Advances in Computers, vol. 40, San Diego: Academic Press, pp. 1-38, 1995.

Abstract: Models of how programmers understand code they have not written have been developed and analyzed for many years. These models describe program comprehension at various levels of detail. This paper puts them in perspective, particularly with regard to specialized maintenance tasks versus general code understanding needs. Experiments support some, but not all, comprehension models. We analyze models and their validation experiments to see what the current state of knowledge about program comprehension offers. Open issues point to a need for experimental studies with experienced software engineers working on specific maintenance tasks and large-scale code in state-of-the-art environments.
Comments UMaD: Perhaps the best source for a complete assessment of the state of program comprehension research. This book chapter attempts to organize the field and to identify which of the suggested models are validated by empirical findings. The research reviewed appears to be rather complete. The authors review the major camps (6), and provide an overview of the theoretical position and empirical support for each. Also included is the authors' metamodel for program comprehension, a single model that incorporates both top-down and bottom-up comprehension strategies.

[Visser90] W. Visser, "More or less following a plan during design: Opportunistic deviations in specification", International Journal of Man-Machine Studies, 1990.

An observational study was conducted on a mechanical engineer throughout his task of defining the functional specifications for the machining operations of a factory automation cell. The engineer described his activity as following a hierarchically structured plan. The actual activity is in fact opportunistically organized. The engineer follows his plan as long as it is cognitively cost-effective. As soon as other actions are more interesting, he abandons his plan to proceed to these actions. This paper analyses when and how these alternative-to-the-plan actions come up. Quantitative results are presented. Implications of these results for assistance tools are discussed briefly.

[Visser90a] W. Visser and J. Hoc, "Expert Software Design Strategies", in [Hoc90], pp. 235-249, 1990.

W

[Weinberg71] G.M. Weinberg, The Psychology of Computer Programming. New York: Van Nostrand Reinhold, 1971.

Comments D&N: Weinberg devotes the first chapter of his well-known book to program reading, remarking ruefully that "[e]ven programmers do not read programs." He suggests that there is much to learn from reading both good and bad programs. Most of the chapter is devoted to examples of the factors affecting what actually gets coded: limitations of the machine, the implementation language, and the programmer; historical accidents; and evolving specifications.

[Weiser81] M. Weiser, "Program Slicing". Proc. 5th Int. Conf. on Software Eng. New York: IEEE, pp. 439-449, 1981.

Abstract: Program slicing is a method used by experienced computer programmers for abstracting from programs. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a slice, is an independent program guaranteed to faithfully represent the original program within the domain of the specified subset of behavior.
Finding a slice is in general unsolvable. A dataflow algorithm is presented for approximating slices when the behavior subset is specified as the values of a set of variables at a statement. Experimental evidence is presented that these slices are used by programmers during debugging. Experience with two automatic slicing tools is summarized. New measures of program complexity are suggested based on the organization of a program's slices. Being able to find a program slice simplifies analysis of a program. Even though program slicing cannot be fully automated, the concept of a slice is a useful one.
Comments D&N: Weiser explains slicing by pointing out that, when fixing a bug, an experienced programmer usually focuses only on those parts of the program that may obviously have something to do with the bug in question. Other parts of the program are ignored, effectively having been deleted in the programmer's mind from the code being studied. Programmers apply this same technique when making program improvements or modifications.
The paper considers the slicing of block-structured programs written in a Pascal-like language. A slice must have two desirable properties: (1) it must have been obtained from the original program by statement deletion, and (2) the behavior of the slice must be the same as that of the original program, as observed through the domain of the specified subset of behavior. Characterizations of programs in terms of flow graphs are explained, and meaning is given to a slice within those contexts. To make the problem of finding a program's slice tractable, Weiser introduces a weaker definition of slice and gives sufficient conditions for statement inclusion. Weiser also introduces a number of slice-based complexity metrics and discusses their computation.
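Illustration (not from the paper): the statement-deletion idea can be sketched for the simplest possible case, straight-line code with no branches or loops. This is a minimal sketch and not Weiser's algorithm, which works on flow graphs and must handle control flow. The Stmt record, the backward_slice function, and the sample program are all hypothetical names invented for this example, and the criterion is assumed to mean "the values of the given variables just after the statement at the given index has executed".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement of a hypothetical straight-line program."""
    text: str          # source text, kept only for display
    defs: frozenset    # variables the statement assigns
    uses: frozenset    # variables the statement reads


def backward_slice(program, index, variables):
    """Return the indices of the statements kept in the slice.

    Assumption: the criterion (index, variables) asks for the values of
    `variables` just after program[index] has executed, and the program
    is straight-line code (no branches, no loops).
    """
    relevant = set(variables)   # variables whose values still matter
    kept = []
    # Walk backward from the criterion statement to the program start.
    for i in range(index, -1, -1):
        stmt = program[i]
        if stmt.defs & relevant:
            # The statement assigns a relevant variable: keep it, and
            # what it reads becomes relevant in place of what it defines.
            kept.append(i)
            relevant = (relevant - stmt.defs) | stmt.uses
    return sorted(kept)


program = [
    Stmt("total = 0", frozenset({"total"}), frozenset()),
    Stmt("count = 0", frozenset({"count"}), frozenset()),
    Stmt("total = total + x", frozenset({"total"}), frozenset({"total", "x"})),
    Stmt("count = count + 1", frozenset({"count"}), frozenset({"count"})),
    Stmt("mean = total / count", frozenset({"mean"}), frozenset({"total", "count"})),
]

# Slice on the value of `count` after statement 3: the statements that
# only touch `total` are deleted.
for i in backward_slice(program, 3, {"count"}):
    print(program[i].text)
```

Every statement the sketch keeps comes from the original program, and the kept statements compute the same value for the chosen variable, which mirrors the two properties listed above for this restricted case.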

[Weiser83] M. Weiser and J. Shertz, "Programming problem representation in novice and expert programmers", International Journal of Man-Machine Studies, vol. 19, pp. 391-398, 1983.

[Wiedenbeck86] S. Wiedenbeck, "Beacons in computer program comprehension", International Journal of Man-Machine Studies, vol. 25, pp. 697-709, 1986.

[Wiedenbeck89] S. Wiedenbeck and J. Scholtz, "Beacons: a knowledge structure in program comprehension", in G. Salvendy and M.J. Smith (eds.), Designing and Using Human-Computer Interfaces and Knowledge-Based Systems, Amsterdam: Elsevier, 1989.

[Wiedenbeck93] S. Wiedenbeck, V. Fix, and J. Scholtz, "Characteristics of the mental representations of novice and expert programmers: an empirical study", International Journal of Man-Machine Studies, vol. 39, pp. 793-812, 1993.

This paper presents five abstract characteristics of a programmer's mental representation of computer programs: (1) it is hierarchical and multi-layered; (2) it contains explicit mappings between the different layers; (3) it is founded on the recognition of basic patterns; (4) it is well connected internally; (5) it is well grounded in the program text. An experiment is reported in which expert and novice programmers studied a Pascal program for comprehension and then answered a series of questions about it. The experiment was designed to reveal these previously defined characteristics if they existed in the mental representations formed by the test subjects. Evidence for all five characteristics was found in the mental representations of expert programmers. Novice representations generally lacked these characteristics, though there was evidence that they had the poorly developed beginnings of them.

[Wiedenbeck97] S. Wiedenbeck and J. Scholtz (eds.), Empirical Studies of Programmers: Seventh Workshop, ACM Press, 1997.

Acknowledgements

An important source for this bibliography was the "Code Reading and Program Comprehension" annotated bibliography [Upchurch], which was compiled by Richard Upchurch at UMass Dartmouth. Some of the comments reported in this bibliography were lifted straight out of the Upchurch bibliography. These comments are prefaced by the label "Comments D&N" (in which case they came from [Deimel90]) or "Comments UMaD".