Complexity Reduction in Feature Analysis
In this dissertation, I describe my work on reducing complexity in feature analysis. Features are aspects of a program that are defined by the requirements and implemented in the code. My work addresses three aspects of feature analysis: effort estimation, feature reuse, and comprehension of the underlying code by blind programmers.
Correctly estimating the time and effort required to implement or extend a feature helps avoid cost overruns. Cost estimation tools use feature location algorithms to map requirements to the code that implements them. The literature describes numerous feature location algorithms, but not all are usable in industrial environments. Some algorithms are very complex, imposing training costs and impeding communication. Others require complex tuning for best results. Still others require hardware support. This level of complexity is discouraged in an industrial environment, where the resource allocations stemming from a feature location algorithm must be justified. Industry requires a balance between complexity and performance, not performance at any cost.
Once found, features can be reused in other programs to save the cost of reimplementation. The difficulty with feature reuse is that the programmer must comprehend and copy the dependencies of a feature. Previous work has shown that reusing a single statement requires the programmer to comprehend and copy 30-60% of the original program. As a result, feature reuse is often deemed too complex to be practical.
The code comprehension process itself contains a great deal of complexity. This complexity obscures the relationship between features and the implementation details that relate to them. This is especially a problem for blind programmers, who use a screen reader to read code. The screen reader speaks the contents of the screen aloud. Blind programmers cannot skip to the most important code areas the way sighted programmers can, but must read code sequentially, one line at a time. Very little work has been done on the precise differences between the ways blind and sighted programmers read code and comprehend it in terms of the features of the overarching program.
I address these areas of complexity in four projects presented in this document. I present my research in designing a feature location component for use in a cost estimation system for the United States Navy. I also present Flashback, a library that drastically simplifies feature reuse through record and replay technology previously used for debugging and security. I investigate possible differences between the program comprehension strategies of blind and sighted programmers, and finally present an interface that allows blind programmers to skim code much as sighted programmers do.
History
Date Created
2018-04-09
Date Modified
2018-11-02
Defense Date
2018-03-19
Research Director(s)
Collin McMillan
Degree
- Doctor of Philosophy
Degree Level
- Doctoral Dissertation
Program Name
- Computer Science and Engineering