Aug 17th, 2008

Code Maintenance Tool Wish-List

At my day job, I am forced to maintain code that I did not write, that I am not familiar with, and whose authors no longer work at the company. Did you say “documentation”? Hah! No, my friends… it is just me and the editor.

I am a designer. A prototyping programmer. I do my best work when given vague requirements, lots of freedom, and a blank editor. And in my experience, contrary to conventional wisdom, people do not know what they want… That’s a whole post in itself, so I’ll leave it at that.

Not only do I dislike maintenance, I am actually not that great at being a maintenance programmer whose job is to modify or add to an existing (usually unfamiliar) code-base. However, as much as I dislike it, I reluctantly admit that I’ve learned a lot from the experience. I hate it so much that my mind is constantly looking for ways to make it easier. Here is what I wish I had.

Specifying and seeing scope more precisely. For example, I should be able to say these 2 functions can call this, or just the class Foo and this class’s testing module, or this is a project entry-point. Actually, what would be better is to have intended scope, and then have your tool tell you what actually refers to it. (To this end, it would help if dynamic lookup were restrictable. Perhaps whether a function can be looked up dynamically should be a metadata tag on the function.) Additionally, I don’t want to have to search every time I want to see the references to a code object. I want to be able to see this in real-time as I move my cursor, click, or even mouseover something in my editor. The way it is now, finding references feels expensive because you have to search and it takes a long time. As a result, I only do it when I feel like I really have to, which is not that often. But if it were cheap, I would think of it differently. Perhaps I would use it to learn how things are related; I’m really not sure. But I know I would use it differently.
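To make the "intended scope versus actual references" idea concrete, here is a rough Python sketch. It is only a runtime approximation of what I really want, which is static, always-on tooling in the editor. The decorator name callable_only_from and the example functions are made up for illustration, and it matches callers by bare function name, which is cruder than a real tool would be.

```python
import functools
import inspect


def callable_only_from(*allowed_callers):
    """Declare the function names that are *intended* to call the decorated function.

    Callers outside that list are not blocked; they are recorded so a tool
    (or a curious maintainer) can compare intended scope with actual use.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Look one frame up to find out who is calling us.
            frame = inspect.currentframe()
            caller = frame.f_back.f_code.co_name if frame and frame.f_back else "<unknown>"
            if caller not in allowed_callers:
                wrapper.unintended_callers.add(caller)
            return func(*args, **kwargs)

        wrapper.intended_callers = set(allowed_callers)
        wrapper.unintended_callers = set()
        return wrapper

    return decorate


# Hypothetical example: compute_tax is only meant to be used by checkout.
@callable_only_from("checkout")
def compute_tax(amount):
    return amount // 10


def checkout(amount):
    return amount + compute_tax(amount)


def report(amount):
    # Not on the intended list, so this call gets recorded.
    return compute_tax(amount)
```

After running checkout and report, compute_tax.unintended_callers holds the names that reached the function from outside its declared scope. It only sees calls that actually happen, so it is no substitute for the cheap, real-time reference view I am wishing for.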

Comments and function names should be indistinguishable from log statements, or at least toggle-able. My editor should allow me to toggle logging of function arguments whenever the function is called. I should never have to type log("in myFunction x=" + x + ", y=" + y), yet somehow I always do. It should be as simple as log(x, y), and this could be solved with something as simple as a macro that outputs the argument expressions as written, in addition to the values of those arguments.
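Some languages can already get close to this without editor support. Below is a minimal Python sketch of toggle-able argument logging, assuming a decorator is an acceptable stand-in for the editor switch. The names log_calls, LOGGING_ENABLED, and LOG_LINES are my own inventions for the example.

```python
import functools
import inspect

LOGGING_ENABLED = True  # hypothetical global toggle standing in for an editor switch
LOG_LINES = []          # collected in a list instead of printed, so it is easy to inspect


def log_calls(func):
    """Record every call to func along with its argument names and values."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if LOGGING_ENABLED:
            # Map the actual arguments onto the parameter names, defaults included.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            rendered = ", ".join(
                f"{name}={value!r}" for name, value in bound.arguments.items()
            )
            LOG_LINES.append(f"in {func.__name__} {rendered}")
        return func(*args, **kwargs)

    return wrapper


@log_calls
def my_function(x, y=2):
    return x + y
```

Calling my_function(1) appends "in my_function x=1, y=2" to LOG_LINES, and nobody had to type the names by hand. Inside a function body, Python 3.8 and later also accept f"{x=}, {y=}", which prints each expression next to its value and is about as close to the log(x, y) macro as I could ask for.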

Inline, implicit functions instead of code “paragraphs” which are merely lines of code separated by blank lines. These implicit functions would show which variables the block is actually touching as parameters to the function (which can be inferred). These new kinds of function-paragraphs could easily be pulled out into actual functions, with the function application automatically generated in that spot of the code.
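The inference part is not far-fetched. Here is a small Python sketch that takes a code "paragraph" and works out which names it reads from outside (the would-be parameters) and which names it assigns (the would-be return values). The function name paragraph_signature is hypothetical, and the sketch only handles straight-line code: loops, comprehensions, and nested functions would need real scope analysis.

```python
import ast
import builtins


def paragraph_signature(source):
    """Infer (inputs, outputs) for a block of straight-line code.

    inputs:  names read before the block itself assigns them
    outputs: names the block assigns
    """
    tree = ast.parse(source)
    assigned = set()
    inputs, outputs = [], []

    for statement in tree.body:
        reads, writes = [], []
        for node in ast.walk(statement):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Load):
                    reads.append(node.id)
                elif isinstance(node.ctx, ast.Store):
                    writes.append(node.id)
            elif isinstance(node, ast.AugAssign) and isinstance(node.target, ast.Name):
                # `x += 1` reads x as well as writing it.
                reads.append(node.target.id)

        # Reads are resolved before this statement's own writes take effect.
        for name in reads:
            if name not in assigned and name not in inputs and not hasattr(builtins, name):
                inputs.append(name)
        for name in writes:
            assigned.add(name)
            if name not in outputs:
                outputs.append(name)

    return inputs, outputs
```

Given the paragraph "subtotal = price * quantity", "tax = subtotal * tax_rate", "total = subtotal + tax", this reports price, quantity, and tax_rate as the parameters and subtotal, tax, and total as the results. That is the information an editor would need to draw the implicit function header, or to pull the paragraph out into a real function.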

Viewing code fully inlined, highlighting patterns. Long functions, as opposed to many short functions, are actually easier to read if you’re not familiar with the code. For one, there are fewer levels of indirection. But also, you can think of functions in a module as a mini domain-specific language. For domain experts, it is infinitely easier to use this terminology. However, when you’re new to the field, you can’t understand anyone because they’re all using what seems like cryptic lingo.

Before the year 2000, who in the world knew what a hanging chad was? But the few experts who coined the term must have found it useful. The first time you heard “hanging chad”, you probably asked, “What the hell is a hanging chad? Pregnant chad??” Likewise, a maintenance programmer should be able to view any function calls he pleases inlined, basically saying, “Instead of using this term, tell me what you mean in the standard terminology that I already know.” Of course, the editor won’t literally inline the code by copying and pasting; it only displays it that way. After all, code is just data. We can view the code as if it were inlined without throwing away the fact that there is a pattern of shared code. The editor could use something akin to the implicit function-paragraphs I described to keep the patterns already created by the original programmer, and at the same time, slowly teach the maintenance programmer the domain lingo.

Oftentimes when a few classes have commonality, they are refactored into an abstract base class containing the common code and several subclasses which implement or re-implement dynamically-dispatched methods. When you understand the flow of all the classes, this is a great way to factor out patterns and reuse code. However, when you’re unfamiliar with the code, it’s damn-near impossible to look at the flow to try to modify it. A method in the base class is possibly used by the derived classes, but not necessarily. Some might use it, but some might not. Others could use it but also add to it by overriding and explicitly calling the overridden method. There is no adequate tool that I know of that will show me the collapsed view of code for a derived class, to see inherited and overridden methods and all — the final result of all the abstraction and reuse. Firebug does this for inspecting CSS.

Firebug shows you the rules that cascade together to style each element. Rules are sorted in the order of precedence, and properties that have been overridden are stricken out. Each rule has a link back to the file where it came from which you can click to jump to the line.

Any web developer will tell you this is the most useful thing in the world; why not do this for methods in an OO programming language…? With a function, at least you can apply it to get its result in specific cases. With a macro, you can expand it. But with abstract classes — abstractions over classes — you’re stuck having to imagine it all yourself. Sure, you can instantiate the class and then call the methods, but you’d have to do that for each method. Depending on your language, sometimes it’s not even clear what all the methods of a given object are, let alone the source code that created them.
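In a language with decent introspection, a crude version of this collapsed view is not hard to build. Here is a Python sketch that walks a class's method resolution order and reports, for each method, which class's version wins and which versions it overrides, roughly like Firebug's struck-out properties. The function collapsed_view and the Shape and Square classes are invented for the example, and it only looks at plain functions, so properties, classmethods, and staticmethods are ignored.

```python
import inspect


def collapsed_view(cls):
    """Return rows of (method name, defining class, overridden classes) for cls.

    The defining class is the one whose version actually runs; the overridden
    classes are the ancestors whose versions it hides, nearest first.
    """
    # Walk the method resolution order, most-derived class first.
    mro = [klass for klass in inspect.getmro(cls) if klass is not object]
    names = sorted(
        {
            name
            for klass in mro
            for name, value in vars(klass).items()
            if inspect.isfunction(value)
        }
    )
    rows = []
    for name in names:
        definers = [
            klass.__name__ for klass in mro if inspect.isfunction(vars(klass).get(name))
        ]
        rows.append((name, definers[0], definers[1:]))
    return rows


# Hypothetical example hierarchy.
class Shape:
    def describe(self):
        return f"{self.name()} with area {self.area()}"

    def name(self):
        return "shape"

    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def name(self):
        return "square"

    def area(self):
        return self.side ** 2
```

For Square, the rows say that area and name come from Square and override Shape, while describe is inherited untouched from Shape. The link back to the file and line, the other half of what Firebug gives you, could come from inspect.getsourcefile and inspect.getsourcelines, assuming the source is available on disk.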

Currently, in order to reuse a piece of code, you have to pull it out (of context) and name it so that the 2 places you want to use it can refer to it (whether it’s a class, a function, a variable, etc.). But pulling something out and having 2 things be semantically linked are separate things. The fact that our tools do not allow us to do one without the other (when sometimes I really do only want one) is an indication of inadequate tools.

There is a general pattern of solving programming problems with an extra level of indirection. However, too much indirection creates more problems, especially when working with code you aren’t familiar with. Or put another way, when someone else works with code you wrote. It is a limitation of the human mind that we must take into consideration.

Programs are written by people, and they must also be read by people. For practical reasons, simplicity and clarity (which amount to human readability) are the first things to be sacrificed when the only benchmark for shipping code is whether it executes correctly. Rushed code is like a rushed essay: a reader can interpret it, but it is not necessarily well written.

It has been said that programs should be written primarily for people, and secondarily for computers to execute. The more I learn and the more I experience, the more I agree with this. And our tools should help this cause.
