Evaluation Example

General Introduction

The primary purpose of the review is to critically evaluate the quality of the work in question. For a class position paper, this means subjectively judging if a paper meets the criteria as outlined below. The secondary purpose of the review is to give the author feedback as to how to improve the paper. Thus, simply answering yes/no to the list of questions in the criteria is not sufficient; the reviewer must give reasons as to why she thinks the paper does, or does not, meet the criteria.

The review below is for paper #6, from the first set of position papers.

Overall Classification:

Very Good

Main Evaluation:

The Position:

· Is the position well defined?

· Is the issue one with genuine controversy and uncertainty?

· Is the issue narrow enough to be manageable for a class paper?

· Is the position quantifiable? That is, put in numerical terms, if possible?

The position is that so-called "Heisenbugs" do exist. The author first introduces Heisenbugs and Bohrbugs as distinct classes of bugs, using Jim Gray's definitions. The author then sets out the counter-position, which states that if "everything is the same" then, by definition, a bug should be repeatable, and thus a Bohrbug. The position is not quantifiable except in terms of existence, so the paper need only define Heisenbugs and show that they do, in fact, exist. The position is narrow enough for a class paper. Reams of supporting evidence or advanced experiments are not needed to support the position.

Countering the Opposition:

· Are the communities of people involved with the position (and their positions) identified?

· Are the opposing positions articulated? Are rebuttals given to the opposing positions?

No community is specifically identified. But the author does a good job articulating the opposing position in Section 4. The crux of countering the counter-argument is defining what has to be "the same" for a program to run the same way. For many programs, both the environment and the program state must be the same, the paper argues, in order for a program to behave the same way every time. The author's rebuttal then explains in detail why the environment can never be the "same" between two runs of a program, giving specific examples using disk drives, the network, and clock drift.
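The distinction between program state and environment can be illustrated with a small simulation. The following is a minimal sketch of my own, not code from the paper: the `Environment` class, the latency values, and the toy `run_program` function are all hypothetical. The environment is passed in explicitly so that the example itself is repeatable; in a real system these latencies would differ from run to run and could not be fixed by the programmer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    # Hypothetical latencies in milliseconds. On real hardware these
    # vary between runs and are outside the program's control.
    disk_latency_ms: float
    network_latency_ms: float


def run_program(program_input: str, env: Environment) -> str:
    """Toy program that handles a disk event and a network event.

    Deliberate bug: the network handler assumes the disk read has
    already completed and filled the buffer.
    """
    # Events are handled in order of arrival, which depends only on
    # the environment, not on the program input.
    events = sorted(
        [("disk", env.disk_latency_ms), ("network", env.network_latency_ms)],
        key=lambda event: event[1],
    )
    buffer = None
    for source, _latency in events:
        if source == "disk":
            buffer = program_input.upper()
        elif buffer is None:
            # The network event arrived before the disk read finished.
            return "CRASH: buffer used before disk read completed"
    return f"OK: {buffer}"


# Same input, two different (assumed) environments.
typical = Environment(disk_latency_ms=8.0, network_latency_ms=20.0)
rare = Environment(disk_latency_ms=25.0, network_latency_ms=20.0)

print(run_program("hello", typical))  # disk event arrives first
print(run_program("hello", rare))     # network event arrives first
```

With identical input, the program succeeds under the first environment and fails under the second. If both input and environment were held fixed, the failure would be perfectly repeatable, which is the skeptic's point; the paper's rebuttal is that the environment cannot be held fixed in practice.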

Evidence:

· What evidence is used to support the position?

· Evidence based on experimentation?

· General facts about the systems in question?

· Anecdotes only?

Specific examples of non-deterministic behavior are cited, including, as noted above, disk drives, clocks, and the network. The author then goes on to give specific examples of how these unpredictable components affect the rest of the system. Although the author does provide this linkage, I do not think the paper makes the case strongly enough. Clearly, if a program never has significant interaction with these uncertain processes, then it should never exhibit Heisenbug-like behavior. This might be one reason why certain communities strongly believe in Heisenbugs, since their programs interact with disks, clocks, and networks, while other communities, whose programs do not interact with these processes, are rather skeptical of the whole idea of Heisenbugs.

A second problem is that the author makes claims about the non-determinism of the disk, network and clocks without citing any previous works. While such claims seem self-evident, some citations measuring the quantities would strengthen these claims.

I don't think the entropy analogy in Section 5 is appropriate. Reasoning by analogy is fraught with peril, because the reasoning about the original system (in the case of entropy, heat exchange) might not apply to the analogous system (in this case, a computer). However, the author recovers by quantifying the analogy, claiming that entropy is a measure of the probability of being in a given state. This metric seems tractable to measure in some contexts. More importantly, from the Heisenbug perspective, one measure of how non-deterministic a program is could be the probability distribution over the states it can reach given the same input.
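As one possible way to make such a metric concrete, the sketch below computes the Shannon entropy of the outcomes observed over repeated runs with the same input. This is my own illustration under stated assumptions, not the paper's method: the function name `outcome_entropy` and the outcome labels are hypothetical, and the sample data is made up for the example rather than measured.

```python
import math
from collections import Counter


def outcome_entropy(observed_outcomes):
    """Shannon entropy, in bits, of outcomes seen across repeated runs.

    Each element of observed_outcomes is a label for the final state
    of one run given the same input (for example "ok" or "crash").
    An entropy of 0 means every run ended in the same state.
    """
    if not observed_outcomes:
        raise ValueError("need at least one observed run")
    counts = Counter(observed_outcomes)
    total = len(observed_outcomes)
    # Sum of p * log2(1/p) over the empirical outcome probabilities.
    return sum(
        (count / total) * math.log2(total / count)
        for count in counts.values()
    )


# Illustrative (made-up) observations, not data from the paper.
repeatable_runs = ["ok"] * 10
flaky_runs = ["ok", "ok", "ok", "crash"]

print(outcome_entropy(repeatable_runs))
print(outcome_entropy(flaky_runs))
```

A program whose runs always end in the same state scores zero, while a program that sometimes fails on the same input scores above zero. Whether the set of reachable states can be observed and labeled this cleanly in a real system is an open question that the paper would need to address.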

Organization:

· Is the paper logically organized?

· Is it easy to follow the position, counter-arguments, and evidence?

· Are there transitions between sections?

· Are a consistent writing style and tone used throughout?

· Is the vocabulary correct and conforming to standard practices?

· Are the grammar and spelling correct?

· Is a consistent tense used throughout?

· Finally, is the paper easy for the reader to follow and understand?

The paper is well organized. The author defines the position, gives some background on it, and then defines the counter-position. Next, the author introduces examples which counter the counter-position. Finally, the author concludes. I found the paper easy to read and follow.

Final Analysis:

Would a skeptic be convinced, or at least swayed, to the position in the paper? Why or why not?

I think a skeptic would be partially convinced, or at least convinced in the application areas where people care about these bugs. Based on this paper, I think the claim could still be made that programs can remain free from Heisenbugs if the program environment does not possess any non-deterministic behavior. For example, the Unix text tools such as grep, awk, and sed have no "non-deterministic" environmental inputs and so should not exhibit any Heisenbug behavior. However, for any programs influenced by these random processes, I think even a skeptic would have to concede the author's point about Heisenbugs.

Comments to the Author:

The paper is quite good overall. I think the paper could be made stronger by more forcefully connecting the ideas of (1) fundamental non-determinism, (2) program environment vs. program state, (3) program behavior, and (4) bugs. A diagram might be one way to do this.

The introduction might get to the point a little faster, perhaps by introducing the counter-argument right away.