Wednesday, June 3, 2009

Non-Deterministic Programs

Most (if not all) massively parallel computer programs are non-deterministic. Normally, this is due to the asynchronous nature of the parallel execution paths, but it is also caused by recursion or recurrence in the execution graph.

This brings me to a point of contention, not just a software constraint waiting for resources, but users waiting on a non-deterministic execution. Massively parallel programs (MPP) are normally breathtakingly fast. The challenge I'm facing is the better MPP "normally" performs, the shorter the user's expectation on the time it should perform. The answer is easily solvable (has an answer) for deterministic problems, but consists a long-tailed probability distribution for the more complex cases.

How long should a user expect to wait for a non-deterministic program to execute?

It helps to know this is not a new problem. See: Weinberg and Hoag, 1957 http://blogs.msdn.com/micahel/archive/2009/03/06/GeraldMWeinbergsFavoriteBug.aspx
The challenge remains however of estimating the probabilities of for discovering solutions at given some amount of time. This begs the question - What are solutions?

Turning the problem on itself, the first challenge is separating non-deterministic problems from their analytical cousins. Normally, simple matrix analysis (LU, Cholesky, SVD) helps categorize these problems. For non-deterministic problems, allow the user to specify limits to fractal or recursive depth, then calculate the probabilities within these limits.

Finally, the lesson from Weinberg is: Show that the program is indeed working, and making some progress toward the goal, not just locked in a loop or forever in contention. Since I don't recall open-ended progression indicators in my toolbox, there needs to be mileposts to measure the fitness of a solution. While these too are open-ended, let the user decide. A graphical / network-topological representation of the problem/solution space helps the user understand the nature of the problem. Seeing information about the problem/solution might help in recasting the problem in humanly understandible terms.