There are interesting ways of categorically mapping the structure, behavior and evolution (morphisms) of dynamical systems ala Steven Strogatz in "Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry and Engineering", by Perseus Publishing. This has been detailed by Category Theory (CT - Eilenberg, Mac Lane, Lawvere, among many more) and the use of functors - a programmatic analog affecting state evolutions.
Rather than the set-theoretic approach, with it's limitations of First Order Logic, CT provides a more accurate logical / analytical framework in which to understand complex, dynamical systems, and categorize (or classify) their patterns of change.
Curious. This particular technique seems grossly under-utilized in Complex Systems Theory, and in understanding complex, dynamical systems, their processes, growth and evolution. Furthermore, it seems that these known techniques are actually ignored.
If someone has a good explanation for this phenomenon, I'd sure like to hear it - or am I not looking for exemplars in the right places?
Sunday, November 1, 2009
Tuesday, August 25, 2009
Parallel Paths
First, for all my life I have been a champion of many educational issues. Education is far too important to just to be left in the hands of educators. It is not a commodity. By this I mean, education is both a community AND individual personal responsibility.
The processes of education runs in parallel with processes of training. These are very different functions, although these functions are often confused. Often, training and education are in conflict.
Which brings me to the point of this posting: In software engineering, we are trained to view software either from a serial perspective, or a parallel perspective, or maybe some combination of the two. It is obvious that we can run several serial tasks in parallel, and that parallel paths often consist of individual serial task lanes. Nice, neat, very deterministic.
We are taught to rollout loops and partition data for parallel processing - and that is good thing to learn. But I wonder, is the way we are trained to think about computer science and software engineering - in exclusive terms of serial and parallel fabrics - preventing us from understanding the value other computational fabrics? In other words, has our training overly constrained the purposes of our education?
Many of the systems I deal with exhibit a property called complexity. Often this is considered a "problem" - to be avoided or simplified, rather than considered offering an excellent solution strategy. From a computational point of view, complex networks offer extremely efficient encoding capabilities, and sometimes (but, not always) extremely efficient computational capabilities.
In putting together a proposal for a VLS compute cluster, I see how dissonance between training and education affects the conceptualization of computation on clusters of many highly trained professionals. Is this an educational opportunity? Or does training rule the day?
Stay tuned ...
The processes of education runs in parallel with processes of training. These are very different functions, although these functions are often confused. Often, training and education are in conflict.
Which brings me to the point of this posting: In software engineering, we are trained to view software either from a serial perspective, or a parallel perspective, or maybe some combination of the two. It is obvious that we can run several serial tasks in parallel, and that parallel paths often consist of individual serial task lanes. Nice, neat, very deterministic.
We are taught to rollout loops and partition data for parallel processing - and that is good thing to learn. But I wonder, is the way we are trained to think about computer science and software engineering - in exclusive terms of serial and parallel fabrics - preventing us from understanding the value other computational fabrics? In other words, has our training overly constrained the purposes of our education?
Many of the systems I deal with exhibit a property called complexity. Often this is considered a "problem" - to be avoided or simplified, rather than considered offering an excellent solution strategy. From a computational point of view, complex networks offer extremely efficient encoding capabilities, and sometimes (but, not always) extremely efficient computational capabilities.
In putting together a proposal for a VLS compute cluster, I see how dissonance between training and education affects the conceptualization of computation on clusters of many highly trained professionals. Is this an educational opportunity? Or does training rule the day?
Stay tuned ...
Monday, August 10, 2009
Models of Complexity
OK, I apologize for the implication that models are a paradigm of complexity.
The issue I've been discussing for the past several days with several colleagues is: "What do you mean by a model?" This stems from the fact that the term "model" means something different in science (esp. physics), math and engineering.
The reason for the problems stem from the fact that each domain has excellent reasons to adopting their unique definition. Each definition is right in some aspect, yet each is wrong. I have proposed a definition of a model extending from mathematical concepts. Everyone jumped on me for being "too theoretical". And here I thought I had some very practical, easy-to-use reasons for the math-centric foundation that allowed it to extend easily and map across the domains.
That's what I get for thinking ...
The issue I've been discussing for the past several days with several colleagues is: "What do you mean by a model?" This stems from the fact that the term "model" means something different in science (esp. physics), math and engineering.
The reason for the problems stem from the fact that each domain has excellent reasons to adopting their unique definition. Each definition is right in some aspect, yet each is wrong. I have proposed a definition of a model extending from mathematical concepts. Everyone jumped on me for being "too theoretical". And here I thought I had some very practical, easy-to-use reasons for the math-centric foundation that allowed it to extend easily and map across the domains.
That's what I get for thinking ...
Monday, July 13, 2009
Complexity is Hard Enough
... Don't make it harder by mis-understanding it.
Some colleagues and I have been discussing Bayesian reasoning to refine, and hopefully validate models of systems that exhibit complexity. This lead to the discussion of Taleb's "Black Swans", and ideas about how these may be generated, identified and simulated on a computer. I offer this link to Eliezer Yudkowsky's work that provides an excellent context for starting the discussion.
http://www.singinst.org/upload/cognitive-biases.pdf
It seems that one of the hallmarks of complexity is self-reference between meta-scales.
Some colleagues and I have been discussing Bayesian reasoning to refine, and hopefully validate models of systems that exhibit complexity. This lead to the discussion of Taleb's "Black Swans", and ideas about how these may be generated, identified and simulated on a computer. I offer this link to Eliezer Yudkowsky's work that provides an excellent context for starting the discussion.
http://www.singinst.org/upload/cognitive-biases.pdf
It seems that one of the hallmarks of complexity is self-reference between meta-scales.
Wednesday, June 3, 2009
Non-Deterministic Programs
Most (if not all) massively parallel computer programs are non-deterministic. Normally, this is due to the asynchronous nature of the parallel execution paths, but it is also caused by recursion or recurrence in the execution graph.
This brings me to a point of contention, not just a software constraint waiting for resources, but users waiting on a non-deterministic execution. Massively parallel programs (MPP) are normally breathtakingly fast. The challenge I'm facing is the better MPP "normally" performs, the shorter the user's expectation on the time it should perform. The answer is easily solvable (has an answer) for deterministic problems, but consists a long-tailed probability distribution for the more complex cases.
How long should a user expect to wait for a non-deterministic program to execute?
It helps to know this is not a new problem. See: Weinberg and Hoag, 1957 http://blogs.msdn.com/micahel/archive/2009/03/06/GeraldMWeinbergsFavoriteBug.aspx
The challenge remains however of estimating the probabilities of for discovering solutions at given some amount of time. This begs the question - What are solutions?
Turning the problem on itself, the first challenge is separating non-deterministic problems from their analytical cousins. Normally, simple matrix analysis (LU, Cholesky, SVD) helps categorize these problems. For non-deterministic problems, allow the user to specify limits to fractal or recursive depth, then calculate the probabilities within these limits.
Finally, the lesson from Weinberg is: Show that the program is indeed working, and making some progress toward the goal, not just locked in a loop or forever in contention. Since I don't recall open-ended progression indicators in my toolbox, there needs to be mileposts to measure the fitness of a solution. While these too are open-ended, let the user decide. A graphical / network-topological representation of the problem/solution space helps the user understand the nature of the problem. Seeing information about the problem/solution might help in recasting the problem in humanly understandible terms.
This brings me to a point of contention, not just a software constraint waiting for resources, but users waiting on a non-deterministic execution. Massively parallel programs (MPP) are normally breathtakingly fast. The challenge I'm facing is the better MPP "normally" performs, the shorter the user's expectation on the time it should perform. The answer is easily solvable (has an answer) for deterministic problems, but consists a long-tailed probability distribution for the more complex cases.
How long should a user expect to wait for a non-deterministic program to execute?
It helps to know this is not a new problem. See: Weinberg and Hoag, 1957 http://blogs.msdn.com/micahel/archive/2009/03/06/GeraldMWeinbergsFavoriteBug.aspx
The challenge remains however of estimating the probabilities of for discovering solutions at given some amount of time. This begs the question - What are solutions?
Turning the problem on itself, the first challenge is separating non-deterministic problems from their analytical cousins. Normally, simple matrix analysis (LU, Cholesky, SVD) helps categorize these problems. For non-deterministic problems, allow the user to specify limits to fractal or recursive depth, then calculate the probabilities within these limits.
Finally, the lesson from Weinberg is: Show that the program is indeed working, and making some progress toward the goal, not just locked in a loop or forever in contention. Since I don't recall open-ended progression indicators in my toolbox, there needs to be mileposts to measure the fitness of a solution. While these too are open-ended, let the user decide. A graphical / network-topological representation of the problem/solution space helps the user understand the nature of the problem. Seeing information about the problem/solution might help in recasting the problem in humanly understandible terms.
Monday, May 11, 2009
HyperNEAT for CUDA
It's time for me to give back to the Neuro-Evolution for Augmenting Topologies (NEAT) community. I am starting to port strict HyperNEAT to a stand-alone CUDA implementation. For those who don't know, CUDA is NVidia's Compute Unified Device Architecture.
Unless I hear otherwise, it will be a simple implementation for 1 multi-core CPU (of say 2 on-die cores) and 1 CUDA capable GPU (say a GTX 8800 or a GTX 295). Nothing too fancy, just enough to give insight into how I would implement HyperNEAT on CUDA. The way I'm currently doing this may be novel - I'm using a factory class to generate the actual CUDA code, and I probably should offer my experiment development GUI to help others understand this "magic smoke" a little better.
Unless I hear otherwise, it will be a simple implementation for 1 multi-core CPU (of say 2 on-die cores) and 1 CUDA capable GPU (say a GTX 8800 or a GTX 295). Nothing too fancy, just enough to give insight into how I would implement HyperNEAT on CUDA. The way I'm currently doing this may be novel - I'm using a factory class to generate the actual CUDA code, and I probably should offer my experiment development GUI to help others understand this "magic smoke" a little better.
Watt is a new species?
As evolutionary algorithms evolve (apparently a quasi-Lemarkian form of evolution where what you learn by using evolutionary algorithms end up affecting the evolutionary algorithm itself) they look less and less like their ancestors. They soon become a new species, not just the next generation of their parents.
(BTW, AIP's Inside Science News Service has a Mother's Day article on epigenetics that is interesting - but way beyond my pay grade.)
I just did a quick comparison of WattsNEAT with Jason Gauci's version of HyperNEAT. Did these two exemplars come from the same solar system, or even galaxy? Could have fooled me.
Here are the only observable common traits. They both go through cycles of complexification and pruning. The way they go through these cycles has transmogrified into something unrecognizable, unless you consider alleles a form of CPPN (which IMO they are).
I wonder if recursion, which is a scale form of recurrence, converts source code into their fractal representation and back while you're not looking, or is it just entropy, crossover and mutation? I ask because the crossover and mutation functions really look different now.
(BTW, AIP's Inside Science News Service has a Mother's Day article on epigenetics that is interesting - but way beyond my pay grade.)
I just did a quick comparison of WattsNEAT with Jason Gauci's version of HyperNEAT. Did these two exemplars come from the same solar system, or even galaxy? Could have fooled me.
Here are the only observable common traits. They both go through cycles of complexification and pruning. The way they go through these cycles has transmogrified into something unrecognizable, unless you consider alleles a form of CPPN (which IMO they are).
I wonder if recursion, which is a scale form of recurrence, converts source code into their fractal representation and back while you're not looking, or is it just entropy, crossover and mutation? I ask because the crossover and mutation functions really look different now.
Saturday, May 9, 2009
SwarmFest 2009, June 28 - 30 in Santa Fe
SwarmFest is the annual International conference for Agent Based Modeling and Simulation - IMO, a necessary skillset needed to understand complexity in systems. I just received notification that I have been accepted to present at this year's conference, and I am excited.
Why is this important to me?
My presentation details recent findings I've made regarding using a form of Swarm in which the schedule and rules for the agents are not hard-coded in the experiment. The our case, agents "discover" the rules and patterns based upon whatever pre-existing structure and connection it finds. This provides part of the solution space called a context. The discovery mechanism that searches the context for solutions is a topology and weight evolving artificial neural network, originally developed by Ken Stanley and Risto Miikkulainen at the University of Texas, called NEAT. (NEAT is actually the great-grandfather. WattsNEAT is our N-th generation implementation of HyperNEAT).
Now for the cool part. In order to configure the compute fabric for the many expriment contexts we might create in which to evolve solutions to problem experiments, we utilize WattsNEAT to evolve configurations of the compute fabric itself. This is our solution to the problem of partitioning data and structures for massively parallel computation.
If you have followed along so far, we have just used WattsNEAT to configure a compute fabric in which to effectively and efficiently run WattsNEAT in massively parallel compute fabrics. For those of you who have used Torque or SGE rolls in a Rocks cluster, the resulting advantage is an intelligent distribution and scheduling agent that configures the compute fabric according to the context (schedules, priorities, and component configurations it discovers at the time).
This is not as static as it may initially appear.
I'm moving to get the specifics (of which there are many) down and put (at least) into a provisional patent. After that, we will decide the bifurcation strategy between proprietary paths and open source.
Why is this important to me?
My presentation details recent findings I've made regarding using a form of Swarm in which the schedule and rules for the agents are not hard-coded in the experiment. The our case, agents "discover" the rules and patterns based upon whatever pre-existing structure and connection it finds. This provides part of the solution space called a context. The discovery mechanism that searches the context for solutions is a topology and weight evolving artificial neural network, originally developed by Ken Stanley and Risto Miikkulainen at the University of Texas, called NEAT. (NEAT is actually the great-grandfather. WattsNEAT is our N-th generation implementation of HyperNEAT).
Now for the cool part. In order to configure the compute fabric for the many expriment contexts we might create in which to evolve solutions to problem experiments, we utilize WattsNEAT to evolve configurations of the compute fabric itself. This is our solution to the problem of partitioning data and structures for massively parallel computation.
If you have followed along so far, we have just used WattsNEAT to configure a compute fabric in which to effectively and efficiently run WattsNEAT in massively parallel compute fabrics. For those of you who have used Torque or SGE rolls in a Rocks cluster, the resulting advantage is an intelligent distribution and scheduling agent that configures the compute fabric according to the context (schedules, priorities, and component configurations it discovers at the time).
This is not as static as it may initially appear.
I'm moving to get the specifics (of which there are many) down and put (at least) into a provisional patent. After that, we will decide the bifurcation strategy between proprietary paths and open source.
Thursday, April 30, 2009
NIMD - A new parallel compute formalism
We have been working with configurations for applications running on hybrid, heterogeneous compute clusters. Ours started out being a plain vanilla Rocks Cluster using CUDA Rolls.
The challenge in developing massively parallel computer applications centers around the way in which data and tasks are partitioned. Specifically, these partitioning decisions are closely coupled with both the internal and external message channels within and between the cluster components. We at Watt's Advanced Research Projects have found that an adaptive approach to the computational contexts works best for us. We have developed an "intelligent distributor" that can discover the context of the compute cluster - the schedules, priorities, resources, utilizations and configurations - using evolutionary neural networks to "reconfigure" the compute fabric and making efficient and effective use of the cluster's context.
We have termed this compute fabric NIMD (for networked instruction, multiple data). It differs from traditional MIMD in the fact the architecture is non-hierarchical, and more specifically can be recurrent. The additional complexity is not problematic, but provides an ensemble approach to solution space for the compute fabric. Thinking back, even hierarchical parallelism schemes are non-deterministic to some degree. We seem to make better use of that fact.
The challenge in developing massively parallel computer applications centers around the way in which data and tasks are partitioned. Specifically, these partitioning decisions are closely coupled with both the internal and external message channels within and between the cluster components. We at Watt's Advanced Research Projects have found that an adaptive approach to the computational contexts works best for us. We have developed an "intelligent distributor" that can discover the context of the compute cluster - the schedules, priorities, resources, utilizations and configurations - using evolutionary neural networks to "reconfigure" the compute fabric and making efficient and effective use of the cluster's context.
We have termed this compute fabric NIMD (for networked instruction, multiple data). It differs from traditional MIMD in the fact the architecture is non-hierarchical, and more specifically can be recurrent. The additional complexity is not problematic, but provides an ensemble approach to solution space for the compute fabric. Thinking back, even hierarchical parallelism schemes are non-deterministic to some degree. We seem to make better use of that fact.
Saturday, April 11, 2009
Mod / Sim on Rocks CUDA Cluster
We have been experimenting with modeling and simulation of complex system on our Rocks / CUDA cluster - GenCluster - in Albuquerque, NM USA.
The challenge is designing proper partitions for data and tasks to take advantange of the massively parallel processing - which it seems is a prime candidate for the evolutionary neural networks we run on the cluster. If you are running a Rocks / CUDA cluster, I'd like to hear from you, and perhaps we can share any non-confidential information on the simulation of systems complexity.
The challenge is designing proper partitions for data and tasks to take advantange of the massively parallel processing - which it seems is a prime candidate for the evolutionary neural networks we run on the cluster. If you are running a Rocks / CUDA cluster, I'd like to hear from you, and perhaps we can share any non-confidential information on the simulation of systems complexity.
Subscribe to:
Posts (Atom)