Hi Gang,
I'm going to do a longer Buttonwood_v5 post sometime soon, but wanted to respond in the meantime to some of this before I forget...
DC: check the help - Wave59 help - and look for "planet colors". There's a list there.
Regarding optimization/curve-fitting/etc, yes, that's definitely an issue with any mechanical system, and is one of the big challenges when building this sort of thing. We need to select various parameters, but we also need to be careful that in doing so we aren't curve-fitting the model to past data. I've seen models with only a couple parameters that were curve fit and blew up moving into the future, but I've also seen models with lots and lots of parameters that continued working for quite a long time. So there's no way to know ahead of time if your system will continue to perform. But that's no different from any other kind of trading either - no one knows how long their method will continue to perform, or whether the drawdown experienced in the future will be much larger than the one experienced in the past, etc. Uncertainty is part of this game, and we accept the risk of uncertainty in exchange for the opportunity to make profit.
My thoughts on how to deal with test sets, etc, are a little different from what you'll find in the public domain, and I spent a good deal of time ranting about that in MTS. The issue we have is that most of the information available about how to create trading systems isn't written by people who actually know how to create trading systems. There are no credentials you need to possess in this industry in order to write a book, aside from the willingness to write one. Because of that, I've come across system development material that is not only wrong, but completely dangerous.
For example, one well-known author back in my earlier days wrote a book about system development and spent a lot of time discussing MFE and MAE analysis (maximum favorable/adverse excursion), which is basically a plot of how far losing trades go in your favor before they turn around, as well as how far winning trades go against you before they turn profitable. The gist of what he was doing was to add to a losing position at the magic number where winners would typically go against him, to use that value to place stops, and to take profits at whatever level let him turn some losers into winners. In other words, slice and dice the data very fine, then use dollar-based stops and targets in order to fake out the basic statistics of the system. It made crappy systems look good, but it was complete curve-fitting, and I guarantee that if you add to losers in real-life trading, where the outcomes aren't known, you will eventually find the streak that works just right to blow the account out of the water. That author (not surprisingly) sold a suite of software to do MAE/MFE analysis at the time, so I get why he was into it. This is a good example of curve-fitting: trying to set your parameters (stops and targets) exactly right so that a system makes money when it wouldn't otherwise.
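For anyone who hasn't seen MFE/MAE before, there's nothing wrong with computing it as a diagnostic - the danger is in tuning your stops and targets to those exact historical numbers. The math itself is trivial. Here's a quick Python sketch - the function name and trade format are made up purely for illustration, not from any particular package:

```python
# Sketch: computing MFE/MAE for a single trade from its bar-by-bar
# price path. Names and trade format are illustrative only.

def excursions(entry_price, highs, lows, direction=1):
    """Return (MFE, MAE) in points for one trade.

    direction: +1 for a long trade, -1 for a short trade.
    highs/lows: per-bar highs and lows over the life of the trade.
    """
    if direction == 1:
        mfe = max(h - entry_price for h in highs)   # furthest move in our favor
        mae = max(entry_price - l for l in lows)    # furthest move against us
    else:
        mfe = max(entry_price - l for l in lows)
        mae = max(h - entry_price for h in highs)
    return max(mfe, 0.0), max(mae, 0.0)

# Example: long from 100, bars ranged up to 104 and down to 98.
mfe, mae = excursions(100.0, highs=[101, 104, 103], lows=[99, 98, 101])
print(mfe, mae)  # 4.0 2.0
```

Plotting those two numbers across all your trades is harmless. Placing your stop at exactly the historical MAE of your winners is where the curve-fitting starts.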
Anyway, this is just one example of many of why you need to be careful when reading books about this sort of thing, and to really spend some time thinking and doing your own testing to prove everything out. If you build 20 systems that all average down into losers, you will eventually learn that it's a REALLY bad idea, and you will know exactly why. But I wonder how many bright-eyed newbie traders got all excited about that book, didn't test the material properly, and then averaged themselves down out of their accounts. There's a picture of Paul Tudor Jones that I really like, where he's sitting at his desk and you can see a note on the wall that says "Losers average losers".
Coming to the idea of breaking data up into test/training/unseen sets, doing walk-forward optimizations, etc - that's all the same thing. Developers noticed that they were having problems with optimization, so they devised all these tricks to make sure that their optimizations worked "better" than just running them on the whole set. Then they went and wrote books and articles about it. But no matter how you chop your data set up, how many passes you make on it, or how many chunks you hold back, as long as you eventually use all the data to decide whether or not a system is good, you are curve-fitting in the same way as if you had just run your system report over the entire data set one time. The only difference is that you have made the process complicated enough to trick yourself into thinking that you aren't curve-fitting. A system built that way has no better chance of success than one built using just one pass through the entire data set, despite the fact that we've all been told that it's "better" to do it that way.
What is more important than splitting data up, etc, is to consider the following points:
1) How many trades do we have?
2) How many degrees of freedom does the model have?
3) How do the results look for the system when looked at in aggregate, rather than just at the "special" settings?
We want lots of trades (2000 is really good) rather than just a handful. We also want fewer degrees of freedom. Think of each parameter as a degree of freedom. We really don't want more than 10-12 of those. In this system, we've got natal dates, corrections to the natal dates, orbs, thresholds, and three average lengths. If you consider natal dates as one parameter, then that's about 8 degrees of freedom. So we're still OK, but we have to be careful not to add too many other moving parts or we're going to get into some trouble.
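If you want to automate that kind of gut check, it's nothing more than a couple of comparisons. This little Python sketch just encodes the rules of thumb above (2000 trades, 10-12 parameters) - there's nothing statistical about it, and the function name is my own:

```python
# Back-of-envelope check encoding the rules of thumb from the text:
# lots of trades, few degrees of freedom. Thresholds are rules of
# thumb, not statistics.

def sanity_check(num_trades, degrees_of_freedom):
    warnings = []
    if num_trades < 2000:
        warnings.append("fewer trades than we'd like")
    if degrees_of_freedom > 12:
        warnings.append("too many moving parts - curve-fit risk")
    return warnings or ["looks OK on trade count / parameter count"]

# This system: natal dates (counted as one), corrections, orbs,
# thresholds, three average lengths -> roughly 8 degrees of freedom.
print(sanity_check(2000, 8))
```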
Finally, and most importantly, if we vary the parameters around, how does that change profitability? Do we have a whole bunch of parameters that all make money, or do we just have a narrow band of parameters that make money? That's the kicker there. If you've got 1000 different parameter sets that you could choose from, but they ALL make money, then your system is most likely going to work. If you've only got 50 different possible combinations, but they all lose money except for one special setting, then that system is most likely going to blow up. The point of optimizing is to get a feel for the range of what is going to work, not to find the one perfect parameter setting. So for that reason, it's good to look at big spreads like orionsbelt did. We just need to make sure that we're doing it with the right mindset, and not just to find the number that makes this all look good.
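Here's a toy Python sketch of that "how many parameter sets make money?" idea. The backtest function is a stand-in that I made up - in real use it would run the actual system for each combination - and the grid values are arbitrary:

```python
# Toy sketch of the "how many parameter sets make money?" check.
# backtest() is a stand-in - in real use it would run the actual
# system and return net profit for one parameter combination.
from itertools import product

def backtest(orb, threshold, avg_len):
    # Stand-in: profitability falls off smoothly away from a central
    # region, rather than spiking at one magic setting.
    return 1000 - 40 * abs(orb - 5) - 30 * abs(threshold - 10) - 20 * abs(avg_len - 20)

grid = list(product(range(1, 10), range(5, 16), range(10, 31, 5)))
results = [backtest(*params) for params in grid]
profitable = sum(r > 0 for r in results)
print(f"{profitable}/{len(grid)} parameter sets profitable "
      f"({100.0 * profitable / len(grid):.0f}%)")
```

The number you care about is the profitable count as a fraction of the whole grid. In this toy setup every combination is profitable - a broad plateau - which is the picture you want to see from a real sweep, as opposed to one lonely profitable cell.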
I haven't looked at the results deeply enough yet to be able to comment on them, so I'll save that for later. I also apologize for the length of this post - it kind of ended up with a life of its own...
More later,
Earik