Created a bot for the 's
I'm letting it trade uninterrupted for a while to see how it gets on.
What's a sensible number of markets to enable meaningful evaluation of things like:
- Tuesday vs. Sunday
- Morning vs. Evening
- Newcastle vs. Doncaster
- Open Races vs. Handicaps
Cheers
How much data before eliminating markets?
I think it depends on the strategy and, to be honest, I don't think any of the variables you have listed are valid for filtering markets. I know a lot of people do filter on venue / race type, but unless you have a reason why a strategy should or shouldn't work there, you are just overfitting.
The issue is that it is so easy to look at the PnL and start filtering, and before you know it you have a "profitable" strategy. 99% of the time you are just overfitting.
An example I had recently: I was finding that different courses required slightly different parameters for optimum profit. After a lot of trial and error I worked out that it was actually down to another, course-specific variable. Factoring this in, I was able to continue betting on every course with the same parameters. With this all being a numbers game, the more you can turn over the better!
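To illustrate the overfitting point, here is a minimal sketch, not anyone's actual strategy. It assumes a bot with no edge at all (each market is a +1 or -1 coin flip) and 20 made-up courses, then "filters" down to the courses that happened to be profitable. The function names and figures are hypothetical.

```python
import random

def simulate_pnl(n_markets, n_courses, rng):
    """Return (course_id, pnl) pairs for a strategy with zero true edge."""
    return [(rng.randrange(n_courses), rng.choice((-1.0, 1.0)))
            for _ in range(n_markets)]

def pnl_by_course(results, n_courses):
    """Total PnL per course."""
    totals = [0.0] * n_courses
    for course, pnl in results:
        totals[course] += pnl
    return totals

rng = random.Random(42)
N_COURSES = 20  # hypothetical number of courses
in_sample = simulate_pnl(3000, N_COURSES, rng)
out_sample = simulate_pnl(3000, N_COURSES, rng)

# "Filter" by keeping only the courses that made money in sample.
in_totals = pnl_by_course(in_sample, N_COURSES)
keep = {c for c, total in enumerate(in_totals) if total > 0}

filtered_in = sum(pnl for c, pnl in in_sample if c in keep)
filtered_out = sum(pnl for c, pnl in out_sample if c in keep)

print(f"courses kept: {len(keep)} of {N_COURSES}")
print(f"in-sample PnL after filtering:  {filtered_in:+.0f}")
print(f"out-of-sample PnL, same filter: {filtered_out:+.0f}")
```

The filtered in-sample figure is positive by construction, because only the winning courses were kept. The out-of-sample figure is just more coin flips, so there is no reason to expect the "edge" to carry over.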
- ruthlessimon
- Posts: 2094
- Joined: Wed Mar 23, 2016 3:54 pm
The generic response will be 3+ months in sample, then 1 month (20-30%) out of sample. If that passes (which is a tough test to pass!), it goes to practice mode (a debatable step), then live with small stakes.
As an aside, I'm uncomfortable with discrete variables being used for optimisation, because I personally like seeing how "minor changes" affect profitability. We can't do that if we use days / race type / course (although arguably we can, if they can be logically grouped).
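A rough sketch of the in-sample / out-of-sample split described above, assuming each result is a dict with a "date" key (a made-up record layout) and that the split is made by date rather than at random, so the out-of-sample month is always the most recent data.

```python
from datetime import date, timedelta

def chronological_split(records, out_fraction=0.25):
    """Split records by date into (in_sample, out_of_sample).

    The most recent `out_fraction` of records is held back and must not
    be looked at while tuning parameters.
    """
    if not 0.0 < out_fraction < 1.0:
        raise ValueError("out_fraction must be between 0 and 1")
    ordered = sorted(records, key=lambda r: r["date"])
    cut = int(len(ordered) * (1.0 - out_fraction))
    return ordered[:cut], ordered[cut:]

# Hypothetical data: one result per day for 120 days (about 4 months).
start = date(2019, 1, 1)
records = [{"date": start + timedelta(days=i), "pnl": 0.0} for i in range(120)]

in_sample, out_sample = chronological_split(records, out_fraction=0.25)
print(len(in_sample), "in-sample records up to", in_sample[-1]["date"])
print(len(out_sample), "out-of-sample records from", out_sample[0]["date"])
```

With 3 months in and 1 month out, the held-back share is 25%, which sits inside the 20-30% range quoted.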
- ShaunWhite
- Posts: 9731
- Joined: Sat Sep 03, 2016 3:42 am
This is one of those questions where, if anyone here had a decent stats qualification, they could just give you a formula. I've seen them but I don't understand them.
It depends on the strike rate really. To explain what I mean, take an extreme case: if your bot backs 100/1 shots at 120/1, then sampling 300 markets won't tell you anything. If it's a coin flip, then 300 is a reasonable start. I'd guess a sample which includes 300 of the less likely outcome would be a fair number.
BUT... I have to concur with Linus and Simon about the dangers of overfitting. On the positive side, you're gathering data using real money (I guess), so it's as reliable as you can get, assuming you're not using microstakes and encountering rounding issues with commission etc.
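The "300 of the less likely outcome" guess can be turned into a quick calculation. This is only a sketch of that rule of thumb, not a proper statistical power calculation, and it assumes a true 100/1 shot wins 1 time in 101. The function name is made up.

```python
from fractions import Fraction
import math

def markets_needed(strike_rate, min_rare_outcomes=300):
    """Markets to sample so the rarer outcome is expected at least
    `min_rare_outcomes` times (rule of thumb only)."""
    if not 0 < strike_rate < 1:
        raise ValueError("strike_rate must be strictly between 0 and 1")
    rare = min(strike_rate, 1 - strike_rate)
    return math.ceil(min_rare_outcomes / rare)

# Fractions keep the arithmetic exact.
print("coin flip: ", markets_needed(Fraction(1, 2)))    # 300 wins and 300 losses
print("100/1 shot:", markets_needed(Fraction(1, 101)))  # assumes true odds of 100/1
```

On this reading a coin-flip strategy needs about 600 markets, while the 100/1 example needs tens of thousands, which is why 300 markets "won't tell you anything" there.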
LinusP wrote: Mon Feb 11, 2019 7:24 pm
I think it depends on the strategy and to be honest I don’t think any of those variables you have listed are valid for filtering markets. I know a lot of people do filter on venue / race types but unless you have a reason as to why a strategy shouldn’t / should work you are just over fitting.
The issue is that it is so easy to look at the pnl and start filtering, before you know it you have a profitable strategy, the issue is that 99% of the time you are just over fitting.
An example I had recently was that I was finding that different courses required slightly different parameters for optimum profit, after a lot of trial and error I worked out that it was actually to do with another variable (course specific) Factoring this in I was able to continue betting on every course with the same parameters, with this all being a numbers game the more you can turnover the better!
Your advice is appreciated, Linus. I can't currently think of any reason to filter out markets; however, does comparing courses not highlight room for optimisation, as you mention? Also, if I feel that tweaking a parameter may be beneficial, am I expected to run multiple 'beta' instances of BA in practice mode alongside my live account and compare, or is there some better way of going about things?
ruthlessimon wrote: Mon Feb 11, 2019 7:30 pm
The generic response will be (3mths+) in sample, then 1mth (20%/30%) out sample. If that passes (which is a tough test to pass!); then it goes practice mode (debatable step), then live small.
As a side, I'm uncomfortable with discrete variables being used for optimisation - because I personally like seeing how "minor changes" affect profitability. We can't do that if we use days/race type/course (although arguably we can, if they can be logically grouped)
Thank you. The automation is currently firing in around 30 markets / day if that helps.
For some backstory, I was sat on the loo today considering how the more you flip a coin, the closer your results will get to the true expectancy (0.5). I figure if your automation performs poorly in one market, this will become apparent as the number of markets traded rises. Or is that not relevant here...?
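The coin-flip intuition can be put in rough numbers. This sketch uses the textbook standard error of a proportion, sqrt(p(1-p)/n), and a normal approximation for a 95% band, and assumes a 50% strike rate and the 30 markets a day mentioned above. It says nothing about whether any particular strategy's results are independent from market to market.

```python
import math

def standard_error(p, n):
    """Standard error of an observed win rate after n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)

MARKETS_PER_DAY = 30  # figure quoted in the post above

for days in (1, 10, 30, 90):
    n = days * MARKETS_PER_DAY
    se = standard_error(0.5, n)
    # 1.96 standard errors is roughly a 95% band under a normal approximation.
    print(f"{days:>3} days, {n:>5} markets: win rate 50% +/- {1.96 * se:.1%}")
```

The band narrows with the square root of the sample size, so four times as many markets only halves the uncertainty.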
Mmmm - assuming a fair coin returning a purely random result, the expected proportion of heads over an infinite number of flips is generally accepted as being 50%.
However, that does not mean (say) 300 heads in a row won't happen. Presuming reversion to EV, and that a tail must therefore occur, is what the Gambler's Fallacy is all about: https://en.wikipedia.org/wiki/Gambler%27s_fallacy
Current thinking (e.g. Ole Peters et al.) is that these rare events do seem to occur more frequently than implied by the odds, and that means some people will be stuck on the downside of the event and unable to recover their position in their lifetime (e.g. the 1930s and 2008 crashes).
Same goes for trading: no matter what the stats say has happened in the past, the whole strategy may collapse with the next race if the strategy is based on selective random outcomes. If the strategy has a solid reason to work that you can define and understand, then you have a genuine edge and should clean up until things change and the edge disappears. For example, a starting position at track x wins more than the odds imply; the edge is wiped out if the rails get moved and the angles flattened, or the bookies wise up and shorten their odds (if they didn't know already!), etc.
If only it were as simple as it seems in the loo.
- ruthlessimon
- Posts: 2094
- Joined: Wed Mar 23, 2016 3:54 pm
This is why you should heed the following from Xitian, because you've just spotted why Peter's incorrect (or his position needs clarifying, cos I still don't get his viewpoint).
I can't think of anything worse than to realise, 6 months in, that a new variable is needed which can't be tested because we went live too early, and that the data is dirty/biased towards our original thought process, meaning it cannot be reused. Far better to have a malleable trading/backtesting sandbox with (ideally, although impractical/difficult) every variable.
xitian wrote: Tue Nov 20, 2018 1:15 pm
Writing a backtesting simulation system might take a couple months depending how much time you spend on it? Imagine how many years you could use it in future though, and how many ideas you can trial. Just make sure you keep some out of sample data, and make sure you know what assumptions you’re making when you backtest/simulate.
foxwood wrote: Mon Feb 11, 2019 9:27 pm
Mmmm - assuming a fair coin returning a purely random result, the EV of an infinite number of flips is generally accepted as being 50%
However, that does not mean (say) 300 heads in a row won't happen - presuming reversion to EV and that a tail must occur is what Gambler's Fallacy is all about https://en.wikipedia.org/wiki/Gambler%27s_fallacy
That's right. Interestingly though, if you've already flipped 100k times, such a streak would only shift the observed heads rate by about 0.15 percentage points.
It seems sensible to assume your results have incurred a high level of randomness initially, but surely you only need to go so far until you're probably close enough to start justifying changes. If things seem drastically wrong in a particular area, I imagine I'll adjust much sooner than if they only seem moderately wrong.
I think we can all agree it would be too soon to tweak things after 1 market, and ineffective to tweak things after 50k markets. Ultimately the chips will fall where they may but you must admit it's entertaining to wonder where might be optimal...
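For anyone wanting to check the 0.15% figure for a 300-heads streak after 100k flips, here is the arithmetic as a small sketch. It assumes the first 100,000 flips came out exactly 50/50, which is a simplification.

```python
from fractions import Fraction

def heads_rate_after_streak(prior_flips, prior_heads, streak):
    """Observed heads rate once `streak` extra heads are added."""
    return Fraction(prior_heads + streak, prior_flips + streak)

before = Fraction(50_000, 100_000)  # assumed: exactly half heads so far
after = heads_rate_after_streak(100_000, 50_000, 300)
shift = after - before

print(f"heads rate before streak: {float(before):.4%}")
print(f"heads rate after streak:  {float(after):.4%}")
print(f"shift: {float(shift) * 100:.3f} percentage points")
```

The same streak after only 1,000 flips would move the rate far more, which is the point about early results carrying a high level of randomness.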
- ruthlessimon
- Posts: 2094
- Joined: Wed Mar 23, 2016 3:54 pm
Debatable, because "exemplar markets" can yield edges we haven't even contemplated.
Take a 50-tick lay loss on a single race (trading the current strategy), only to realise in hindsight that the market had already drifted 50 ticks: "I wonder what would've happened, long term, if we had backed that pattern?"
-
- Posts: 3140
- Joined: Sun Jan 31, 2010 8:06 pm
eightbo wrote: Mon Feb 11, 2019 6:46 pm
Created a bot for the 's
I'm letting it trade uninterrupted for a while to see how it gets on.
What's a sensible number of markets to enable meaningful evaluation of things like:
- Tuesday vs. Sunday
- Morning vs. Evening
- Newcastle vs. Doncaster
- Open Races vs. Handicaps
Cheers
Seems to me you'll be backfitting your data no matter how many markets you let it run, with those types of variables.
ruthlessimon wrote: Mon Feb 11, 2019 11:09 pm
Debatable; because "exemplar markets" can yield edges we haven't even contemplated
A 50tick lay loss on a single race (trading the current strategy); only to realise in hindsight the market had already drifted 50ticks - "I wonder what would've happened, long term, if we had backed that pattern?"
That's too deep even for me.
p.s. @MemphisFlash I see you. Let's get down to business I know that youuuuuuuuuuuuuuu, you've got what I neeeeeeeeeeed
Last edited by eightbo on Mon Feb 11, 2019 11:53 pm, edited 1 time in total.
spreadbetting wrote: Mon Feb 11, 2019 11:36 pm
Seems to me you'll be backfitting your data no matter how many markets you let it run with those type of variables.
Hi there. Thank you for your input. LinusP has already mentioned this.
Kindly ignore these variables and replace them in your head with those you deem useful. At what point is it sensible to begin adjusting, iyo?
- ruthlessimon
- Posts: 2094
- Joined: Wed Mar 23, 2016 3:54 pm