Strategy Development: Modelling

LinusP · Sat Jan 06, 2018 10:57 am

ShaunWhite wrote: ↑
Sat Jan 06, 2018 2:17 am

LinusP wrote: ↑
Fri Jan 05, 2018 9:14 pm
Interesting, what does 2017 look like? viewtopic.php?f=54&t=13157&p=141407#p141407

Relevant article just posted by Buchdahl: https://www.pinnacle.com/en/betting-art ... 3vjy59y4q3
Are you trying to kill me? I'd planned an early night and you send me a link to data and something to read. Thanks a lot.

Edit....I've got a feeling that this subject might start to head in this direction ...

If we set about devising a betting system via data dredging until we find criteria that are profitable, we risk failing to establish causal explanations for what we find (Buchdahl)

You can sleep when you have your if statements making you money

A lot of long-time members (including Mr Webb) talk about using data to confirm an idea rather than looking for patterns: if you look hard enough, you will find one. Fooled by Randomness is a great book on this topic; I got a copy for £2 off Amazon.

Euler · Sat Jan 06, 2018 2:11 pm

Data mining can be statistical deja vu.

When I come up with an idea I will often get somebody else to look at the data and not tell them what they are looking for. That way I can arrive at the same solution by two methods.

But that is only the first step, as I then need to describe what is happening and why. Only then can I really start going for it in terms of a progressive staking plan on the strategy.

ruthlessimon · Sat Jan 06, 2018 3:02 pm

See that's a really interesting insight, why aren't there dedicated videos on BA for this type of stuff?

ruthlessimon · Sat Jan 06, 2018 4:23 pm

ShaunWhite wrote: ↑
Sat Jan 06, 2018 2:14 am
There's also the case of resource management; often the bots aren't mutually compatible or won't sit comfortably alongside manual trading

This is another great topic I'd love to see discussed more

Forbes had a great article about this which I'll try & find; but the general premise for ranking strategies was as follows. There are "3 trading pillars" that need to be accommodated:

1. Edge
2. Frequency
3. Scale

It's a perverse problem. Improving edge will lead to a lower frequency (for HFTs, vice versa). Jacking up the scale damages frequency as well as edge. Also, a deficiency in any pillar will lead to psychological issues.

This is why I get frustrated when "psychology" is blamed. It's getting blamed because one of those 3 pillars is lacking. Reading Trading in the Zone for a 4th time isn't going to help! Finding a mentor to discuss those 3 pillars will. It's psychotic behaviour to just accept poor trading. Learning to trade is learning how to balance those 3 pillars, & the psychology will fall naturally into place. Oops, bit of a tangent!

rinconpaul · Sat Jan 06, 2018 7:45 pm

I've wrestled with this "must be logical to work" idea for years. However, the very nature of the business model we're in is that it's self-perpetuating because it's "Not Logical". If racing/sports outcomes were logical, the industry would be shut down in an instant!

Take this recent post: viewtopic.php?p=141355#p141355

A horse called Black Mosheen started as favourite and ended up 6th rank in the betting. Logically it'd have no chance of a win, but guess what, it did win! That's not logical! But it happens every second of the day somewhere in the world.

I say forget logical, but thoroughly test the method you reckon is profitable a number of ways:
1/ Volume. Is the volume of the trades/bets many times more than the cycle of probability? E.g. my example is a price range averaging $2.50, so every 2.5 bets it should show a win and a loss. My data sample averages nearly 40 samples, 16x the cycle.
2/ Now this is where logic comes into play. Plot the profit/loss over the test period on a day-of-week basis. In my example Mondays broke even. Saturdays made the most profit, closely followed by Wednesdays and Sundays. That's logical, as that's the order of frequency (most races on a Sat, fewest on a Mon).
3/ Now plot implied probability vs actual probability and a trendline. Is the actual probability well below implied (for lay betting)?
4/ Lastly, and this is the gotcha! Put your results through a Monte Carlo simulation of 1000 replays of the series. See my post: viewtopic.php?p=140752#p140752

What you see as a winner might not appear as one after this final test.
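
A minimal sketch of checks 1, 3 and 4 above, in Python with only the standard library. The function names, the (price, won) input layout and the choice of resampling with replacement for the "replays" are assumptions for illustration; the linked post may run the Monte Carlo step differently.

```python
import random
from collections import defaultdict


def cycle_multiple(n_bets, avg_price):
    """Check 1: how many 'cycles of probability' the sample spans."""
    return n_bets / avg_price


def implied_vs_actual(bets, n_buckets=10):
    """Check 3: compare implied probability (1 / price) with the observed win rate.

    bets: iterable of (decimal_price, won) pairs (hypothetical layout).
    Returns {bucket_index: (mean_implied, actual_win_rate, count)}.
    """
    buckets = defaultdict(list)
    for price, won in bets:
        implied = 1.0 / price
        key = min(int(implied * n_buckets), n_buckets - 1)
        buckets[key].append((implied, 1 if won else 0))
    result = {}
    for key in sorted(buckets):
        rows = buckets[key]
        n = len(rows)
        result[key] = (sum(i for i, _ in rows) / n, sum(w for _, w in rows) / n, n)
    return result


def max_drawdown(pnl):
    """Largest peak-to-trough fall of the running total."""
    total = peak = worst = 0.0
    for x in pnl:
        total += x
        peak = max(peak, total)
        worst = max(worst, peak - total)
    return worst


def monte_carlo_replays(pnl, n_replays=1000, seed=0):
    """Check 4: replay the series by resampling per-bet P/L with replacement."""
    rng = random.Random(seed)
    finals, drawdowns = [], []
    for _ in range(n_replays):
        replay = rng.choices(pnl, k=len(pnl))
        finals.append(sum(replay))
        drawdowns.append(max_drawdown(replay))
    finals.sort()
    drawdowns.sort()
    return {
        "share_profitable": sum(f > 0 for f in finals) / n_replays,
        "median_final": finals[n_replays // 2],
        "p95_drawdown": drawdowns[int(0.95 * (n_replays - 1))],
    }
```

With the figures quoted above, cycle_multiple(40, 2.5) gives 16.0. If a large share of the replays finish in the red, or the 95th-percentile drawdown is bigger than the bank could stand, the original result may owe more to the order the bets happened to fall in than to any edge.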

ShaunWhite · Mon Jan 08, 2018 3:31 am

I've just caught up with this thread and I must say I'm really liking the quality and depth of the discussion. LinusP, Euler, Ruthless & rinconpaul, all thought-provoking stuff.

I'm just going to refer back to my posting showing the graph, and add some info. I'll share it with you for 2 reasons.

1. I can because I'm certain it's not profitable (gee thanks); the output is for 0% commission. I'm confident it's not even on the way to being profitable. It's also extremely basic, so much so I'm almost surprised nobody recognised it. Plus 2017 data is inconclusive to say the least! So help yourself, I'm not pursuing it. If I turn out to have shot myself in the foot with both barrels, so be it.

2. I hope it's a good example of how being irrational leads to being illogical and ultimately nowhere. Speculation about what I'm seeing would be interesting too, whether or not you have a database you can explore. With any luck you'll say you can't reproduce it, then I can accept my data is screwed and I can stop trying to explain it.

I'll try and be brief, not my speciality.

I got the SP 2013-16 data. 2017 was available soon and 5 years seemed enough.
I did some tidying, mins, maxes, blanks etc. Then a basic sanity check by backing every favourite to get this.

Untitled3.png

(Is this what other people see for their 13-16 SP data? If not then I've wasted several hours!)
I was slightly surprised by the size of the deviations, but with the odd run of wins or losses it wasn't far from what seemed reasonable. What did stand out though was that it appeared to be quite rhythmic, maybe because I used to do quite a bit of music on the pc and you eventually learn to hear(ish) what you're looking at. 200ms of vocal looks different to 200ms of strings.. I digress.
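
For anyone wanting to reproduce the sanity check, a minimal sketch in Python. The input layout (one (favourite SP, favourite won) pair per race, in date order) is an assumption, not the real column layout of the SP files, and it uses level stakes at 0% commission.

```python
def back_favourite_curve(races, stake=1.0):
    """Cumulative P/L from backing every favourite at SP with level stakes.

    races: iterable of (fav_sp, fav_won) pairs in chronological order
    (hypothetical layout; map your own columns onto it).
    """
    total = 0.0
    curve = []
    for sp, won in races:
        # A winning back bet returns stake * (price - 1); a loser costs the stake.
        total += stake * (sp - 1.0) if won else -stake
        curve.append(total)
    return curve


example = back_favourite_curve([(3.0, True), (2.0, False), (1.5, True)])
print(example)  # [2.0, 1.0, 1.5]
```

Plotting the returned list against race number should give the same kind of chart as the one attached below, if the data agree.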

To check the data further I thought I'd count the races per year and did so by putting some year markers on the chart.

Untitled4.png

Now that is surprising, not just a regular rhythm but also a symmetry within each 'bar'. A bit like noise cancelling, but backwards, odd. So let's flip the second half of each loop by laying from July to December instead.
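
The flip can be sketched like this, assuming each race comes as a (date, favourite SP, favourite won) tuple, which is a hypothetical layout. At 0% commission, laying to the same backer's stake is simply the mirror image of the back bet, so the second half of the year just has its sign reversed.

```python
from datetime import date


def seasonal_pnl(race_date, sp, won, stake=1.0):
    """Back the favourite Jan-Jun, lay it Jul-Dec, at SP and 0% commission."""
    back = stake * (sp - 1.0) if won else -stake
    # Laying to the same backer's stake is the exact opposite of backing.
    return back if race_date.month <= 6 else -back


def seasonal_curve(races, stake=1.0):
    """races: iterable of (race_date, fav_sp, fav_won) in chronological order."""
    total = 0.0
    curve = []
    for race_date, sp, won in races:
        total += seasonal_pnl(race_date, sp, won, stake)
        curve.append(total)
    return curve


example = seasonal_curve([
    (date(2013, 3, 1), 3.0, True),   # backed, won: +2
    (date(2013, 8, 1), 3.0, True),   # laid, won: -2
    (date(2013, 9, 1), 2.0, False),  # laid, lost: +1
])
print(example)  # [2.0, 0.0, 1.0]
```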

And here we are, where I started.

Untitled5.png

ShaunWhite · Mon Jan 08, 2018 3:39 am

I can now add in 2017

Untitled6.png

You can now see why I'm less keen on putting my shirt on this. It's running out of steam in 2017, and in 2018, who knows. An upturn on a totally irrational, illogical 2.5yr cycle, or a continuing downward trajectory? It's in the tea leaves.

I've obviously tried reversing the strategy from 2016 onwards and, fyi, if you're on anything bigger than 2% commission then forget it. If you're on 2% then you can do your own homework, but I don't think you'll be impressed with 2017.
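
A rough way to do that homework, as a sketch only. It assumes one bet per market, so that charging commission on each winning bet matches commission on net market winnings; with several bets in a market the real charge would differ. The function names are made up for illustration.

```python
def net_after_commission(pnls, rate):
    """Total P/L when every winning bet pays `rate` commission on its profit."""
    return sum(p * (1.0 - rate) if p > 0 else p for p in pnls)


def break_even_commission(pnls):
    """Commission rate at which the series nets to zero.

    Returns None if there are no winning bets. A negative result means the
    series loses money even at 0% commission.
    """
    wins = sum(p for p in pnls if p > 0)
    losses = -sum(p for p in pnls if p < 0)
    if wins == 0:
        return None
    return 1.0 - losses / wins
```

Feed it the per-bet P/L at 0% commission. If break_even_commission comes back at around 0.02, anything above a 2% rate turns the curve negative, which is the situation described above.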

So, that's the story of a worthless strategy...BUT regardless of it being a winner, something clearly happened at the end of 2015, what happened?

I've got theories. (Conspiracy theorists here's your chance, and for the gurus, you must have seen this, what happened?)

1. A bot emerged at the end of '15, and in 2017 something else has emerged and it's been nullified. Probably not.

2. Something changed in the BHA rules and it affected 'something' (BHA online changes list only go back a year so can't even begin to check). Probably not.

As I write this I've thought of another to check....

yep 3. Number 3 ...You'll like this one, not a lot, but you'll like it. I hope this might even cause a little ripple (verbal rather than of applause). Drum roll....

Wikipedia : Betfair.... was listed on the London Stock Exchange as Betfair Group plc, until it merged with Paddy Power to form Paddy Power Betfair on 2 February 2016.

and no doubt with some influence before 2 Feb 2016?

Ladies and gentlemen of the jury I rest my case.
But can someone please explain what's actually happening? I don't understand it enough. Is this all known and I've only just caught up with the plot and it's old news? And let's not forget this is a ridiculous, illogical strategy: back Jan-Jun, lay Jul-Dec. How is that a window into anything going on at Betfair!

ShaunWhite · Mon Jan 08, 2018 4:40 am

Linus,
I looked at the above and considered the number of monkeys. There are legions of monkeys, which makes this pattern highly probable, but it was the first monkey that did it, and that seems highly improbable too.
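
The monkey argument can be put into numbers with a small simulation: give a crowd of strategies no edge at all and see how good the luckiest one looks. This is a sketch under simple assumptions (fair even-money bets, unit stakes, independent outcomes), not a model of the SP data.

```python
import random


def fair_bet_pnl(rng, price=2.0):
    """One unit-stake back bet at a fair price: zero expected profit."""
    return price - 1.0 if rng.random() < 1.0 / price else -1.0


def best_of_random_strategies(n_strategies=1000, n_bets=500, seed=1):
    """Run many zero-edge 'strategies' and report the best and the average total."""
    rng = random.Random(seed)
    totals = [
        sum(fair_bet_pnl(rng) for _ in range(n_bets))
        for _ in range(n_strategies)
    ]
    return max(totals), sum(totals) / n_strategies


best, average = best_of_random_strategies()
print(f"best monkey: {best:+.1f} units, average monkey: {average:+.2f} units")
```

The average sits near zero while the best monkey shows a healthy profit, which is why a pattern found after trying many ideas needs far more scepticism than one found on the first attempt.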

LinusP · Mon Jan 08, 2018 6:38 am

From what I can see it looks as though you are seeing mean reversion but by chance it happened to match the timeframe of a year for a few years. This is something that can be taken advantage of most of the time but I am not sure in this example.

rinconpaul · Mon Jan 08, 2018 6:51 am

If we're talking modelling here, I'll say this,"Never base a strategy or backtest on any freely available published data source!" E.g. Betfair SP files. I wasted years doing it. They're full of errors and of no real use live, as you can't ever predict SP. You can react to it if you want to go in play, that's all.

Harvest your own live data and build a database to model. I harvest the last "?" Minutes betting at "?" Second frequency (Back, Lay, LTP & Vol). When you get 6 months worth, you'll see a whole lot of mysterious things? Illogical things! You need to be an illogical thing purveyor and prey on punter/trader errors. Forget old school wisdoms and be contrarian. Be prepared to Dutch and bookmake multi runners in the same race.

ShaunWhite · Mon Jan 08, 2018 1:39 pm

rinconpaul wrote: ↑
Mon Jan 08, 2018 6:51 am
be contrarian.

In daily life that's my middle name, or "trust you to always think the *!$£* opposite and be difficult" according to my missus.
I just need to learn what it means wrt strategies, because I'm also cripplingly logical, which is a big hindrance.

Not sure if I need time with a statistician or a psychiatrist. Don't answer that one

ShaunWhite · Mon Jan 08, 2018 1:41 pm

LinusP wrote: ↑
Mon Jan 08, 2018 6:38 am
From what I can see it looks as though you are seeing mean reversion but by chance it happened to match the timeframe of a year for a few years. This is something that can be taken advantage of most of the time but I am not sure in this example.

That's pretty much what I woke up thinking.

Anna List · Mon Jan 08, 2018 2:19 pm

Someone mentioned Mean Reversion.

I was really into that a few years ago. I thought that it was the answer to all of my prayers or, at least, some of them. Of course, it wasn't.

No disrespect to anyone intended but there are real practical issues involved from reacting to the above charts. I'm sure most of you understand this but those with less experience may not.

From these charts, it looks a simple task to determine when to switch the system on and off or swap from backing to laying and vice versa.

Trust me, it's not.

Let's suppose that we switch the backing system on after the low point has been reached and switch it off or switch to laying after the high point has been reached. Sounds soooo simple, doesn't it?

In reality, it isn't because, right now, we are looking at past data which has all been charted. We can see what we should have done and when we should have done it.

At the time, this would have been far less clear because the chart would have been a developing entity, not a completed one. At the macro level and after the event, trends are obvious and therefore can only be reacted to retrospectively. At the micro level and at the time of the event, trends develop only slowly and, it is only after a trend development that one comes to realise what one should have done. Sadly, at the time of the event, trends are not yet developed and therefore how one should react is problematic.

Here's an example. It looks as if an upward trend is developing. Sadly, time has already been lost and bets not made because we were awaiting the development of a trend. So, we begin backing. Ooops, we start to lose. The trend has started to reverse. So, do we stay in and continue backing or do we cease?

If we stay in and continue backing, the downward trend could continue and we could lose even more. On the other hand, the trend could re-reverse and we may start winning - but which?

If we get out, the trend could reverse again and we would have won had we not stopped betting. On the other hand, the downward trend could continue. If we continue backing, we could continue to lose. On the other hand, we could stop backing and save ourselves a fortune. But, which is it to be?

When I experienced all this, I too decided that this is not the way forward.
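
The dilemma above can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a test of the charts in this thread: outcomes are independent even-money bets with no edge, and the switching rule (back when the trailing window is up, lay when it is down) is a made-up example of reacting to an apparent trend.

```python
import random


def trend_following_pnl(outcomes, window=50):
    """Back the next bet if the trailing window is in profit, otherwise lay it.

    outcomes: per-bet back P/L at unit stake. At 0% commission the lay P/L
    is the negative of the back P/L.
    """
    total = 0.0
    for i in range(window, len(outcomes)):
        trailing = sum(outcomes[i - window:i])
        total += outcomes[i] if trailing > 0 else -outcomes[i]
    return total


def average_per_bet(n_runs=100, n_bets=1000, window=50, seed=2):
    """Average P/L per bet of the switching rule over many random series."""
    rng = random.Random(seed)
    per_bet = []
    for _ in range(n_runs):
        outcomes = [1.0 if rng.random() < 0.5 else -1.0 for _ in range(n_bets)]
        per_bet.append(trend_following_pnl(outcomes, window) / (n_bets - window))
    return sum(per_bet) / n_runs


print(f"average P/L per bet: {average_per_bet():+.4f}")
```

Each simulated chart shows plenty of "obvious" runs in hindsight, yet the rule averages roughly nothing per bet, because on independent outcomes the trailing window says nothing about the next result.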

Euler · Mon Jan 08, 2018 3:47 pm

I always have to come up with a reason why it would work, then when it's deploying I immediately use that assumption to test the opposite. If I get opposing results then I feel much more confident that I'm in the right place. So it's the process of trying to disprove a theory that proves it.

Atho55 · Mon Jan 08, 2018 4:11 pm

It may just be me but I find it helps to project my past stat findings forward then attempt to recreate the rule in Guardian to see if the implementation (Practice mode) matches the stats. It soon becomes clear how it is doing.

Fwd Chart.png
