Strategy Development: Modelling

northbound
Posts: 737
Joined: Mon Mar 20, 2017 11:22 pm

LinusP wrote:
Sat Oct 28, 2017 9:00 am
After 20/30 markets, get bored of backtesting; if it's in profit or hovering around £0, push to UAT
Does it mean that you backtest your strategy on 20/30 markets only?

Or do you have an algorithm that, after setting your strategy’s signals, can backtest thousands of markets in a couple of seconds?
mjmorris335
Posts: 180
Joined: Mon Jun 06, 2016 11:29 am

LinusP wrote:
Sat Oct 28, 2017 9:00 am

If people are interested I can provide some data, i.e. last week's racing, and go through the process of setting up a database, loading, querying etc.
This would be hugely beneficial and very gratefully received.

I did have an opportunity to work with SQL on an old VAX system 25 years ago. I remember it being a royal pain in the arse setting up the tables but incredibly powerful when querying the database. No doubt things have marched on apace by now.

Mike
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

Megarain, yep, backtesting is the hard part and can be very heavy on CPU, especially when using high-level languages like Python.

Northbound, at present it takes around an hour for me to backtest 30 racing markets end to end when not running concurrently (serverless reduces this to 2ish minutes). But as mentioned by Peter, any results should not be taken as golden and more as a guide in potential. To be honest, backtesting is more of an integration test for me so that I can see expected order count and if the strategy is going to do anything silly.

However, having data gives you a starting point so that you can implement something and start iterating; making profit on version 1 isn’t going to happen.

Might see if there are any companies still around where I can load a market and go through the basics in SQL; Mode Analytics used to do this, I think. As you guys are right, the time-consuming part is data quality and loading. Is SQL Express really 10gb? I thought it was 2gb?
northbound
Posts: 737
Joined: Mon Mar 20, 2017 11:22 pm

LinusP wrote:
Sat Oct 28, 2017 1:29 pm
But as mentioned by Peter any results should not be taken as golden and more as a guide in potential
Agree
megarain
Posts: 2041
Joined: Thu May 16, 2013 1:26 pm
Contact:

Might see if there are any companies still around where I can load a market and go through the basics in SQL; Mode Analytics used to do this, I think. As you guys are right, the time-consuming part is data quality and loading. Is SQL Express really 10gb? I thought it was 2gb?
There are different flavours of SQL, with different costs etc. The free version of Microsoft SQL Server 2014 (Express) has a 10gb database size limit.

2005/8 express was 4gb.
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

At the moment I've got my data in an SQLite db. Free and easy to use, and it uses SQL too.

Using peewee with it to map the tables and items to objects in Python so I don't have to fuss with writing SQL.
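
For anyone who wants to see the set up / load / query steps in one place, here is a minimal sketch using Python's built-in sqlite3 module (not peewee, to keep it dependency-free). The table layout, column names and prices are all made up for illustration; real Betfair data will look different.

```python
import sqlite3

# In-memory database for the demo; pass a file path instead to keep the data.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per runner per price snapshot.
conn.execute(
    """
    CREATE TABLE prices (
        market_id      TEXT    NOT NULL,
        runner         TEXT    NOT NULL,
        seconds_to_off INTEGER NOT NULL,
        last_traded    REAL    NOT NULL
    )
    """
)

# Made-up sample rows standing in for a loaded market.
rows = [
    ("1.100", "Horse A", 600, 4.2),
    ("1.100", "Horse A", 60, 3.6),
    ("1.100", "Horse B", 600, 2.5),
    ("1.100", "Horse B", 60, 2.9),
]
conn.executemany("INSERT INTO prices VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Compare each runner's price 10 minutes out with its price 1 minute out.
query = """
    SELECT early.runner, early.last_traded, late.last_traded
    FROM prices AS early
    JOIN prices AS late
      ON late.market_id = early.market_id
     AND late.runner = early.runner
    WHERE early.seconds_to_off = 600
      AND late.seconds_to_off = 60
    ORDER BY early.runner
"""
moves = conn.execute(query).fetchall()
for runner, early_price, late_price in moves:
    print(runner, early_price, late_price)
```

An ORM like peewee would wrap the same table in a Python class, so the join above becomes method calls rather than a SQL string.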
bigfev
Posts: 5
Joined: Tue Sep 05, 2017 3:35 am

Hi guys

A very interesting thread. Personally I am newish to trading and have not yet developed any real knowledge of modelling. I am willing to have a crack at it but as a real novice when it comes to IT language I wouldn't mind a bit of help with the below terms so I can better understand what you are all discussing...

1) What is a Python?
2) What is a SQL/VAX?
3) When you say back testing will use high CPU does that mean you need a very fast computer with heaps of memory?
4) When you talk about backtesting are you re-watching the market as if it was in-play or are you just going off historical figures provided by betfair/external source?
5) When talking about creating algorithms do you mean using excel to create formulas?

Thanks!
EyePeaSea
Posts: 258
Joined: Sun Jun 12, 2011 11:18 am

bigfev wrote:
Wed Nov 01, 2017 11:34 am
1) What is a Python?
Python is a programming language that several people on here use to create models/bots (outside of BetAngel). It's a high-level, general-purpose language that's well known for being logical and readable, though it isn't the fastest for heavy number crunching. https://en.wikipedia.org/wiki/Python_(p ... _language)
bigfev wrote:
Wed Nov 01, 2017 11:34 am
2) What is a SQL/VAX?
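SQL (Structured Query Language) is the standard language for querying relational databases, and the name is often used loosely for the databases themselves. VAX is an old family of DEC computers. Common SQL databases are Microsoft SQL Server, Oracle and then some free ones like MySQL, SQLite, etc.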
A Database is a great place to store data so that you can play around with it. You can do the same thing in Excel, for small amounts of data, but when you need speed and flexibility, then people generally go for a Database.
bigfev wrote:
Wed Nov 01, 2017 11:34 am
3) When you say back testing will use high CPU does that mean you need a very fast computer with heaps of memory?
It depends on what you are doing. I backtest against 14,000 races with 580 different 'WhatIf' permutations - and that takes a lot of computer power! :D
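
To put a number on that: 14,000 races × 580 permutations is 8,120,000 individual backtest runs. A rough sketch of how a 'WhatIf' grid multiplies up is below; the parameter names and values are invented for illustration and are not the actual 580 permutations.

```python
from itertools import product

# Hypothetical strategy parameters to vary (illustrative values only).
entry_odds = [2.0, 3.0, 4.0, 5.0]
exit_ticks = [2, 4, 6]
stakes = [2, 5]

# Every combination of the three parameters is one 'WhatIf' permutation.
grid = list(product(entry_odds, exit_ticks, stakes))

n_races = 14_000
runs_small_grid = len(grid) * n_races
runs_quoted = 580 * n_races  # the figures quoted in the post

print(len(grid), runs_small_grid, runs_quoted)
```

Each extra parameter multiplies the total, which is why the run count (and the CPU time) grows so quickly.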
bigfev wrote:
Wed Nov 01, 2017 11:34 am
4) When you talk about backtesting are you re-watching the market as if it was in-play or are you just going off historical figures provided by betfair/external source?
Both.
Backtesting is a term normally used to describe taking a set of data (e.g. 1000 races) and seeing what the results would be if you'd done something different (basically, "What if" scenarios). Used a lot for BOTs (automated programs that do the betting/trading).
Some people do re-run race videos (or just look at the raw data as it changes over time) to hone their skills - "Should I have seen this / Should I have done that". That's for people doing live trading rather than using a BOT.
bigfev wrote:
Wed Nov 01, 2017 11:34 am
5) When talking about creating algorithms do you mean using excel to create formulas?
I think that this is more meant to describe an automated strategy, i.e. "how" people trade. A rule might be "Back horses > 3/1 and Lay if horse odds fall to < 2/1". That can be done in Excel VBA, another language such as Python or the built-in capabilities of BetAngel (there are some real experts on this, like Dallas). It can be done in Excel formulas, but for anything that gets quite clever/complicated, using a programming language rather than just formulas is the way that a lot of people go.
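
As a rough illustration, the example rule above could look something like this in Python. It is only a sketch: the function names and the price series are made up, and it ignores stakes, commission and whether the bets would actually be matched.

```python
from fractions import Fraction


def fractional_to_decimal(odds):
    """Convert fractional odds such as '3/1' to decimal odds (4.0)."""
    numerator, denominator = odds.split("/")
    return float(Fraction(int(numerator), int(denominator)) + 1)


BACK_ABOVE = fractional_to_decimal("3/1")  # 4.0 in decimal odds
LAY_BELOW = fractional_to_decimal("2/1")   # 3.0 in decimal odds


def decide(price, has_open_back):
    """Apply the rule to one price update and return an action."""
    if not has_open_back and price > BACK_ABOVE:
        return "BACK"
    if has_open_back and price < LAY_BELOW:
        return "LAY"
    return "WAIT"


def run(prices):
    """Walk through a price series in order, tracking the open position."""
    has_open_back = False
    actions = []
    for price in prices:
        action = decide(price, has_open_back)
        if action == "BACK":
            has_open_back = True
        elif action == "LAY":
            has_open_back = False
        actions.append(action)
    return actions


# Made-up decimal prices for a single runner.
actions = run([3.5, 4.4, 3.8, 3.2, 2.9, 4.5])
print(actions)
```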


HTH

Ian
ShaunWhite
Posts: 9731
Joined: Sat Sep 03, 2016 3:42 am

This thread seemed an appropriate place to post this.

I was backtesting something on the 2013-2016 data and ended up with the results below.

You can see from that graph why I was questioning the 2016 data on the other thread now !! :evil: WTF happened in '16 to wreck 3 solid years? (mostly a rhetorical question).

You'd think that if something was performing for 37,000 races you'd cracked it. It always makes me smile when I hear new guys talking about running automation 'all day' and then bursting into tears or getting the exotic holiday brochures out. They should look at the duration of some of the dips in the graph below even when it was performing well.

Really hope we find the '17 data appearing somewhere soon, anyone seen it? 2018 data would be even better ;)
[Attachment: backtest results graph, not available]
Derek27
Posts: 23477
Joined: Wed Aug 30, 2017 11:44 am
Location: UK

More precisely, what happened in October 2015 ?

That seems to be the point where the graph reverses, and it's only going one way.
ShaunWhite
Posts: 9731
Joined: Sat Sep 03, 2016 3:42 am

Derek27 wrote:
Fri Jan 05, 2018 8:14 pm
More precisely, what happened in October 2015 ?

That seems to be the point where the graph reverses, and it's only going one way.
That bit is only as bad as the end of '14, and the end of '13 to some extent. It's a seasonal strategy, with flaws.
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

ShaunWhite wrote:
Fri Jan 05, 2018 7:40 pm
This thread seemed an appropriate place to post this.

I was backtesting something on the 2013-2016 data and ended up with the results below.

You can see from that graph why I was questioning the 2016 data on the other thread now !! :evil: WTF happened in '16 to wreck 3 solid years? (mostly a rhetorical question).

You'd think that if something was performing for 37,000 races you'd cracked it. It always makes me smile when I hear new guys talking about running automation 'all day' and then bursting into tears or getting the exotic holiday brochures out. They should look at the duration of some of the dips in the graph below even when it was performing well.

Really hope we find the '17 data appearing somewhere soon, anyone seen it? 2018 data would be even better ;)
Interesting, what does 2017 look like?

viewtopic.php?f=54&t=13157&p=141407#p141407

When I see this when backtesting it is normally due to a hidden lookahead bias that creeps in and is slowly removed when time gets closer to real time.
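
A toy example of how lookahead bias can sneak in, assuming a made-up signal of "price is below its average". The biased version averages over the whole series, including prices that had not happened yet at decision time; the honest version only uses prices seen so far. The function names and prices are illustrative only.

```python
def signal_with_lookahead(prices):
    """Biased: the average includes prices from the future."""
    average = sum(prices) / len(prices)
    return [price < average for price in prices]


def signal_without_lookahead(prices):
    """Honest: each decision only uses prices seen before that point."""
    signals = []
    for i, price in enumerate(prices):
        past = prices[:i]
        if not past:
            signals.append(False)  # nothing to compare against yet
        else:
            signals.append(price < sum(past) / len(past))
    return signals


# Made-up steadily shortening price series.
prices = [5.0, 4.8, 4.6, 4.0, 3.5, 3.0]

biased = signal_with_lookahead(prices)
honest = signal_without_lookahead(prices)
print(biased)
print(honest)
```

The two lists disagree on the early prices, so a backtest built on the first version is trading on information it could not have had live.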

Edit:

Relevant article just posted by Buchdahl:

https://www.pinnacle.com/en/betting-art ... 3vjy59y4q3
Last edited by LinusP on Fri Jan 05, 2018 9:47 pm, edited 1 time in total.
ruthlessimon
Posts: 2094
Joined: Wed Mar 23, 2016 3:54 pm

ShaunWhite wrote:
Fri Jan 05, 2018 7:40 pm
This thread seemed an appropriate place to post this.
It's a brilliant post & highlights a couple of really interesting points.

Has the edge actually become an anti-edge? My initial thought when seeing the graph was that if the edge had failed, performance should pretty much flatline(ish). Imagine a new trader begins trading in 2016 with your exact idea - but reversed entry signals - he's killing it in 2016! Now here come the flaws of good psychology: a trader following the edge perfectly would do his arse indefinitely through the 2016 period - after all, he has 3 solid years of good data to fall back on. But I'd say it would be reckless to follow the plan having drawn down by such a distance imo. Therefore, the real edge is knowing when to jump ship: abandon the plan & move onto new ideas. That is something incredibly difficult to master.
ShaunWhite
Posts: 9731
Joined: Sat Sep 03, 2016 3:42 am

ruthlessimon wrote:
Fri Jan 05, 2018 9:25 pm
ShaunWhite wrote:
Fri Jan 05, 2018 7:40 pm
This thread seemed an appropriate place to post this.
That is something incredibly difficult to master.
Yep, que será, será. (little nod there to a tune from 1956 for Memphis)

I'm starting to wonder if finally getting my data together is a good thing or not.

You sure can make some pretty looking graphs, but the question will always be how far dare you let it run against you. There's also the case of resource management: often the bots aren't mutually compatible or won't sit comfortably alongside manual trading and, even though I'm lucky enough to have 2 accounts, one of them might have to run for months before its true colours are shown.

I'd be interested to hear any views on how to measure if a strategy is worth pursuing. It's clearly not just a matter of an upward trend, it has to be judged by the amplitude and frequency of the variation (noise). I'm not a stato as I've said before so forgive any clumsy terminology.

Is there a technical term or function for the maximum interval between new highs? That sounds like a reasonable starting point, then use the frequency of races that fit your criteria to work out what this is in elapsed time. Personally I think I'd want to be hitting a new high at least once every couple of weeks with a worst case downside in that period that wouldn't wipe out perhaps the last 4. I'm just making a bit of a finger in the air guess there.
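
The interval between new highs is usually called drawdown duration (or "time under water"), and the depth of the dip is the maximum drawdown. A minimal sketch of measuring both from a cumulative P&L series follows; the equity figures and the races-per-day rate are made up for illustration.

```python
def longest_gap_between_highs(cumulative_pnl):
    """Return (longest run of bets without a new high, max drawdown)."""
    peak = cumulative_pnl[0]
    gap = 0
    longest_gap = 0
    max_drawdown = 0
    for value in cumulative_pnl[1:]:
        if value > peak:
            # New high: reset the counter.
            peak = value
            gap = 0
        else:
            # Still under water: extend the gap and track the depth.
            gap += 1
            longest_gap = max(longest_gap, gap)
            max_drawdown = max(max_drawdown, peak - value)
    return longest_gap, max_drawdown


# Made-up cumulative P&L after each qualifying race.
equity = [0, 2, 5, 3, 1, 4, 6, 5, 2, 3, 4, 5, 7]
longest_gap, max_drawdown = longest_gap_between_highs(equity)

# Convert bets to elapsed time using a hypothetical selection frequency.
qualifying_races_per_day = 2.5
days_under_water = longest_gap / qualifying_races_per_day

print(longest_gap, max_drawdown, days_under_water)
```

Running this over a long backtest gives the worst historical wait for a new high, which can then be compared against a "couple of weeks" tolerance like the one suggested above.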
Last edited by ShaunWhite on Sat Jan 06, 2018 2:19 am, edited 1 time in total.
ShaunWhite
Posts: 9731
Joined: Sat Sep 03, 2016 3:42 am

LinusP wrote:
Fri Jan 05, 2018 9:14 pm
Interesting, what does 2017 look like? viewtopic.php?f=54&t=13157&p=141407#p141407

Relevant article just posted by Buchdahl: https://www.pinnacle.com/en/betting-art ... 3vjy59y4q3
Are you trying to kill me? I'd planned an early night and you send me a link to data and something to read. Thanks a lot. :roll:

Edit....I've got a feeling that this subject might start to head in this direction ...

If we set about devising a betting system via data dredging until we find criteria that are profitable, we risk failing to establish causal explanations for what we find (Buchdahl)
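
A quick way to see the data dredging risk is to simulate a pile of systems with no edge at all and look at the best one. This is a toy sketch, assuming even-money coin-flip bets at 1 unit stakes; the number of systems and bets are arbitrary.

```python
import random

random.seed(42)  # fixed seed so reruns give the same numbers

N_SYSTEMS = 200  # how many sets of criteria we "try"
N_BETS = 500     # bets per system


def zero_edge_profit(n_bets):
    """Profit from n even-money bets at 1 unit stakes with no edge."""
    return sum(random.choice((1, -1)) for _ in range(n_bets))


profits = [zero_edge_profit(N_BETS) for _ in range(N_SYSTEMS)]
best = max(profits)
average = sum(profits) / N_SYSTEMS

print("best system:", best, "units")
print("average system:", average, "units")
```

The average sits near zero, as it should, but the best of the bunch will typically show a healthy looking profit purely by chance, which is what picking criteria after the fact amounts to.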