Gathering data - Betfair only provides so much

clomax8
Posts: 21
Joined: Sat Apr 08, 2017 12:01 am

I'm cataloguing my trading history into a proper SQL database from the XLS files Betfair provides in order to build a tool that I can use to visualise my performance, make improvements, etc. I'm automating as much of it as I can with a Python script. The problem is that Betfair's history doesn't tell me things like whether or not a given race was at an AW track, or what channel it was on, or what the prize money was so I'll have to find information like that from another source.

As much as I don't like to rely on scraping web pages, this site has some of what I'm after. The page source looks like it would be a bit of a ball-ache to scrape in my Python script:
http://racing.betting-directory.com/res ... y-2017.php

And I would have to scrape a number of pages that each have a subset of what I'm looking for and will undoubtedly have some downtime. So, it's a long shot, but is there something like a (free) API that I can make a simple POST request to get the info I'm after? Or does anyone have any general advice on gathering data? I'm not entirely sure that I'm really doing anything useful by collecting this data, tbh, so is my thinking on the right path here?
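
One way to keep the externally gathered race details separate from the Betfair history is a second table keyed on the market, joined in at query time. The following is only a minimal sketch: the table names, column names and sample rows are hypothetical, and it assumes the external source can be matched to a Betfair market ID.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative, not Betfair's.
conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.executescript("""
CREATE TABLE trades (
    market_id  TEXT NOT NULL,   -- assumed key shared with the metadata source
    settled_at TEXT NOT NULL,
    profit     REAL NOT NULL
);
CREATE TABLE race_meta (
    market_id   TEXT PRIMARY KEY,
    all_weather INTEGER,        -- 1 = AW track, 0 = turf, NULL = unknown
    tv_channel  TEXT,
    prize_money REAL
);
""")

# Made-up rows standing in for parsed history and gathered metadata.
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("1.100", "2017-05-01", 12.5),
     ("1.100", "2017-05-01", -4.0),
     ("1.200", "2017-05-02", 3.25)],
)
conn.execute("INSERT INTO race_meta VALUES ('1.100', 1, 'Channel A', 5000.0)")

# LEFT JOIN keeps trades whose metadata has not been found yet.
rows = conn.execute("""
    SELECT m.all_weather, COUNT(*) AS n_trades, SUM(t.profit) AS total_profit
    FROM trades AS t
    LEFT JOIN race_meta AS m USING (market_id)
    GROUP BY m.all_weather
    ORDER BY m.all_weather
""").fetchall()
```

Keeping the metadata in its own table means a scrape that fails or arrives late never blocks loading the trading history itself.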
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

If you have experience in Python, use my library, betfairlightweight:

https://github.com/liampauling/betfairlightweight

To use the API you need an app key. I recommend requesting one (the delayed key is free); you can then automate your account operations. But I think you will find the race card endpoint handy, as it provides an interface for scraping the Timeform data Betfair displays on its website and does not require an app key.

Code:

pip install betfairlightweight

Code:

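from betfairlightweight import APIClient

# Placeholder credentials; replace with your own Betfair login details.
trading = APIClient(username='test', password='test', app_key='test')

# The race card endpoint has its own login and, as noted above, needs no API app key.
trading.race_card.login()
# Placeholder market ID; pass the IDs of the markets you want race cards for.
race_card = trading.race_card.get_race_card(market_ids=['1.1234456'])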
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

@LinusP,

Love your library! Had half written my own before thinking to check out GitHub and saw it there.
At the moment I'm taking the data and loading it into a SQLite db using PeeWee.

Have just one question, how do you do it?
I've got a few tables with race info, horses, prices, all exchange prices and volume, etc. But I'm noticing the size will become an issue pretty soon. One day's worth is easily over 3 GB.
Is this normal? I'm thinking I may reduce the amount of data I'm capturing, e.g. only capture the best 3 lay/back prices and LTP, not the entire range of prices and volumes available.

My end goal is to query the data and shape it into a pandas df for analysis. Not sure if I should just save each race or day as a CSV for easy importing, or as a JSON file, or stay with a db.

Thoughts?
Appreciate if you don't want to give away too much.
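
For what it's worth, the "best 3 prices plus LTP" idea could be sketched roughly as below. This assumes each snapshot is a dict shaped like the Betfair API's MarketBook JSON (`runners`, `ex`, `availableToBack`, `availableToLay`, `lastPriceTraded`) and that each ladder arrives ordered best price first; `trim_book` and the sample data are made up for illustration.

```python
def trim_book(book, depth=3):
    """Keep only the LTP and the first `depth` back/lay prices per runner.

    Assumes each ladder is already ordered best price first.
    """
    rows = []
    for runner in book.get("runners", []):
        ex = runner.get("ex", {})
        back = ex.get("availableToBack", [])[:depth]
        lay = ex.get("availableToLay", [])[:depth]
        rows.append({
            "market_id": book["marketId"],
            "selection_id": runner["selectionId"],
            "ltp": runner.get("lastPriceTraded"),
            "back": [(p["price"], p["size"]) for p in back],
            "lay": [(p["price"], p["size"]) for p in lay],
        })
    return rows


# Made-up snapshot with four back prices and two lay prices.
sample = {
    "marketId": "1.123",
    "runners": [{
        "selectionId": 42,
        "lastPriceTraded": 3.5,
        "ex": {
            "availableToBack": [
                {"price": 3.45, "size": 10.0}, {"price": 3.4, "size": 20.0},
                {"price": 3.35, "size": 5.0}, {"price": 3.3, "size": 8.0},
            ],
            "availableToLay": [
                {"price": 3.5, "size": 7.0}, {"price": 3.55, "size": 9.0},
            ],
        },
    }],
}
trimmed = trim_book(sample)
```

The trade-off is that anything trimmed away cannot be recovered later, so it only suits analysis that never needs the deeper ladder.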
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

No problem, getting a lot of contributors now which is helping iron out bugs.

I use a MySQL box on AWS for order and some market data but, like you, faced scalability issues when storing MarketBooks, so I now zip up the raw JSON and store it in S3 (AWS) for backtesting / processing later.
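
A rough sketch of the "zip up the raw JSON" approach, using only the standard library. It is not the actual setup described above: the one-file-per-market layout, the file name and the helper names are assumptions, and the upload to S3 is left out.

```python
import gzip
import json
import os
import tempfile


def append_raw(path, update):
    """Append one raw update as a single JSON line to a gzip file."""
    with gzip.open(path, "at", encoding="utf-8") as fh:
        fh.write(json.dumps(update, separators=(",", ":")) + "\n")


def read_raw(path):
    """Yield the stored updates back in order, e.g. for backtesting."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            yield json.loads(line)


# Hypothetical layout: one compressed file per market.
path = os.path.join(tempfile.mkdtemp(), "1.123.jsonl.gz")
for i in range(3):
    append_raw(path, {"marketId": "1.123", "seq": i})

restored = list(read_raw(path))
```

Reopening the file for every update keeps the sketch short; a long-running collector would more likely hold the file open and close it when the market ends.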
HRacing
Posts: 278
Joined: Tue May 14, 2013 11:25 am

Just trying to get the hang of this Python game, got to say it seems alright to be fair, but trying to log in with LinusP's betfairlightweight I get an exception saying... certificate folder not found in /certs/... Just wondered if the Python boys know a way round this
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

Have you created the certificates yet?
HRacing
Posts: 278
Joined: Tue May 14, 2013 11:25 am

I did a quick search (no idea what the certificates are :roll:) and only found something very detailed. Is there a quick explanation or is it a long process? If it is, leave it Liam, as I've asked for more than enough advice from you this week!! :)
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

HRacing wrote:
Wed May 24, 2017 5:09 pm
I did a quick search (no idea what the certificates are :roll:) and only found something very detailed. Is there a quick explanation or is it a long process? If it is, leave it Liam, as I've asked for more than enough advice from you this week!! :)
Yeh, creating certificates is probably the hardest part of using the API, especially on Windows!

https://github.com/liampauling/betfairl ... wiki/Setup

You will need an appKey as well. The delayed one is free but next to useless; it's £299 for full access now :evil:
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

LinusP wrote:
Wed May 24, 2017 7:07 pm
Yeh, creating certificates is probably the hardest part of using the API, especially on Windows!

https://github.com/liampauling/betfairl ... wiki/Setup

You will need an appKey as well. The delayed one is free but next to useless; it's £299 for full access now :evil:
Yeah I'm glad I had the sense to apply before they started charging, even though I didn't have much use for it then!
spreadbetting
Posts: 3140
Joined: Sun Jan 31, 2010 8:06 pm

I've no experience with Python. Can you use it with a GUI or is it all command-line-based stuff?
HRacing
Posts: 278
Joined: Tue May 14, 2013 11:25 am

welshboy06 wrote:
Wed May 24, 2017 7:23 pm
Yeah I'm glad I had the sense to apply before they started charging, even though I didn't have much use for it then!
Cheers for the help anyway lads, may have to revert to oldschool excel vba but that can be effective in its own way :)
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

HRacing wrote:
Wed May 24, 2017 10:14 pm
Cheers for the help anyway lads, may have to revert to oldschool excel vba but that can be effective in its own way :)
Well, to be honest, I'm not 100% sure about Betfair's rules, but I don't run my Python data-collecting app while I'm trading, just in case. So I'm going to start collecting data in Excel as well as try out some automation.
So Excel isn't all bad.
marksmeets302
Posts: 527
Joined: Thu Dec 10, 2009 4:37 pm

LinusP wrote:
I use a MySQL box on AWS for order and some market data but, like you, faced scalability issues when storing MarketBooks, so I now zip up the raw JSON and store it in S3 (AWS) for backtesting / processing later.
Liam, have a look at MongoDB. It's perfect for storing JSON objects: you read them in from Betfair and, without parsing, just push them to the db. It's amazingly fast and seems to compress the data on its own. Just checked: my database holds 23 million objects and is now 330 GB in size. Not much in this day and age. When backtesting you read the JSON objects again, but this time from the database, and continue as you normally do.
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

marksmeets302 wrote:
Thu May 25, 2017 9:06 am
Liam, have a look at MongoDB. It's perfect for storing JSON objects: you read them in from Betfair and, without parsing, just push them to the db. It's amazingly fast and seems to compress the data on its own. Just checked: my database holds 23 million objects and is now 330 GB in size. Not much in this day and age. When backtesting you read the JSON objects again, but this time from the database, and continue as you normally do.
I've heard of MongoDB and briefly looked at it. I'll take another look on the weekend though.
Is it an ORM or will I need to write complex SQL queries to join tables and get the data into a pandas dataframe?
Or do you use something other than pandas?

Cheers
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

I've used Mongo before; it seems to have a bad reputation when it comes to scalability though. Do you store the full book or just the streaming update?

@welshboy06, it's a NoSQL db so joins are considered harmful; not sure how tricky it would be to extract certain columns out into pandas from it.
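
On pulling certain columns out of stored JSON documents, one possible approach is to pick fields by a dotted path and build flat rows from them. This is a sketch in plain Python, independent of any particular database; `pluck`, `to_rows` and the sample documents are hypothetical.

```python
def pluck(doc, dotted_path, default=None):
    """Follow a dotted path such as 'a.b.0.c' through nested dicts and lists."""
    current = doc
    for key in dotted_path.split("."):
        try:
            if isinstance(current, list):
                current = current[int(key)]  # numeric path parts index into lists
            else:
                current = current[key]
        except (KeyError, IndexError, ValueError, TypeError):
            return default  # missing key, short list, or non-container value
    return current


def to_rows(docs, columns):
    """Build one flat dict per document; `columns` maps column name to dotted path."""
    return [{name: pluck(doc, path) for name, path in columns.items()}
            for doc in docs]


# Made-up documents; the second has no runners, so its LTP comes back as None.
docs = [
    {"marketId": "1.1", "runners": [{"selectionId": 7, "lastPriceTraded": 2.5}]},
    {"marketId": "1.2", "runners": []},
]
columns = {"market_id": "marketId", "first_ltp": "runners.0.lastPriceTraded"}
rows = to_rows(docs, columns)
```

A list of flat dicts like `rows` can be passed straight to `pandas.DataFrame(rows)`; pandas also ships `json_normalize` for a similar job.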