Gathering data - Betfair only provides so much
- marksmeets302
- Posts: 527
- Joined: Thu Dec 10, 2009 4:37 pm
Full book, I haven't moved to streaming yet.
-
- Posts: 165
- Joined: Wed Mar 01, 2017 2:06 pm
Okay so I'm guessing you don't save your data in any sort of relational way? Just as json files which contain all the data you'd need to do your reporting?
LinusP wrote: ↑Fri May 26, 2017 6:40 am
I've used mongo before, it seems to have a bad reputation when it comes to scalability though, do you store the full book or just the streaming update?
@Welshboy, it's a NoSQL db so joins are considered harmful, and I'm not sure how tricky it would be to extract certain columns out of it into pandas.
I started planning a few tables and relationships out. Just to see if I could limit data redundancy and hopefully file size.
My current setup is to keep all the current race data in RAM and write the JSON to a file at the end of a race. I'm thinking I should have a file per race, then load the ones I want to analyse into pandas and take a look.
I had speed issues when dumping JSON to a file, so I switched to dumping and writing per line and then zipping at the end of the race. I am going to wait for Betfair to release their historical data and then switch to collecting just streaming data matching their format so I can use both with the same framework.
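The write-per-line-then-zip approach described above could look roughly like the sketch below. It uses only the standard library; the `RaceRecorder` class, the `.jsonl` file naming, and the sample updates are all made up for illustration and are not anyone's actual setup.

```python
import gzip
import json
import os
import shutil
import tempfile


class RaceRecorder:
    """Appends one JSON document per line, then gzips the file when the race ends."""

    def __init__(self, directory, market_id):
        # Hypothetical naming scheme: one file per race, named after the market id.
        self.path = os.path.join(directory, market_id + '.jsonl')
        self._file = open(self.path, 'a')

    def write(self, update):
        # One compact JSON object per line, written as soon as it arrives.
        self._file.write(json.dumps(update, separators=(',', ':')) + '\n')

    def close(self):
        # Compress at the end of the race and remove the uncompressed copy.
        self._file.close()
        gz_path = self.path + '.gz'
        with open(self.path, 'rb') as src, gzip.open(gz_path, 'wb') as dst:
            shutil.copyfileobj(src, dst)
        os.remove(self.path)
        return gz_path


def read_race(gz_path):
    """Yield the recorded updates back, one dict per line."""
    with gzip.open(gz_path, 'rt') as f:
        for line in f:
            yield json.loads(line)


# Made-up updates, only to exercise the round trip.
sample_updates = [
    {'marketId': '1.1234456', 'pt': 1, 'ltp': 3.5},
    {'marketId': '1.1234456', 'pt': 2, 'ltp': 3.45},
]
work_dir = tempfile.mkdtemp()
recorder = RaceRecorder(work_dir, '1.1234456')
for update in sample_updates:
    recorder.write(update)
archive = recorder.close()
restored = list(read_race(archive))
```

If pandas is installed, the list of dicts in `restored` can be passed straight to `pandas.DataFrame(restored)` for analysis, one race file at a time.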
Hi Linus,
LinusP wrote: ↑Mon May 08, 2017 6:26 am
If you have experience in python use my library, betfairlightweight:
https://github.com/liampauling/betfairlightweight
In order to use the API you need an app key. I recommend requesting one (delayed is free); you can then automate your account operations. But I think you will find the race card endpoint handy, as it provides an interface for scraping the Timeform data Betfair displays on the website and does not require an app key.
Code: Select all
pip install betfairlightweight
Code: Select all
from betfairlightweight import APIClient

trading = APIClient(username='test', password='test', app_key='test')
trading.race_card.login()
race_card = trading.race_card.get_race_card(market_ids=['1.1234456'])
Amazing library. I'm not very familiar with Python and have been struggling a little bit, but now I've managed to get the historical data into a MySQL database with your Python code.
Is it possible to also get data from "Market_definition" in a historical stream?
I currently use the following code:
Code: Select all
def on_process(self, market_books):
    with open('output.txt', 'a') as output:
        for market_book in market_books:
            for runner in market_book.runners:
                output.write('%s,%s,%s,%s,%s,%s,%s,%s\n' % (
                    market_book.publish_time, market_book.number_of_active_runners, market_book.market_id, market_book.status, market_book.inplay,
                    runner.selection_id, runner.total_matched, runner.last_price_traded or '',
                ))
Code: Select all
def on_process(self, market_books, market_definition):
    with open('output.txt', 'a') as output:
        for market_book in market_books:
            for runner in market_book.runners:
                for runner in market_definition.runners:
                    output.write('%s,%s,%s,%s,%s,%s,%s,%s\n' % (
                        market_book.publish_time, market_definition_runner.sort_priority, market_book.number_of_active_runners, market_book.market_id, market_book.status, market_book.inplay,
                        runner.selection_id, runner.total_matched, runner.last_price_traded or '',
                    ))
Thank you very much in advance,
Proffs1
Glad you got it working, the runners are stored in a list so you have to either create a lookup or loop through. Using the original code I would do the following:
Code: Select all
runner_dict = {runner.selection_id: runner for runner in market_book.market_definition.runners}
If you put that at the top under the for market book loop you can then do the following:
Code: Select all
for runner in market_book.runners:
    runner_def = runner_dict.get(runner.selection_id)
    sort = runner_def.sort_priority
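To make the "create a lookup or loop through" choice concrete, here is a small sketch that does it both ways on stand-in objects. The `SimpleNamespace` runners and their values are hypothetical substitutes for the library's runner objects, used only so the snippet runs on its own.

```python
from types import SimpleNamespace

# Stand-ins for the runners in the market definition (hypothetical values).
definition_runners = [
    SimpleNamespace(selection_id=101, sort_priority=1),
    SimpleNamespace(selection_id=202, sort_priority=2),
]
# Stand-ins for the runners in the market book; 303 has no definition entry.
book_runners = [
    SimpleNamespace(selection_id=202, last_price_traded=4.2),
    SimpleNamespace(selection_id=101, last_price_traded=2.5),
    SimpleNamespace(selection_id=303, last_price_traded=None),
]

# Option 1: build the lookup once, then each access is a single dict get.
runner_dict = {r.selection_id: r for r in definition_runners}


def sort_priority_by_lookup(selection_id):
    runner_def = runner_dict.get(selection_id)
    # .get returns None for an unknown id, so guard before reading attributes.
    return runner_def.sort_priority if runner_def is not None else None


# Option 2: scan the list every time a runner is needed.
def sort_priority_by_loop(selection_id):
    for r in definition_runners:
        if r.selection_id == selection_id:
            return r.sort_priority
    return None


by_lookup = [sort_priority_by_lookup(r.selection_id) for r in book_runners]
by_loop = [sort_priority_by_loop(r.selection_id) for r in book_runners]
```

Both give the same answers; the dict just avoids rescanning the list for every runner on every update.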
Wow, thank you very much. It works!
Hi There,
Is there any chance you can post the whole code?
I tried to do the same thing but got an error.
I'm not sure what the sort variable does, as my interpreter is saying it's unused?
Thanks,
Code: Select all
def on_process(self, market_books):
    with open('output.txt', 'a') as output:
        for market_book in market_books:
            runner_dict = {runner.selection_id: runner for runner in market_book.market_definition.runners}
            for runner in market_book.runners:
                runner_def = runner_dict.get(runner.selection_id)
                sort = runner_def.sort_priority
            for runner in market_book.runners:
                output.write('%s,%s,%s,%s,%s,%s,%s,%s,%s\n' % (
                    market_book.publish_time, market_definition_runner.sort_priority, market_book.number_of_active_runners, market_book.market_id, market_book.status, market_book.inplay,
                    runner.selection_id, runner.total_matched, runner.last_price_traded or '',
                ))
I checked the indentation and it's fine. It's prob just the way I posted it in. I think I need to edit the base resource.py file to add the call into a dictionary
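For what it's worth, the posted `on_process` has two visible problems: `market_definition_runner` is never defined, and `sort` is computed in one loop but never used in the loop that writes. A possible fix is to do the lookup and the write in the same loop. The sketch below is one way to do that, not the library author's code; it writes to any file-like object, and the `SimpleNamespace` market book and its values are hypothetical stand-ins so it runs without the library.

```python
import csv
import io
from types import SimpleNamespace


def write_market_books(market_books, output):
    """Write one CSV row per runner, including sort_priority from the definition."""
    writer = csv.writer(output, lineterminator='\n')
    for market_book in market_books:
        # Lookup from selection id to the runner's market definition entry.
        runner_dict = {
            r.selection_id: r for r in market_book.market_definition.runners
        }
        for runner in market_book.runners:
            runner_def = runner_dict.get(runner.selection_id)
            # Use the looked-up value in the same loop that writes the row.
            sort = runner_def.sort_priority if runner_def is not None else ''
            writer.writerow([
                market_book.publish_time, sort, market_book.number_of_active_runners,
                market_book.market_id, market_book.status, market_book.inplay,
                runner.selection_id, runner.total_matched,
                runner.last_price_traded or '',
            ])


# Hypothetical stand-in for one market book.
book = SimpleNamespace(
    publish_time=1505801880, number_of_active_runners=2, market_id='1.1234456',
    status='OPEN', inplay=False,
    market_definition=SimpleNamespace(runners=[
        SimpleNamespace(selection_id=101, sort_priority=1),
        SimpleNamespace(selection_id=202, sort_priority=2),
    ]),
    runners=[
        SimpleNamespace(selection_id=202, total_matched=50.0, last_price_traded=4.2),
        SimpleNamespace(selection_id=101, total_matched=0.0, last_price_traded=None),
    ],
)
buffer = io.StringIO()
write_market_books([book], buffer)
rows = buffer.getvalue().splitlines()
```

Inside a stream class the same body would sit in `on_process`, with `output` being the file opened by `with open('output.txt', 'a')`.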
runner_def = market_def.runners_dict.get((runner.selection_id , runner.handicap, runner.event_name))
AttributeError: 'RunnerBook' object has no attribute 'event_name'
that's the error I'm getting with the following code
Code: Select all
from betfairlightweight import APIClient
from betfairlightweight.streaming import StreamListener, MarketStream
import os


class HistoricalStream(MarketStream):
    def __init__(self, listener):
        super(HistoricalStream, self).__init__(listener)
        print('Time,MarketId,Status,Inplay,sortPriority,runnerName,LastPriceTraded\n')

    def on_process(self, market_books):
        for market_book in market_books:
            for runner in market_book.runners:
                market_def = market_book.market_definition
                runner_def = market_def.runners_dict.get((runner.selection_id , runner.handicap, runner.event_name))
                print(runner_def.name , runner_def.handicap, runner.selection_id, runner.last_price_traded,
                      runner.event_name)


class HistoricalListener(StreamListener):
    def _add_stream(self, unique_id, stream_type):
        if stream_type == 'marketSubscription':
            return HistoricalStream(self)


apiclient = APIClient('aa', 'bb', 'cc')
stream = apiclient.streaming.create_historical_stream(
    directory='/Users/mac/PycharmProjects/xbot/sample2',
    listener=HistoricalListener(max_latency=1e100))
# Note: `async` became a reserved keyword in Python 3.7, so this call only parses on earlier versions.
stream.start(async=False)
Event name is in the market definition:
Code: Select all
market_book.market_definition.event_name
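Putting that together with the error above: the runner in the book has no `event_name`, so it cannot be part of the lookup key or the print. A rough sketch of the adjusted loop follows. It assumes the definition's lookup is keyed by `(selection_id, handicap)` only, which is an assumption here rather than something confirmed in this thread, and it uses `SimpleNamespace` stand-ins with made-up names so it runs without the library.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a market definition with two runners.
market_definition = SimpleNamespace(
    event_name='Hypothetical 2m Hcap',
    runners=[
        SimpleNamespace(selection_id=101, handicap=0.0, name='Runner A'),
        SimpleNamespace(selection_id=202, handicap=0.0, name='Runner B'),
    ],
)
# Assumption: the lookup is keyed by (selection_id, handicap), with no event name.
market_definition.runners_dict = {
    (r.selection_id, r.handicap): r for r in market_definition.runners
}
market_book = SimpleNamespace(
    market_definition=market_definition,
    runners=[SimpleNamespace(selection_id=202, handicap=0.0, last_price_traded=4.2)],
)


def describe_runners(market_book):
    """Return (event name, runner name, selection id, last price) per runner."""
    market_def = market_book.market_definition
    described = []
    for runner in market_book.runners:
        runner_def = market_def.runners_dict.get((runner.selection_id, runner.handicap))
        name = runner_def.name if runner_def is not None else None
        # The event name is read from the market definition, not from the runner.
        described.append(
            (market_def.event_name, name, runner.selection_id, runner.last_price_traded)
        )
    return described


described = describe_runners(market_book)
```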