Streaming Data API

xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Question for the folks who write their own programming code and connect to the Streaming Data API:

Do you use data from ExAllOffers, ExBestOffers or ExBestOffersDisp? I'm currently relying on ExBestOffersDisp because I'm assuming that it's worth having cross-matching data on to see the "fullest" market possible, but now I'm wondering if I should use ExAllOffers after all. I haven't backtested against the other data streams yet, but I was interested to know whether anyone else has spent any time thinking about or testing this. Of course, which one you pick might only be relevant to your strategy, but I'm still interested. Here are a few points I'm thinking about:

1. All streams seem to update at the same time, so there's no speed advantage based on when you get the data.
2. ExBestOffersDisp shows cross-matching money, which might be an advantage to see for strategies that take liquidity?
3. I used to think that using only a few levels of ExBestOffersDisp would require much less data than ExAllOffers, but that's not the case! When level 0 (the front of the book) changes in price, that's a single value updated for ExAllOffers, but for ExBestOffersDisp you get an update for all the levels, because level 0 is now level 1, level 1 is now level 2, etc. So ExAllOffers not only receives less data, but also requires less processing time.
4. Based on the reasoning in 3, I see absolutely no reason to ever use ExBestOffers - unless for some reason you only want 1 or 2 levels of the book.
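
To make point 3 concrete, here is a minimal sketch of the two update styles. It assumes price-keyed updates are (price, size) pairs where size 0 removes the price, and level-keyed updates are (level, price, size) triples; the function names and the sample book are made up for illustration and are not part of any Betfair library.

```python
def apply_price_updates(ladder, updates):
    """ExAllOffers-style: each update is (price, size); size 0 removes the price."""
    for price, size in updates:
        if size == 0:
            ladder.pop(price, None)
        else:
            ladder[price] = size
    return ladder


def top_levels(ladder, depth):
    """Best `depth` back prices (highest first) as {level: (price, size)}."""
    best = sorted(ladder.items(), reverse=True)[:depth]
    return {level: entry for level, entry in enumerate(best)}


def level_updates_needed(old_levels, new_levels):
    """(level, price, size) triples needed to turn one depth view into another."""
    changed = []
    for level in sorted(set(old_levels) | set(new_levels)):
        if old_levels.get(level) != new_levels.get(level):
            price, size = new_levels.get(level, (0, 0))
            changed.append((level, price, size))
    return changed


# Hypothetical back side of a book, keyed by price.
book = {1.96: 10.0, 1.95: 20.0, 1.94: 14.07, 1.93: 19.0}
before = top_levels(book, 3)

# The front price disappears: a single pair for the price-keyed view.
price_updates = [(1.96, 0)]
apply_price_updates(book, price_updates)
after = top_levels(book, 3)

# The level-keyed view has to restate every level, because each one shifts up.
level_updates = level_updates_needed(before, after)
print(len(price_updates), "price update vs", len(level_updates), "level updates")
```

With 10 levels instead of 3, the same single change at the front would restate all 10 levels.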

In the past I've been quite careful not to subscribe to any more data than I absolutely require, but in future I'll probably gather a bit more so that I can backtest and see if it makes any difference which one I use for my strategies. Perhaps the difference will be negligible, but who knows.
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

Point 3 is interesting, although I think you will get a lot more data to process if you select all offers, as the book is constantly updating due to offsets etc.

I request best offers (ladder levels 3). I am not interested in display as I don't trust the data returned; I'm not sure what happens when you request it and xm is off. There are times when xm can hide opportunities, and I am not completely sure that it offers or betters the best price every time, especially when the market is volatile.

Would be interested to hear of the results if you do compare.
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Thanks for your input. I definitely agree cross-matching is a bit of a black box. Not sure if it's something that's beneficial or detrimental to me until I run a backtest.

As an initial measure, I compared the bytes of data needed to maintain ExAllOffers vs ExBestOffersDisp/ExBestOffers for the favourite in the first race today: I measured about 500kb of data for the ExAllOffers stream against just under 2,000kb for ExBestOffersDisp with 10 levels. I think restating all of the levels whenever just one price at the front changes works out as a lot more data than things changing lower down in the book.

Not sure any of this has any noticeable impact on running strategies, but it does make me inclined to use ExAllOffers, given it's less data and includes all the levels should I ever need them in future. However, if ExBestOffersDisp turns out to be more beneficial to me, then that would be a bigger trade-off.
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

That's very interesting and tbh I find it hard to believe! Out of interest how did you measure this / what was the time frame, pre-play?
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Yes, sorry, that was pre-race for that first race. There would probably be less of a difference in-play.

I didn't do anything very technical. I write all stream updates to a database, and I just pulled them out for one runner and measured the length.

I see a lot of updates like the following for ExAllOffers:
1.96,0

With an equivalent for ExBestOffersDisp:
0,1.94,14.07;1,1.93,19;2,1.92,56.31;3,1.91,33.93;4,1.9,59.26;5,1.89,6;6,1.88,133;7,1.87,31.83;8,1.86,7.45;9,1.85,4

I'm sure you do get changes further down the book for ExAllOffers (beyond the front 10 levels), but those don't seem to outweigh the data to maintain level information in BestOffers.
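
As a rough illustration of that kind of measurement, this sketch compares the byte length of the two sample updates quoted above. The strings are in the stored format shown in this post, which is an assumption about the database layout rather than the raw stream format, and a single pair of updates will not reproduce the totals for a whole race.

```python
# Sample updates copied from the post above (stored format, not wire format).
all_offers_update = "1.96,0"
best_offers_disp_update = (
    "0,1.94,14.07;1,1.93,19;2,1.92,56.31;3,1.91,33.93;4,1.9,59.26;"
    "5,1.89,6;6,1.88,133;7,1.87,31.83;8,1.86,7.45;9,1.85,4"
)


def payload_bytes(update):
    """Length of one stored update in bytes, assuming UTF-8 text."""
    return len(update.encode("utf-8"))


def total_bytes(updates):
    """Total bytes for a sequence of stored updates, e.g. all rows for one runner."""
    return sum(payload_bytes(u) for u in updates)


print("ExAllOffers:", payload_bytes(all_offers_update), "bytes")
print("ExBestOffersDisp:", payload_bytes(best_offers_disp_update), "bytes")
```

Running total_bytes over every stored row for a runner, once per stream, gives the kind of per-stream total quoted earlier.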
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

I guess the number of levels requested is going to impact this, as 10 is quite a lot. I have set up two EC2 nano instances for today's racing, one with ALL_OFFERS and another with just 3 ladder levels, and will post a screenshot of the metrics this evening.

It would be nice if Betfair offered a bit more information on this subject, but then it's going to be market dependent.
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Ok, that will be interesting. I expect they'll be roughly the same amount of data. Like you say, probably varies by sport. I would have thought something like football will constantly have the front of the book changing due to the way prices decay.

It's interesting that Betfair changed the default for the number of levels from 6 to 10 (which is apparently the max anyway).
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

OK, the test didn't go to plan. I ran out of CPU credit in less than 30 minutes when requesting the following for all of today's racing (blue):

fields=['EX_ALL_OFFERS', 'EX_TRADED', 'EX_TRADED_VOL', 'EX_LTP', 'EX_MARKET_DEF']
Base (orange):

fields=['EX_BEST_OFFERS', 'EX_TRADED', 'EX_TRADED_VOL', 'EX_LTP', 'EX_MARKET_DEF'],
ladder_levels=3
[Attachment: screenshot.png]
Maybe it's my framework, and the number of updates that come through and then have to be processed, that is causing the high CPU, but then the 'network in' is double that of the base.
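
For anyone not using a wrapper library, the two configurations above correspond roughly to the following raw subscription messages. This is a sketch only: the helper name is made up, and the market filter (event type "7", WIN markets) is an assumption, since the actual market filter used in the test isn't stated here.

```python
import json


def build_market_subscription(fields, ladder_levels=None, request_id=1):
    """Build a marketSubscription message as a plain dict (hypothetical helper)."""
    data_filter = {"fields": list(fields)}
    if ladder_levels is not None:
        # ladderLevels limits the depth of the level-based best-offers fields.
        data_filter["ladderLevels"] = ladder_levels
    return {
        "op": "marketSubscription",
        "id": request_id,
        # Assumption: horse racing WIN markets; not taken from the post.
        "marketFilter": {"eventTypeIds": ["7"], "marketTypes": ["WIN"]},
        "marketDataFilter": data_filter,
    }


COMMON_FIELDS = ["EX_TRADED", "EX_TRADED_VOL", "EX_LTP", "EX_MARKET_DEF"]

# Blue: full price-keyed book.
all_offers = build_market_subscription(["EX_ALL_OFFERS"] + COMMON_FIELDS, request_id=1)
# Orange (base): best offers limited to 3 ladder levels.
base = build_market_subscription(
    ["EX_BEST_OFFERS"] + COMMON_FIELDS, ladder_levels=3, request_id=2
)

print(json.dumps(all_offers))
print(json.dumps(base))
```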
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Hmm.. interesting. Thanks for the plot. Was it actually running for the whole of that graph then, or did the blue ALL_OFFERS stream just stop when that line dropped down?

Also, any idea what all that Network Out is for the orange line?
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

I let it continue, but once it was out of CPU credits it dropped down to 5% for the rest of the afternoon, which caused a major backlog of events to be processed. By the time I stopped the program I was getting latency of around 300s (current time - Betfair publish time).
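
The latency figure is just the local clock minus the publish time on each message. A minimal sketch, assuming the change message carries its publish time in a `pt` field as milliseconds since the epoch, and that the local clock is reasonably in sync; the function name and sample message are made up for illustration.

```python
import time


def stream_latency_seconds(publish_time_ms, now=None):
    """Seconds between the publish time (ms since epoch) and `now` (epoch seconds)."""
    if now is None:
        now = time.time()
    return now - publish_time_ms / 1000.0


# Hypothetical change message carrying only the fields used here.
message = {"op": "mcm", "pt": 1_000_000}

# Fixed `now` so the example is repeatable; omit it to use the real clock.
latency = stream_latency_seconds(message["pt"], now=1300.0)
print(f"latency: {latency:.0f}s")
```

Logging this value per message makes a growing backlog visible long before it reaches hundreds of seconds.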

I would ignore the network out, as that is the program sending zipped market book CSVs to S3 after each race, so the test wasn't run in 'ideal' conditions but gives some guidance.
PeterLe
Posts: 3715
Joined: Wed Apr 15, 2009 3:19 pm

Hi James,
Just curious as to whether you drew any conclusions on ExBestOffersDisp v ExBestOffers?
I have the ability to switch between the two. An extract of my code:
if (cbVirtualPrices.Checked)
{
    // Virtual prices on: request the displayed (cross-matched) best offers
    marketDataFilter.Fields = new List<MarketDataFilter.FieldsEnum?> { MarketDataFilter.FieldsEnum.ExBestOffersDisp, MarketDataFilter.FieldsEnum.ExMarketDef };
}
else
{
    // Virtual prices off: request the plain best offers
    marketDataFilter.Fields = new List<MarketDataFilter.FieldsEnum?> { MarketDataFilter.FieldsEnum.ExBestOffers, MarketDataFilter.FieldsEnum.ExMarketDef };
}

Truth is, I have a bug in the code somewhere: when I select virtual prices, the fill/kill doesn't seem to work. I'm also in agreement with Liam in that I tend not to use virtual prices anyway. So I've just left virtual prices switched off and the code runs just fine.

I've always had this nagging doubt in my mind as to what would happen if I were to run two of my accounts with the same settings, but one on virtual prices and the other not, for a week or so just to compare. My gut feeling is that I would fire in a lot fewer bets and the result would be worse with XM on.
Just keen to know what conclusions, if any, you drew in the end?
Thanks
regards
Peter
LinusP
Posts: 1871
Joined: Mon Jul 02, 2012 10:45 pm

Hi Peter, I think there is a delay in virtual prices. I can't remember the exact latency, but it's likely to impact strategies.

I have since spent a lot of time optimising my streaming code so will probably have another go at comparing market data parameters.
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Hi Peter,

I don't actually have a definitive answer for you I'm afraid. I recall doing some backtesting and found that for some of my strategies it performed better with virtual prices off, but some performed better with it on. You probably need to do what you suggested and run a proper live split test over a reasonably long period of time to see which performs better.

Which is better for you may depend on things like whether you take or offer prices. Taking may be better with virtual prices on if that results in seeing more opportunities. Offering could also potentially be better if it means you're choosing a price which undercuts a virtual price, because you'll be at the front of the queue, which you may not have been if you ignored cross-matching (which would likely take priority if it arrived sooner). However, like LinusP says, if there's a small delay to virtual prices then having them off might be better overall.

It's hard to say, so you probably need to test specifically for your strategies to determine which is better. For me, I don't think there's a massive difference.
PeterLe
Posts: 3715
Joined: Wed Apr 15, 2009 3:19 pm

Thanks for the replies, gents.
I'll give it a go running a split test, although sometimes you can run the exact same program on two different accounts (same server) and still get varying results.
I'll see where it takes me anyway.
Regards
Peter
xitian
Posts: 457
Joined: Fri Jul 08, 2011 2:08 pm

Yeah, I think because things can be so time sensitive (especially in-play) you'll always get some variation even under the exact same environment. That's why you'll probably need to run your split test over maybe a month of races to notice if there's any measurable difference to p&l at all.