Football Data (CSV, JSON) - UPDATED 16/08/17

jonnyg
Posts: 691
Joined: Wed Jan 18, 2017 8:11 pm

jonnyg wrote:
Sun Aug 13, 2017 2:49 pm
welshboy06 wrote:
Sun Aug 13, 2017 9:45 am
Hi All,

I've decided to start collecting football data, mainly because I have an interest in the sport and also because of a certain jonnyg throwing stats around in a very hard-to-read format.

The data is quite simple and was just scraped from HKJC.

The data is in JSON format, but you should be able to convert it to CSV and import it into Excel. I chose JSON as it reduces data duplication, is structured and can be read reasonably well by the human eye. It also works really well with Python (which I'm using to scrape and also analyse the data).
The data contains the following info:
League
Season
Game Date
Home Team
Away Team
And also Goals and Red cards (Player, Time and Team)
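
For anyone wanting to try the JSON to CSV conversion mentioned above, here is a minimal sketch. The thread does not show the real layout of the file, so every key name below (`league`, `season`, `date`, `home`, `away`, `goals`, `player`, `minute`, `team`) and the sample match are assumptions for illustration only; adjust them to whatever the actual file uses.

```python
import csv
import io
import json

# Hypothetical record layout: the real key names in the BPL file are not
# shown in this thread, so every key and value here is an assumption.
SAMPLE = """
[
  {
    "league": "BPL",
    "season": "2011-2012",
    "date": "2012-01-01",
    "home": "Home FC",
    "away": "Away FC",
    "goals": [{"player": "Player A", "minute": 5, "team": "home"}],
    "red_cards": []
  }
]
"""

FIELDS = ["league", "season", "date", "home", "away", "player", "minute", "team"]


def goals_to_rows(games):
    """Flatten nested games into one row per goal, repeating the match fields."""
    for game in games:
        for goal in game["goals"]:
            yield {
                "league": game["league"],
                "season": game["season"],
                "date": game["date"],
                "home": game["home"],
                "away": game["away"],
                "player": goal["player"],
                "minute": goal["minute"],
                "team": goal["team"],
            }


def to_csv(games):
    """Return CSV text (header plus one line per goal) that Excel can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(goals_to_rows(games))
    return buffer.getvalue()


csv_text = to_csv(json.loads(SAMPLE))
print(csv_text)
```

Note the trade-off: the flat CSV repeats the match details on every goal row, which is the duplication the JSON layout avoids, and a game with no goals produces no rows at all in this one-row-per-goal layout.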


At the moment I've only gotten around to scraping the BPL seasons that have all goal data on HKJC (04-05 to 16-17). I may also add corners; however, that would be the total for the game, not timings.

More Leagues and seasons will be coming soon, but I'm pretty busy with work etc atm.

The file is a .txt file stored in the zip below (I couldn't upload a .txt directly) and is less than 2 MB, so it should be easy to read and manipulate on anyone's setup.

Just to note: The data is all scraped from HKJC, so if there are any errors it would be down to the data they provided.

BPL.zip

Now for one of the main reasons I did this: @jonnyg asked me a question in another thread...
when you say easy ?

how easy ?





Well, the answer to that question is 3.5 goals (total goals / total games).

Since the 2011-2012 season, for games where the home team scored first and the first goal was scored on or before the 8th minute:
Total Games: 84
Total Goals: 294
Average Goals: 3.5
Min Goals: 1
Max Goals: 9
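
The calculation described here (keep the games where the home team scored first on or before the 8th minute, then divide total goals by total games) could be sketched roughly as follows. This is not the code used for the figures in the post; the key names (`goals`, `minute`, `team`) and the toy matches are assumptions, and minutes are assumed to be plain integers.

```python
def home_scored_first_by(game, minute_limit=8):
    """True if the first goal of the game was a home goal at or before minute_limit."""
    # sorted() is stable, so goals recorded in the same minute keep file order.
    goals = sorted(game["goals"], key=lambda goal: goal["minute"])
    if not goals:
        return False  # a 0-0 game has no opening goal
    first = goals[0]
    return first["team"] == "home" and first["minute"] <= minute_limit


def early_goal_summary(games, minute_limit=8):
    """Summarise total goals in games with an early opening goal by the home team."""
    totals = [len(g["goals"]) for g in games if home_scored_first_by(g, minute_limit)]
    if not totals:
        return None
    return {
        "games": len(totals),
        "goals": sum(totals),
        "average": sum(totals) / len(totals),
        "min": min(totals),
        "max": max(totals),
    }


# Toy data (made up) covering the edge cases worth checking by hand.
toy_games = [
    {"goals": [{"minute": 5, "team": "home"}, {"minute": 50, "team": "away"},
               {"minute": 80, "team": "home"}]},                       # counts
    {"goals": [{"minute": 3, "team": "away"}, {"minute": 7, "team": "home"}]},  # away first
    {"goals": [{"minute": 8, "team": "home"}]},                        # counts (on 8)
    {"goals": [{"minute": 9, "team": "home"}]},                        # too late
    {"goals": []},                                                     # 0-0
]

summary = early_goal_summary(toy_games)
print(summary)
```

Running a filter like this on a handful of hand-checked games first is a cheap way to catch a selection bug before trusting a headline number. The posted figures are at least internally consistent, since 294 / 84 = 3.5.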

Cheers,
Adam

The key point is: how many people would have noticed instantly that the data was way off? :idea:
Euler
Posts: 24806
Joined: Wed Nov 10, 2010 1:39 pm
Location: Bet Angel HQ

Lots of people have built models around key metrics that date back many years. I started looking at football in the '80s.

It doesn't help just posting up reams of stuff. Most people will take data, put it into a model and pop out an equation. There's just no point in cluttering up threads endlessly. It's spoiling discussion.
jonnyg
Posts: 691
Joined: Wed Jan 18, 2017 8:11 pm

Well, at least in my final post I showed how easy it is to scrape the wrong data :!:
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

jonnyg wrote:
Sun Aug 13, 2017 4:38 pm
The key point is: how many people would have noticed instantly that the data was way off? :idea:
The data wasn't way off. The code that drew the conclusion was (as I've stated in a previous post). As far as I'm aware, the data on the HKJC site is accurate. That's where I scraped it from.

If you have all of the data collected already, then please zip it up and post it. I'm sure it would help a lot of people.

Cheers,
Adam
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

Dallas wrote:
Sun Aug 13, 2017 3:48 pm
Dallas wrote:
Sun Aug 13, 2017 3:16 pm
jonnyg wrote:
Sun Aug 13, 2017 2:49 pm
Can you tell me, for example, what is the average goal production since 2011-2012 in games where the home team in the PL opened the scoring on 8 minutes?
Since 2011-2012 = 3.73
11-12 = 3.28
12-13 = 4.0
13-14 = 4.25
14-15 = 4.25
15-16 = 3.66
16-17 = 3.37
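
A side note on reading these figures: the overall 3.73 is not the plain mean of the six season averages (that comes to roughly 3.80). That is what one would expect if the overall figure is total goals divided by total games across all seasons, since seasons with more qualifying games carry more weight. The thread does not give the per-season game counts, so the second half of this sketch uses made-up counts purely to show the effect.

```python
from statistics import mean

# Season averages exactly as posted above.
season_averages = {
    "11-12": 3.28,
    "12-13": 4.0,
    "13-14": 4.25,
    "14-15": 4.25,
    "15-16": 3.66,
    "16-17": 3.37,
}

# Plain (unweighted) mean of the six season figures.
plain_mean = mean(list(season_averages.values()))
print(round(plain_mean, 2))


def pooled_average(season_totals):
    """Total goals / total games, given {season: (games, goals)}."""
    games = sum(g for g, _ in season_totals.values())
    goals = sum(n for _, n in season_totals.values())
    return goals / games


# Made-up counts (NOT from the thread): averages of 3.0 and 4.0 per season.
illustrative = {"season A": (10, 30), "season B": (30, 120)}
print(pooled_average(illustrative))  # pooled figure leans towards the bigger season
```

With the illustrative counts the season averages are 3.0 and 4.0, so their plain mean is 3.5, while the pooled figure is 150 / 40 = 3.75.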
In case anyone wants the actual data before the thread gets cluttered with it typed out

*Edit* looks like I was too late
I was having a look over this spreadsheet and scratching my head, trying to figure out why our numbers don't match. Looks like some of the games in the spreadsheet are a day out, either that or HKJC is out.
Liverpool v Fulham - 01/05/2012 on the spreadsheet
But it has a date of 02/05/2012 on HKJC. I picked out a few others. But I suppose the date of a game doesn't really matter too much (being only 1 day out).

Thanks for the spreadsheet, it's helping me double check my findings.

Adam
Dallas
Posts: 22713
Joined: Sun Aug 09, 2015 10:57 pm
Location: Working From Home

I wouldn't worry too much about it; all data sources are going to have very minor differences. The date issue could be down to matches that were played in the evening UK time but appear as the next day in Hong Kong, where HKJC is based.
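
The time zone explanation is easy to check. The sketch below converts a UK kick-off time to the calendar date in Hong Kong using Python's standard `zoneinfo` module (Python 3.9+; on Windows the `tzdata` package may be needed). The kick-off times used are assumptions for illustration, as the thread only gives dates.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

UK = ZoneInfo("Europe/London")
HK = ZoneInfo("Asia/Hong_Kong")


def hk_date_for_uk_kickoff(year, month, day, hour, minute=0):
    """Calendar date in Hong Kong at the moment of a UK-local kick-off."""
    kickoff_uk = datetime(year, month, day, hour, minute, tzinfo=UK)
    return kickoff_uk.astimezone(HK).date()


# Assumed kick-off times on 1 May 2012 (the thread does not state them).
evening = hk_date_for_uk_kickoff(2012, 5, 1, 19, 45)
afternoon = hk_date_for_uk_kickoff(2012, 5, 1, 15, 0)
print(evening, afternoon)
```

Under these assumptions an evening game on 01/05/2012 falls on 02/05/2012 in Hong Kong, while a mid-afternoon game keeps the same date, which fits only some games being a day out. When matching two sources on date, allowing a one-day tolerance (or matching on season plus home and away team) avoids dropping those games.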
jonnyg
Posts: 691
Joined: Wed Jan 18, 2017 8:11 pm

jonnyg wrote:
Sun Aug 13, 2017 2:49 pm
welshboy06 wrote:
Sun Aug 13, 2017 9:45 am
Well, the answer to that question is 3.5 goals (total goals / total games).

Since the 2011-2012 season, for games where the home team scored first and the first goal was scored on or before the 8th minute:
Total Games: 84
Total Goals: 294
Average Goals: 3.5
Min Goals: 1
Max Goals: 9

Cheers,
Adam

The number of games in that sample since 2011-2012 is bigger than 84.

Since 14-15 in the PL, there have been 100 games where the home team opened the scoring on or before 8 minutes, so 84 since 2011-2012 is way off.
jonnyg
Posts: 691
Joined: Wed Jan 18, 2017 8:11 pm

That is it from me. I think I have inspired a few to look at the early goal. Do not take a blanket approach to the early goal; look at individual leagues.

My advice is to focus on exploiting 2nd goal betting in early goal games: opening goal by the home team in minutes 0-10, and opening goal by the away team in minutes 0-20,

and on backing a further goal(s) and laying the correct score in early goal games at 80+ minutes.

I look forward to being a guest in the future and seeing someone apply in-play analysis in real time.

I did it, but Dallas kept deleting what was very strong in-play analysis.

I think we can all agree, which is a step forward, that Soccer Mystic could do with some updating :idea:
Sovereign
Posts: 39
Joined: Wed May 10, 2017 3:12 pm

I don't like to speak out, but I find the posts by jonnyg really infuriating. This thread was posted by someone willing to help the community, in a manner that is easily accessible, and yet it's descended into another 'look at me' thread for jonnyg's ego.

Jonnyg, your condescending attitude is off-putting at best, rude at worst. Your posts read like copy and pastes from your blog (which is because they are in many instances) and you seem to have a hell of a chip on your shoulder whenever you are confronted, or if anyone dares to disagree. Congratulations, you've been doing this for years. Well, so have a lot of other people on here, and they do so in a manner that helps the community, without endless spam posts and without targeting others who disagree in the process.
deansaccount
Posts: 120
Joined: Mon May 30, 2016 5:19 pm

Which blog does he copy and paste from?
Westerner
Posts: 161
Joined: Fri Apr 17, 2009 10:03 am

jonnyg wrote:
Sun Aug 13, 2017 4:19 pm
welshboy06 wrote:
Sun Aug 13, 2017 4:13 pm
jonnyg wrote:
Sun Aug 13, 2017 3:51 pm
Your 84 game chap has provided the wrong data :!:

Although it is pleasing to see people looking at the effect of the early goal.
I admit that there was an error in my code which didn't pick up all of the games. That was mainly due to me writing messy code, just to get the number you asked for.

Please don't spam this thread with your lists and lists of data. That's not why I created it.
Also, there's no need to take issue with someone who's trying to help you.
It is nice to see a chap (me) who has spent over 5 years looking at the early goal inspiring people to do the same. I have the largest early goal data bank, but people don't want me to share it :!:

With respect Jonny, how do you know you have the largest? When I did this to create my app 5 years ago it took a lot of hard work to gather goal data (times of all goals) and match with the 40,000+ games with match stats from the top leagues.

I didn't have time to keep on top of the project, so it's nice to see others doing the same, although it does bug me that I seem to have what quite a few people need and I'm not making the most of it.
welshboy06
Posts: 165
Joined: Wed Mar 01, 2017 2:06 pm

I suppose you could always make your data available too? It would never hurt to have too much data :)

Cheers,
Adam
Sovereign
Posts: 39
Joined: Wed May 10, 2017 3:12 pm

deansaccount wrote:
Mon Aug 14, 2017 9:42 am
Which blog does he copy and paste from?
I'd rather not provide him with the extra publicity, but if you copy and paste the starts of some of his posts (the ones that sound like they've been copied and pasted) you can find them word for word elsewhere.
Sovereign
Posts: 39
Joined: Wed May 10, 2017 3:12 pm

jonnyg has objected to me privately about my saying that many of his posts are copy-and-paste posts, as it might be assumed I meant from other sources, so I'm happy to correct the matter.

I meant that he copies and pastes posts from his own blog, not from others. It is his content (as far as I am able to deduce).
Westerner
Posts: 161
Joined: Fri Apr 17, 2009 10:03 am

welshboy06 wrote:
Mon Aug 14, 2017 3:53 pm
I suppose you could always make your data available too? It would never hurt to have too much data :)

Cheers,
Adam
Cheers Adam, but the problem I have is that I ended up selling 50% of the app, so technically I now only own half the data. So frustrating. I just haven't got the time to analyse it in more detail.