Predict Soccer Match Total Goals With Machine Learning
- Status: Closed
- Prize: $20
- Entries received: 13
- Winner: Gozienkwocha
Contest brief
Soccer is the world's most popular sport.
**This contest will test whether you're among the best Machine Learning engineers on Freelancer.com**
Your challenge is to use ML & Deep Learning to build a model that best classifies whether the TOTAL number of goals scored in a soccer match is over or under 2.5, given publicly available data.
The data provided includes details on a team's recent performance, probability of winning, match location, date, recent performance against the opposing team & other recent info. In all, there are close to 100 input variables provided.
You can find a definition of each input variable here: http://bit.ly/Column_definitions
For each soccer match/fixture:
If the total goals scored by both teams is greater than 2.5, its outcome is recorded as Over.
If the total goals scored is less than 2.5, its outcome is recorded as Under.
This data is recorded under each dataset’s last column called “outcome”.
A leaderboard of top 10 performing models will be posted daily on the contest's comments section.
The competition will run for 8 days.
A payout has been guaranteed & will be provided to the winner of the contest.
The data & other material:
There are 3 datasets provided (found in “Data CSVs.zip” zip file).
1. training_data.csv - This contains 100 000 matches & their outcomes that you will use to train your model(s).
2. validation_data.csv - This contains 50 000 matches & their outcomes that you will use to test/validate the performance of your model(s).
3. testing_data.csv - This contains 500 matches (without outcomes) that you will need to predict with your model & submit their results as a list of 0 or 1 as part of your submission.
When predicting, if you predict less than 2.5 total goals, label that outcome as 0; if you predict more than 2.5 total goals, label it as 1.
4. A helper_script.ipynb Python notebook has been provided. It contains prebuilt functions that help with data cleaning, encoding, imputing & model training. You may use it to transform the data & train your model.
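The Over/Under rule and the 0/1 encoding described above can be sketched in a few lines of Python. This is only an illustration: it assumes the "outcome" column stores the strings Over and Under, and the function names and the two-column sample are hypothetical, not taken from helper_script.ipynb.

```python
import csv
import io

THRESHOLD = 2.5  # goal line stated in the contest brief


def outcome_from_goals(total_goals: int) -> str:
    """Return 'Over' when total goals exceed 2.5, otherwise 'Under'."""
    return "Over" if total_goals > THRESHOLD else "Under"


def encode_outcome(outcome: str) -> int:
    """Map an outcome label to the submission encoding: Under -> 0, Over -> 1."""
    mapping = {"under": 0, "over": 1}
    return mapping[outcome.strip().lower()]


# Hypothetical two-column sample standing in for training_data.csv;
# the real files have close to 100 input columns before "outcome".
sample_csv = io.StringIO(
    "fixture_id,outcome\n"
    "1,Over\n"
    "2,Under\n"
    "3,Over\n"
)
labels = [encode_outcome(row["outcome"]) for row in csv.DictReader(sample_csv)]
print(labels)  # [1, 0, 1]
```

Because total goals are whole numbers, a match can never land exactly on 2.5, so every fixture falls into one of the two classes.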
- The F1 Score (https://en.wikipedia.org/wiki/F1_score) will be used to determine your model's performance against other contestants.
- This F1 Score will be based on the predictions you make for the data in point 3 above (testing_data.csv).
For the leaderboard, F1 Scores will be rounded off to 3 decimal places.
- Should there be a tie, all of the top positioned contestants will each get the guaranteed payout.
- **You may only post 2 submissions per day**
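To check a model locally before submitting, the F1 Score can be computed by hand as the harmonic mean of precision and recall. The sketch below assumes a binary F1 with Over (label 1) as the positive class; the brief does not say which class or averaging the organiser uses, so treat that choice, the function name and the toy labels as assumptions.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (not contest data).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_score_binary(y_true, y_pred)
print(round(score, 3))  # rounded to 3 decimal places, as on the leaderboard
```

Running the same function on validation_data.csv predictions gives the validation F1 Score that the submission asks for.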
1. You are encouraged to use Python for model construction.
2. You may use any classification technique you see fit (Deep Learning, Machine Learning).
Your submission must contain 3 things.
1. A list of your model's predictions for the first 250 matches on the testing_data.csv file. This must be posted as a comment under your submission. The comment must be of the form: First 250 entries: [0,1,0,1,0,0...,0,0]
2. A list of your model's predictions for the second 250 matches on the testing_data.csv file. This must be posted as a comment under your submission. The comment must be of the form: Second 250 entries: [0,1,0,1,0,0...,0,0]
3. A picture of your validation data F1 Score (calculated on 'validation_data.csv').
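The two prediction comments can be generated from a list of 500 predictions rather than typed by hand. This is a minimal sketch: the function name is hypothetical, and the alternating placeholder predictions stand in for real model output.

```python
def format_submission_comments(predictions):
    """Split 500 binary predictions into the two comment strings."""
    if len(predictions) != 500:
        raise ValueError("expected exactly 500 predictions")
    if any(p not in (0, 1) for p in predictions):
        raise ValueError("predictions must be 0 (Under) or 1 (Over)")

    def as_list(chunk):
        # Comma-separated, no spaces, matching the form shown in the brief.
        return "[" + ",".join(str(p) for p in chunk) + "]"

    first, second = predictions[:250], predictions[250:]
    return (
        f"First 250 entries: {as_list(first)}",
        f"Second 250 entries: {as_list(second)}",
    )


# Placeholder predictions for illustration; replace with model output
# in the same row order as testing_data.csv.
predictions = [i % 2 for i in range(500)]
first_comment, second_comment = format_submission_comments(predictions)
print(first_comment[:40])
print(second_comment[:40])
```

Keeping the predictions in the same row order as testing_data.csv matters, since the scores are computed position by position.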
You are welcome to post any questions that you have on the contest's chat board.
Are you among the best of the best in Machine Learning?
PROVE IT by winning this contest.
Skills required
Employer feedback
“Chigozie's solution was cutting edge, easy to understand & showed deep understanding of the problem. I would highly recommend him any Data Science/ Machine learning tasks & plan to work with him in future.”
LuyandaD, South Africa.