Predict Soccer Match Total Goals With Machine Learning

  • Status: Closed
  • Prize: $20
  • Entries received: 13
  • Winner: Gozienkwocha

Contest brief

Soccer is the world's most popular sport.
**This contest will test whether you're among the best Machine Learning engineers on Freelancer.com**
Your challenge is to use ML & Deep Learning to build a model that can best classify the TOTAL number of goals scored in a soccer match given publicly available data.

The data provided includes details on a team's recent performance, probability of winning, match location, date, recent performance against the opposing team & other recent info. In all, there are close to 100 input variables provided.
You can find a definition of each input variable here: http://bit.ly/Column_definitions

For each soccer match/fixture:
If the total goals scored by both teams is greater than 2.5, its outcome is recorded as Over.
If the total goals scored is less than 2.5, its outcome is recorded as Under.
This data is recorded under each dataset’s last column called “outcome”.
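
The labelling rule above can be sketched in a few lines of Python. This is only an illustration: the function name `label_outcome` and the example scorelines are hypothetical and are not taken from the contest data or the helper script.

```python
def label_outcome(home_goals: int, away_goals: int, line: float = 2.5) -> str:
    """Return 'Over' if the match total exceeds the line, else 'Under'."""
    total = home_goals + away_goals
    # Goal totals are whole numbers, so a total can never equal 2.5.
    return "Over" if total > line else "Under"


# Hypothetical scorelines (home goals, away goals), for illustration only.
examples = [(2, 1), (1, 1), (0, 0), (3, 2)]
labels = [label_outcome(h, a) for h, a in examples]
print(labels)  # ['Over', 'Under', 'Under', 'Over']
```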

A leaderboard of top 10 performing models will be posted daily on the contest's comments section.
The competition will run for 8 days.
A payout has been guaranteed & will be provided to the winner of the contest.

The data & other material:
There are 3 datasets provided (found in “Data CSVs.zip” zip file).
1. training_data.csv - This contains 100 000 matches & their outcomes that you will use to train your model(s).
2. validation_data.csv - This contains 50 000 matches & their outcomes that you will use to test/validate your model(s) performance.
3. testing_data.csv - This contains 500 matches (without outcomes) that you will need to predict with your model & submit their results as a list of 0 or 1 as part of your submission.
When predicting, if you predict less than 2.5 total goals, you will need to label that outcome as 0, if you predict more than 2.5 total goals, label that as 1.
4. A helper_script.ipynb Python notebook has been provided. This script contains prebuilt functions that will help with data cleaning, encoding, imputing & model training. You may use this script to transform the data & train your model.

Performance Criteria:
- The F1 Score (https://en.wikipedia.org/wiki/F1_score) will be used to determine your model's performance against other contestants.
- This F1 Score will be based on the predictions you make for the data in point 3 above (testing_data.csv).
- For the leaderboard, F1 Scores will be rounded to 3 decimal places.
- Should there be a tie, all of the top positioned contestants will each get the guaranteed payout.
- ** You may only post 2 submissions per day **
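
For reference, here is a minimal sketch of how an F1 Score can be computed from a list of true labels and a list of predictions. The contest text does not say which class is treated as the positive class, so `positive=1` below is an assumption, and the label lists are made up for illustration. The leaderboard values (e.g. 66.161) appear to be this score expressed as a percentage.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(round(score, 3))  # 0.75
```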

Programming Language:
1. You are encouraged to use Python for model construction.
2. You may use any classification technique as you see fit (Deep Learning, Machine Learning)

Submission:
Your submission must contain 3 things.
1. A list of your model's predictions for the first 250 matches on the testing_data.csv file. This must be posted as a comment under your submission. The comment must be of the form: First 250 entries: [0,1,0,1,0,0...,0,0]
2. A list of your model's predictions for the second 250 matches on the testing_data.csv file. This must be posted as a comment under your submission. The comment must be of the form: Second 250 entries: [0,1,0,1,0,0...,0,0]
3. A picture of your validation data F1 Score (calculated on 'validation_data.csv').
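
The two comment strings required above could be produced along these lines. This is a sketch only: `format_submission` is a hypothetical helper, and the alternating predictions are placeholders rather than real model output.

```python
def format_submission(predictions):
    """Split 500 binary predictions into the two comment strings."""
    if len(predictions) != 500:
        raise ValueError("expected exactly 500 predictions")
    if any(p not in (0, 1) for p in predictions):
        raise ValueError("predictions must be 0 or 1")

    def as_list(chunk):
        # Compact form without spaces, matching the example in the brief.
        return "[" + ",".join(str(p) for p in chunk) + "]"

    first = "First 250 entries: " + as_list(predictions[:250])
    second = "Second 250 entries: " + as_list(predictions[250:])
    return first, second


# Placeholder predictions; a real submission would use the model's output.
preds = [i % 2 for i in range(500)]
first, second = format_submission(preds)
print(first[:32])
print(second[:33])
```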

You are welcome to post any questions that you have on the contest's chat board.

Are you among the best of the best in Machine Learning?
PROVE IT by winning this contest.

Employer feedback

“Chigozie's solution was cutting edge, easy to understand & showed deep understanding of the problem. I would highly recommend him for any Data Science/Machine Learning tasks & plan to work with him in the future.”

LuyandaD, South Africa.

Public clarification board

  • Gozienkwocha
    • 1 month ago

    Many thanks to the contest holder. I really enjoyed working on this project.

    1. Gitesh98
      • 1 month ago

      Can you please share your file?

    2. Gozienkwocha
      • 1 month ago

      Hi Gitesh, I would have shared my file, but the contest holder hasn't given me permission to do so. I have left a message for him in that regard and have yet to receive a response. I need his permission because during the handover I signed an agreement transferring all code rights to him, so sharing the file might violate that agreement. Once I get his go-ahead, I'll share it with you. Thank you for your understanding.

  • LuyandaD
    Contest holder
    • 1 month ago

    I have verified the results in entry #15.

    Without disclosing the contestant's exact methods, their solution involved the following:
    1. Data cleaning & removing duplicates.
    2. Handling missing values & feature engineering on columns with dates.
    3. Using the median values to fill in missing data.
    4. Using 2 ensemble models to model the outcome & blending their predictions to arrive at a final outcome.

    This entry has been awarded the contest's prize.
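
The general steps listed above can be sketched as follows. This is not the winning contestant's code, which was not disclosed; it is a rough standard-library illustration under stated assumptions. The column names `date` and `home_form`, the equal blending weight and the 0.5 threshold are all hypothetical, and the two probability lists stand in for the output of two separately trained ensemble models.

```python
from datetime import date
from statistics import median


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row (rows are dicts)."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def add_date_features(row, column="date"):
    """Expand an ISO date string into numeric month and weekday features."""
    d = date.fromisoformat(row[column])
    out = {k: v for k, v in row.items() if k != column}
    out["month"] = d.month
    out["weekday"] = d.weekday()  # Monday == 0
    return out


def median_impute(rows, column):
    """Replace None in one numeric column with that column's median."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def blend(probs_a, probs_b, weight_a=0.5, threshold=0.5):
    """Average two models' class-1 probabilities and threshold the result."""
    mixed = [weight_a * a + (1 - weight_a) * b for a, b in zip(probs_a, probs_b)]
    return [1 if p >= threshold else 0 for p in mixed]


# Hypothetical rows; the real data has close to 100 input variables.
rows = [
    {"date": "2021-03-06", "home_form": 1.5},
    {"date": "2021-03-06", "home_form": 1.5},   # exact duplicate
    {"date": "2021-03-07", "home_form": None},  # missing value
    {"date": "2021-03-08", "home_form": 2.5},
]
clean = median_impute([add_date_features(r) for r in drop_duplicates(rows)], "home_form")

# Stand-ins for the class-1 probabilities predicted by two ensemble models.
final = blend([0.7, 0.4, 0.2], [0.5, 0.7, 0.4])
print(clean)
print(final)  # [1, 1, 0]
```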

  • LuyandaD
    Contest holder
    • 1 month ago

    The contest has now closed.
    A huge thank you to all contestants for participating.

    I will now evaluate the best performing entries & award the prize.

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 12/03/2021:

    1. Entry #15 . F1_Score: 66.161
    2. Entry #11 . F1_Score: 66.048
    3. Entry #5 . F1_Score: 65.144
    4. Entry #7 . F1_Score: 64.588
    5. Entry #8 . F1_Score: 64.437
    6. Entry #3 . F1_Score: 63.889
    7. Entry #2 . F1_Score: 63.887
    8. Entry #10 . F1_Score: 63.809
    9. Entry #9 . F1_Score: 61.504
    10. Entry #4 . F1_Score: 60.429

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 11/03/2021:

    1. Entry #11 . F1_Score: 66.048
    2. Entry #5 . F1_Score: 65.144
    3. Entry #7 . F1_Score: 64.588
    4. Entry #8 . F1_Score: 64.437
    5. Entry #3 . F1_Score: 63.889
    6. Entry #2 . F1_Score: 63.887
    7. Entry #10 . F1_Score: 63.809
    8. Entry #9 . F1_Score: 61.504
    9. Entry #4 . F1_Score: 60.429
    10.

  • LuyandaD
    Contest holder
    • 1 month ago

    Attention to all contestants

    1. I have now updated the leaderboard in a comment below.
    2. The contest will close in 18 hours, you may still submit entries until the contest has closed.
    3. Once the contest has closed, new entries will be evaluated & the leaderboard will be updated.
    4. The contestants with the top 3 entries will be asked in private chat to submit the notebooks used to generate their predictions.
    5. These notebooks will be used to reproduce results & verify that a winning entry has not been faked.
    6. The list of correct outcomes will be shared with you so that you may verify the results calculated for your own entry & that of others.

    7. Once a winning entry has been verified, the prize amount will be awarded.

    Thank you for your participation so far :)

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 10/03/2021:

    1. Entry #5 . F1_Score: 65.144
    2. Entry #7 . F1_Score: 64.588
    3. Entry #8 . F1_Score: 64.437
    4. Entry #3 . F1_Score: 63.889
    5. Entry #2 . F1_Score: 63.887
    6. Entry #10 . F1_Score: 63.809
    7. Entry #9 . F1_Score: 61.504
    8. Entry #4 . F1_Score: 60.429
    9.
    10.

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 09/03/2021:

    1. Entry #5 . F1_Score: 65.144
    2. Entry #7 . F1_Score: 64.588
    3. Entry #8 . F1_Score: 64.437
    4. Entry #3 . F1_Score: 63.889
    5. Entry #2 . F1_Score: 63.887
    6. Entry #9 . F1_Score: 61.504
    7. Entry #4 . F1_Score: 60.429
    8.
    9.
    10.

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 08/03/2021:

    1. Entry #5 . F1_Score: 65.144
    2. Entry #3 . F1_Score: 63.889
    3. Entry #2 . F1_Score: 63.887
    4. Entry #4 . F1_Score: 60.429
    5.
    6.
    7.
    8.
    9.
    10.

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 07/03/2021:

    1. Entry #3 . F1_Score: 63.889
    2. Entry #2 . F1_Score: 63.887
    3. Entry #4 . F1_Score: 60.429
    4.
    5.
    6.
    7.
    8.
    9.
    10.

  • rawatpankaj9876
    • 1 month ago

    What does the negative sign in the goal_home and goal_away columns indicate?

    1. LuyandaD
      Contest holder
      • 1 month ago

      It expresses the maximum predicted goals that each team is expected to get.
      e.g. -3.5 means this team is expected to score 3 goals or less.

  • Gozienkwocha
    • 1 month ago

    Hello. I would like to ask the contest holder whether the match winner values have any particular meaning. For example, does '1 N' mean the home team won and 'N 2' mean the away team won, and so on? Or do they represent the score of the match, so that '1 N' means the match ended 1-0 in favour of the home team, 'N 2' means it ended 0-2 in favour of the away team, and 1 and 2 mean there was a draw? Thank you.

    1. Gozienkwocha
      • 1 month ago

      Does it also imply that 1 and 2 mean an outright win for home and away, respectively? That is, 1 means the public is predicting that the home side will win, and 2 the away side.

    2. LuyandaD
      Contest holder
      • 1 month ago

      Yes, 1 means the home team is predicted to win.
      2 means the away team is expected to win outright.

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 06/03/2021:

    1. Entry #3 . F1_Score: 63.889
    2. Entry #2 . F1_Score: 63.887
    3.
    4.
    5.
    6.
    7.
    8.
    9.
    10.

  • LuyandaD
    Contest holder
    • 1 month ago

    Edit to instructions:
    Please ignore this instruction: "When predicting, if you predict less than 2.5 total goals, you will need to label that outcome as 0, if you predict more than 2.5 total goals, label that as 1."

    If you use encoding on the outcome column, the value of Under will become 1 & the value of Over will become 0.
    Your entries will be graded according to this rule going forward.
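
One likely reason for this mapping is that label encoders which sort class names alphabetically (scikit-learn's `LabelEncoder` behaves this way) place "Over" before "Under". Whether the helper script uses such an encoder is an assumption; the sketch below just reproduces the alphabetical behaviour with a hypothetical `fit_label_mapping` helper.

```python
def fit_label_mapping(labels):
    """Assign integer codes to labels in sorted (alphabetical) order."""
    return {label: code for code, label in enumerate(sorted(set(labels)))}


# Made-up outcome column values, for illustration only.
outcomes = ["Under", "Over", "Over", "Under"]
mapping = fit_label_mapping(outcomes)
encoded = [mapping[o] for o in outcomes]
print(mapping)  # {'Over': 0, 'Under': 1}
print(encoded)  # [1, 0, 0, 1]
```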

  • dataexpert18
    • 1 month ago

    #increaseprize #increaseprize #increaseprize #increaseprize

    1. LuyandaD
      Contest holder
      • 1 month ago

      Hi, which number is your entry?

  • LuyandaD
    Contest holder
    • 1 month ago

    Leaderboard 05/03/2021:

    1. Entry #2 . F1_Score: 63.89
    2.
    3.
    4.
    5.
    6.
    7.
    8.
    9.
    10.

  • LuyandaD
    Contest holder
    • 1 month ago

    Hi there.
    I am the contest's holder.
    You are welcome to post any questions here.

    I will be updating the scoreboard once every day.
    I will also post the F1 Score of each entry in its comments section.
    Reminder, only 2 entries per contestant per day.
