Python script for crawling API stops for some reason - make suggestions for improvement

  • Status: Closed
  • Prize: $20
  • Entries received: 4
  • Winner: marioada

Contest brief

Dear all,

We're using the script below to make requests through the crawling provider [login to view URL] (the documentation can be accessed here after creating a free account: [login to view URL]).

The script works well in general, but one problem remains: it simply stops working from time to time, sometimes after successfully crawling a couple of hundred URLs, sometimes only after a couple of thousand. We can't get it stable enough to crawl a few tens of thousands of URLs.

Please make suggestions right in the code - including a comment that describes why you made the change. We'll then test it and award the amount if the change brings the desired result.

Looking forward to your contributions!

Employer feedback

“Mario is a great guy and a pleasure to work with!”

thomasjohn6, Germany.

Public clarification board

  • imo581
    • 3 weeks ago

    I tried your scripts with some links. The API responds with status code 403 Forbidden. I tried to use the API using a browser and it gives me this message "Token is invalid or account is temporarily blocked! please login to your dashboard for more details". Is something wrong with your subscription?

    1. thomasjohn6
      Contest holder
      • 2 weeks ago

      Hello Islam, Thanks for your interest in the contest! I guess for somewhat obvious reasons, before posting the script in public, I removed the real token from the script :-)

  • busygayan
    • 3 weeks ago

    It literally makes no sense for you to pay for a third-party service, and their prices are pretty expensive.

    Why don't you create your own tiny system which can get this done?
    It's nothing complicated.

    1. busygayan
      • 3 weeks ago

      So 40 bucks, plus you need a server which can handle 50K plain requests per hour? So to answer the question:

      Proxy crawl cost - 2,500 USD (basic, not JavaScript)
      Custom approach cost - less than 400 USD (with a 64 GB / 16 vCPU server)

      JavaScript-based crawl on Proxy crawl - $5,054.90
      Custom approach cost - less than 1,000 USD (192 GB of RAM, 32 vCPU server)

      Besides all that, the code is custom, it's transparent, and debugging is much easier.
      Your data is private.

    2. busygayan
      • 3 weeks ago

      I have a bot which crawls Facebook daily with over 1,000 concurrent accounts. It's custom-coded using Selenium with Python, and I make over 100 requests each second (each request has its own unique IP / proxy). Still, I spend only around 2,000 on a monthly basis.

      This makes no sense, and the customer is technically being ripped off, paying almost 5x the amount. Still, the customer is stuck having to debug his own code; I'm not even going to go into why the code fails. You could pay a couple of engineers a salary and have your own servers maintained with zero issues for the amount that you spend on this company. Even if you're doing this on a small scale, it makes no sense.

      High cohesion is not bad at all, that's my point basically.
      Good Luck

  • thomasjohn6
    Contest holder
    • 3 weeks ago

    Thanks for your comment! However, for now we would like to use the convenience of such a provider. Maybe later we'll do it on our own. So do you have any idea what the problem could be in the script? Thanks in advance!
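
The script itself is not reproduced on this page, so the following is only a guess at the usual causes of a crawler that silently stops: a request made without a timeout that hangs forever, or an unhandled exception on one URL that ends the whole loop. A minimal sketch, assuming a hypothetical endpoint (`API_ENDPOINT`), a placeholder token (`API_TOKEN`), and illustrative helper names (`fetch_via_api`, `crawl_all`) that do not come from the provider's documentation:

```python
import time
import urllib.parse
import urllib.request

# Hypothetical endpoint and token: placeholders, not the provider's real API.
API_ENDPOINT = "https://api.example.com/"
API_TOKEN = "YOUR_TOKEN"


def fetch_via_api(url, timeout=30):
    """Fetch one URL through the (assumed) crawling API, with a hard timeout."""
    query = urllib.parse.urlencode({"token": API_TOKEN, "url": url})
    # Without a timeout, a stalled connection can block the script forever.
    with urllib.request.urlopen(f"{API_ENDPOINT}?{query}", timeout=timeout) as resp:
        return resp.read()


def crawl_all(urls, fetch=fetch_via_api, max_retries=3, backoff=2.0, sleep=time.sleep):
    """Crawl every URL; retry failures and never let one URL stop the run."""
    results, failed = {}, []
    for url in urls:
        for attempt in range(1, max_retries + 1):
            try:
                results[url] = fetch(url)
                break
            except Exception as exc:  # broad on purpose: log it and keep going
                print(f"attempt {attempt}/{max_retries} failed for {url}: {exc!r}")
                if attempt == max_retries:
                    failed.append(url)  # give up on this URL, continue with the next
                else:
                    sleep(backoff ** attempt)  # exponential backoff before retrying
    return results, failed
```

Calling `results, failed = crawl_all(list_of_urls)` returns the fetched bodies plus the URLs that still failed after all retries, so a long run can be resumed from `failed` instead of starting over. Whether this addresses the actual problem depends on what the original script does.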

