
In Progress
Budget: $1,800 USD (Fixed Price)
Timeline: 7-10 days
Tech Stack: Python 3.10+, Playwright (Async), SQLite, Ubuntu VPS
Skills Required: Python, Playwright, Web Scraping, Async/Await, Proxies, SQLite

═══════════════════════════════════════════════════════
PROJECT DESCRIPTION
═══════════════════════════════════════════════════════

I need 4 production-ready, high-performance web scrapers for auto parts websites and auction history. The goal is to build a robust data pipeline that runs autonomously on an Ubuntu VPS using Bright Data residential proxies.

WEBSITES TO SCRAPE:
1. [login to view URL] (Wholesale parts - Login required)
2. [login to view URL] (Wholesale parts - Login required)
3. [login to view URL] (Auction history - Public)
4. [login to view URL] (Retail catalog - Public - Complex navigation)

═══════════════════════════════════════════════════════
TECHNICAL REQUIREMENTS (Non-Negotiable)
═══════════════════════════════════════════════════════

1. ASYNC/AWAIT ARCHITECTURE
- Must use Python asyncio + Playwright
- NO Selenium allowed
- Clean, maintainable async code

2. CONCURRENCY
- Handle 10-30 concurrent browser contexts efficiently
- Proper resource management (no memory leaks)
- Configurable concurrency limits

3. BANDWIDTH OPTIMIZATION (CRITICAL)
- Block images, fonts, CSS, videos, media files using [login to view URL]()
- Target: Under 300KB per page load (vs 2-5MB unoptimized)
- This directly reduces Bright Data proxy costs by ~80%
- Log bandwidth usage per 100 pages for verification

4. DATA INTEGRITY (CRITICAL)
- NO direct CSV writing during scraping
- Must save to local SQLite database first (prevents data loss on crash)
- Database structure:
  * Each scraper has its own database: [login to view URL], [login to view URL], [login to view URL], [login to view URL]
  * Tables include: id (autoincrement), scraped_at (timestamp), all data fields
  * Checkpoint table: tracks progress (last_vehicle_id, last_page, etc.)
- Separate export script: python [login to view URL] to convert SQLite to CSV on demand

5. RESILIENCE
- Checkpoint System: If script stops at record 5,000 of 10,000, must resume exactly there
- Retry Logic: Auto-retry on 500/403 errors or timeouts
- Graceful shutdown: SIGTERM should save checkpoint before exit

6. INFRASTRUCTURE
- Must run headless on Ubuntu 24.04 VPS
- Bright Data proxy integration (credentials provided)
- Configurable via .env file
- Production-ready error logging

═══════════════════════════════════════════════════════
SPECIFIC CHALLENGES & REQUIRED SOLUTIONS
═══════════════════════════════════════════════════════

1. [login to view URL] (The Beast)
Challenge:
- Complex tree navigation system
- "Soft Blocks": infinite loading bars, empty results after N requests
- Aggressive bot detection
Required Solution:
- Implement browser fingerprinting stealth techniques (playwright-stealth or similar)
- Handle dynamic category tree expansion efficiently
- Detect and handle soft blocks (wait, retry with new session)
- Must NOT open thousands of tabs (memory explosion)

2. PARTSMAX / PRIMEROAUTOPARTS
Challenge:
- Pricing requires hover interactions or variant selection
- Distinguish "List Price" vs "Your Price" (member pricing)
Required Solution:
- Accurate hover-based data extraction
- Handle missing prices gracefully
- Extract stock availability

3. BIDFAX
Challenge:
- High volume pagination (~500,000 records)
- Potential rate limiting
Required Solution:
- Efficient pagination without timeouts
- Checkpoint every N pages
- Handle network interruptions gracefully

═══════════════════════════════════════════════════════
DELIVERABLES (Per Scraper)
═══════════════════════════════════════════════════════

For EACH of the 4 scrapers:
1. [login to view URL] - Main asynchronous scraping script
2. [login to view URL] - SQLite schema definitions
3. [login to view URL] - Centralized configuration (concurrency, timeouts, retries)
4. [login to view URL] - SQLite to CSV converter with filters (date range, vehicle type, etc.)
5. [login to view URL] - Quick script to check current scraping progress
6. [login to view URL] - All Python dependencies
7. [login to view URL] - Step-by-step setup guide specific to this scraper

PLUS Global Deliverables:
8. [login to view URL] - Bash script to install Python/Playwright/dependencies on fresh Ubuntu VPS
9. [login to view URL] - Template for proxy credentials and global settings
10. Video Walkthrough - 15-20 minute Loom/screen recording explaining:
- Code architecture
- How to deploy on VPS
- How to run each scraper
- How to monitor progress
- How to handle common errors
11. GitHub Repository: I will create a private GitHub repository and add you as a collaborator. You must push all code directly to the main or dev branch of my repository. This is a requirement for milestone payments.

═══════════════════════════════════════════════════════
DATA EXTRACTION REQUIREMENTS
═══════════════════════════════════════════════════════

PARTSMAX Output (SQLite then CSV): year, make, model, description, part_number, your_price, stock, list_price, scraped_at

PRIMEROAUTOPARTS Output: Similar structure to PartsMax but with image_url.

BIDFAX Output: final_bid, auction, lot_number, sale_date, sale_location, vin, make, model, year, Documents_title, Seller, Primary_Damage, Secondary_Damage, odometer, condition, Estimated_Retail_Value, Transmission, Keys, Fuel, drive, scraped_at

ROCKAUTO Output: year, make, model, engine, category, part_type, manufacturer, part_number, price, scraped_at

═══════════════════════════════════════════════════════
PAYMENT MILESTONES
═══════════════════════════════════════════════════════

Milestone 1 (40% - $720) - Day 3-4:
- PartsMax + primeroautoparts completed and tested
- Both tested with 1,000+ records each
- SQLite implementation verified
- Bandwidth optimization confirmed ([login to view URL] working)
- Code in GitHub repository

Milestone 2 (40% - $720) - Day 7-8:
- BidFax + RockAuto completed and tested
- RockAuto soft-block handling verified with 500+ requests
- All 4 scrapers deployed and running on VPS
- Checkpoint/resume tested (kill process, restart, verify continuation)

Milestone 3 (20% - $360) - Day 10:
- All documentation complete
- Video walkthrough delivered
- Final stress test passed (run all 4 scrapers for 2+ hours)
- 7-day support period begins

═══════════════════════════════════════════════════════
WHAT I PROVIDE
═══════════════════════════════════════════════════════

- Dedicated VPS: I will provide a clean Ubuntu 24.04 LTS VPS (DigitalOcean) with 4GB RAM / 2 CPUs. I will provide access via your SSH Public Key.
- Bright Data residential proxy credentials (high-quality proxies)
- Login credentials for PartsMax and primeroautoparts
- Sample vehicle combinations list (CSV with Year/Make/Model)
- Quick response time for questions (I'm technical, no hand-holding needed)
- Existing reference code (Selenium-based, can provide as context)

═══════════════════════════════════════════════════════
SELECTION CRITERIA
═══════════════════════════════════════════════════════

You MUST have:
- Portfolio with 5+ web scrapers (Playwright strongly preferred)
- Experience with async Python and concurrent programming
- Proxy integration experience (Bright Data, Oxylabs, or similar)
- 90%+ job success rate on Freelancer
- Ability to start within 24 hours

I will immediately reject bids that:
- Don't answer ALL screening questions below
- Propose using Selenium instead of Playwright
- Have no portfolio or generic "I can do this" responses
- Bid under $1,200 (indicates you don't understand complexity)
- Bid over $2,500 (overpriced for this scope)

═══════════════════════════════════════════════════════
SCREENING QUESTIONS (Must Answer All)
═══════════════════════════════════════════════════════

1. What specific Playwright method/approach will you use to block images and media to save proxy bandwidth? (Be specific - code snippet preferred)
2. How do you handle RockAuto's "soft blocks" or infinite loading states? What's your detection and recovery strategy?
3. Have you scraped PartsMax, primeroautoParts, or similar wholesale auto parts portals before? If yes, which ones?
4. Describe your exact approach for SQLite checkpoint/resume. What happens if the script crashes at record 5,432 of 10,000?
5. How many concurrent Playwright contexts can you safely run on a 4GB RAM VPS without memory issues?
6. Share a link to your best async web scraper (GitHub or portfolio). What was the scale? (Records scraped, pages/hour, etc.)
7. Are you available to start immediately and deliver in 7-10 days?
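
For reference on screening question 1, the kind of snippet being asked for could look roughly like the sketch below. It assumes Playwright's request routing (`context.route` with `route.abort()`) is the intended mechanism; the set name `BLOCKED_TYPES` and the helper `fetch_title` are hypothetical and exist only for illustration.

```python
import asyncio

# Resource types to drop. The brief names images, fonts, CSS, videos and media;
# the exact set is an assumption and should be tuned per site.
BLOCKED_TYPES = {"image", "font", "stylesheet", "media"}


async def block_heavy_resources(route):
    """Abort requests for heavy assets and let everything else through."""
    if route.request.resource_type in BLOCKED_TYPES:
        await route.abort()
    else:
        await route.continue_()


async def fetch_title(url: str) -> str:
    """Hypothetical helper: load one page with blocking enabled."""
    # Imported lazily so the handler above can be exercised without a browser.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            context = await browser.new_context()
            # Apply the handler to every request made by pages in this context.
            await context.route("**/*", block_heavy_resources)
            page = await context.new_page()
            await page.goto(url, wait_until="domcontentloaded")
            return await page.title()
        finally:
            await browser.close()
```

Blocking stylesheets can change what a page renders, so hover-based pricing on the wholesale portals may need CSS left enabled; that trade-off would have to be checked per site.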

═══════════════════════════════════════════════════════
REQUIREMENTS & PROFILE
═══════════════════════════════════════════════════════

I am a technical founder (Licensed Dealer). This is Phase 1 of a larger infrastructure (8+ future scrapers planned).

MANDATORY: Async Playwright Only (No Selenium). Must deploy on Ubuntu VPS. Must deliver in 7-10 days.

═══════════════════════════════════════════════════════
TO APPLY (Must Include)
═══════════════════════════════════════════════════════

- Answer all 7 screening questions.
- Link to GitHub repos showing large-scale Async Playwright scrapers.
- Confirm 24h start and $1,800/10-day terms.
- Suggest one technical improvement to my approach.
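
The SQLite-first and checkpoint requirements above can be illustrated with a minimal sketch using only the standard library. The table name `parts`, its two data columns, and the helper names are assumptions for the example; the brief only fixes `id`, `scraped_at`, and a checkpoint table with fields such as `last_page`. The idea shown is writing a page's rows and its checkpoint in one transaction, so a crash leaves neither half-written.

```python
import sqlite3
from datetime import datetime, timezone


def open_db(path: str) -> sqlite3.Connection:
    """Create the data and checkpoint tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parts ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "scraped_at TEXT NOT NULL, "
        "part_number TEXT, "
        "price REAL)"
    )
    # Single-row table: id is pinned to 1 so there is only ever one checkpoint.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoint ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), "
        "last_page INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def last_page(conn: sqlite3.Connection) -> int:
    """Return the last fully saved page, or 0 on a fresh database."""
    row = conn.execute("SELECT last_page FROM checkpoint WHERE id = 1").fetchone()
    return row[0] if row else 0


def save_page(conn: sqlite3.Connection, page_no: int, rows) -> None:
    """Store one page of (part_number, price) rows and advance the checkpoint."""
    now = datetime.now(timezone.utc).isoformat()
    # 'with conn' commits on success and rolls back if anything raises,
    # so rows and checkpoint always move together.
    with conn:
        conn.executemany(
            "INSERT INTO parts (scraped_at, part_number, price) VALUES (?, ?, ?)",
            [(now, part_number, price) for part_number, price in rows],
        )
        conn.execute(
            "INSERT OR REPLACE INTO checkpoint (id, last_page) VALUES (1, ?)",
            (page_no,),
        )
```

On restart, a scraper built this way would call `last_page(conn)` and continue from the following page. A SIGTERM handler then only needs to let the current `save_page` call finish before exiting.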
Project ID: 40189708
148 proposals
Remote project
Active 4 mos ago

Hello! These scrapers usually fail from heavy bandwidth, weak resume logic, and RockAuto soft blocks, so I will build them lean, resumable, and self-healing.
- Async Playwright only with asyncio, with concurrency limits of 10 to 30 configurable per site
- Bandwidth saving: route abort for images, fonts, CSS, and media, and log KB per 100 pages
- SQLite first: one DB per site plus a checkpoint table, so crashes resume exactly where they stopped
- Resilience: retries on 403, 500, and timeouts; SIGTERM saves checkpoint; clean structured logs
- RockAuto: single-tab tree navigation, stealth, detect infinite loaders, rotate session and back off on soft blocks
- PartsMax and Primero: login flow, hover or variant pricing, capture list price vs. your price and stock
- Deliverables: [login to view URL] [login to view URL] [login to view URL] [login to view URL] checkpoint viewer, [login to view URL] env template, README, and a walkthrough video in your GitHub
I can start within 24 hours and deliver within your 1800 budget window.
Warm regards, Yulius Mayoru
$1,500 USD in 7 days
5.4
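
The retry behaviour named in the bid above (retries on 403, 500, and timeouts, with backoff) could be sketched as follows. This is an illustration under stated assumptions, not the bidder's code: `with_retries`, `RetryableError`, and the `(status, body)` return shape of `fetch` are hypothetical, and a real scraper would need to map Playwright's own timeout exception onto `TimeoutError` inside `fetch`.

```python
import asyncio
import random

# Status codes the brief asks to retry on.
RETRY_STATUSES = {403, 500}


class RetryableError(Exception):
    """Raised when every attempt ended in a retryable status."""


async def with_retries(fetch, *, attempts=4, base_delay=1.0, sleep=asyncio.sleep):
    """Call an async fetch() -> (status, body) with exponential backoff.

    'sleep' is injectable so the backoff can be tested without waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            status, body = await fetch()
            if status not in RETRY_STATUSES:
                return status, body
            error = RetryableError(f"HTTP {status}")
        except (asyncio.TimeoutError, TimeoutError) as exc:
            error = exc
        if attempt == attempts:
            raise error
        # Exponential backoff plus jitter so parallel workers do not retry in step.
        delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
        await sleep(delay)
```

For soft blocks, the same wrapper could be reused if `fetch` opens a fresh browser context on each attempt, which would correspond to the "rotate session" step.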
148 freelancers are bidding on average $2,057 USD for this job

Hello, your project "4 Async Web Scrapers (Auto Parts) - Python/Playwright/BrightData" aligns perfectly with my skills in PHP, Python, Web Scraping, Software Architecture, and MySQL. The budget can be adjusted after a detailed discussion of the project scope. My focus is to deliver quality within your budget. With 15 years of experience, customer satisfaction is my top priority. I am eager to start and showcase my commitment to this project. Looking forward to discussing the job details with you. Thank you.
$2,100 USD in 21 days
8.7

⭐⭐⭐⭐⭐ Build 4 High-Performance Web Scrapers for Auto Parts Websites ❇️ Hi My Friend, I hope you're doing well. I've reviewed your project requirements and see you're looking for high-performance web scrapers for auto parts websites. Look no further; Zohaib is here to help you! My team has completed over 50 similar projects for web scraping. I will create robust scrapers using Python and Playwright to ensure efficient data extraction while maintaining data integrity and performance. ➡️ Why Me? I can easily do your web scraping project as I have 5 years of experience in Python, Playwright, and asynchronous programming. My expertise includes web scraping, data management, and error handling. Additionally, I have a strong grip on database integration and proxy management to ensure smooth operation. ➡️ Let's have a quick chat to discuss your project in detail. I can show you samples of my previous work that demonstrate my skills in creating efficient web scrapers. Looking forward to discussing this with you in our chat. ➡️ Skills & Experience: ✅ Python Programming ✅ Playwright Framework ✅ Web Scraping ✅ Async/Await ✅ SQLite Database ✅ Proxy Integration ✅ Data Integrity ✅ Error Handling ✅ Resource Management ✅ Bandwidth Optimization ✅ Concurrency Control ✅ Documentation Writing Waiting for your response! Best Regards, Zohaib
$1,800 USD in 2 days
8.1

Hi Carlos A., ➡️ I read your project description and understand you need 4 high-performance web scrapers for auto parts websites and auction history, using Python/Playwright and integrating with a SQLite database on an Ubuntu VPS. ⏺️ With over 12 years of experience as a Full Stack Developer, I specialize in Python and async programming. My portfolio includes extensive work with Playwright and proxy management, ensuring efficient and robust scraper designs. I have developed similar web scrapers that effectively handle complex site navigation, dynamic content, and concurrency, while ensuring data integrity and minimizing bandwidth usage through strategic resource management. Regards, Aftab Ahmad Full Stack Developer (12 Years of experience)
$3,000 USD in 30 days
7.4

Hi there, I’m excited about the opportunity to create four high-performance web scrapers specifically tailored for the auto parts industry. As a top freelancer from California with 5-star reviews, I understand the complexities involved in scraping wholesale and retail websites, particularly with the need for precision and efficiency in challenging environments. With extensive experience in Python, Async/Await architecture, and Playwright, I can assure you that your requirements for concurrency, bandwidth optimization, and data integrity will be met. I have developed similar projects before, allowing for resilient scraping with robust error handling and checkpoint systems, especially addressing the specific challenges of scraping sites like RockAuto and PartsMax. I would love to discuss how I can tailor these scrapers to your needs and start as soon as possible. Let’s connect to finalize the details and get started! What specific metrics are you looking to analyze from the scraped data, and how do you envision using this data in your larger infrastructure? Thanks,
$2,750 USD in 1 day
6.7

I'm experienced in developing high-performance web scrapers using Python and Playwright, with a focus on asynchronous architecture. I have a strong background in web scraping, async/await, proxies, and SQLite database management. My expertise aligns perfectly with the technical requirements outlined in the project description. I am confident in my ability to deliver robust and efficient scrapers for the specified auto parts websites within the given timeline.
$3,000 USD in 7 days
6.7

Hi there, hope you are doing well. I checked your project description carefully. I can complete this project perfectly within your timeline because I am very experienced in web scraping and backend development. Please ping me. Thanks
$2,000 USD in 10 days
6.9

With my solid background in software engineering and my deep understanding of Python and Linux systems, I believe I am the right person for this project. I have well-rounded experience in web scraping within a variety of contexts and understand the unique challenges it can present. For instance, in one of my past projects, we faced the challenge of handling rate limits during high-volume pagination much like the one on Bidfax. My team and I came up with a solution that allowed for efficient pagination without timeouts, checkpointing every N pages to save progress as you rightfully require. Furthermore, I have extensive experience using Python asyncio and Playwright, which are crucial to achieving your project's concurrency and bandwidth optimization goals. In a recent project, we were tasked with building high-performance web scrapers that could efficiently handle concurrent browser instances just as you require. We successfully accomplished this by leveraging our grasp of async programming, employing proper resource management techniques, and configuring manageable concurrency limits.
$3,000 USD in 30 days
7.0

Hello Carlos A., I checked your project, and it looks interesting. This is something we already work on, so the requirements are clear from the start. We mainly work on PHP, Python, Web Scraping, Software Architecture, MySQL, Node.js, Web Development, Backend Development, Database Management, and API Development. We focus on making things simple, reliable, and actually useful in real life, not overcomplicated stuff. Let’s connect in chat and see if we’re a good fit for this. Best Regards, Ali nawaz
$1,500 USD in 15 days
6.3

Hello, I understand you're looking for four production-ready web scrapers for auto parts and auction history, aiming for high performance and robust data integrity. My experience with async web scraping using Python and Playwright makes me a strong candidate for this project. In previous projects, I successfully developed similar scrapers that efficiently handled complex sites, ensuring data accuracy and optimal bandwidth usage. ✅My Plan: - Utilize Python asyncio and Playwright for a clean, maintainable codebase. - Implement concurrency to manage multiple browser contexts without memory leaks. - Optimize bandwidth by aborting unnecessary media, targeting under 300KB per page load. - Design a resilient checkpointing system to ensure data integrity, enabling seamless restarts. Could you clarify the preferred method for handling any IP bans or rate limits during scraping? Additionally, how flexible are you with the timeline if challenges arise? Best regards, Hongqiang Chen
$1,800 USD in 12 days
5.5
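
The concurrency point in the plan above (many browser contexts, a configurable limit, no leaks) is commonly handled with a semaphore. The sketch below is a generic illustration, not this bidder's implementation; `run_bounded` and its `worker` callback are hypothetical names. In a Playwright scraper the worker would open a browser context, scrape, and close the context in a `finally` block.

```python
import asyncio


async def run_bounded(jobs, worker, limit=10):
    """Run worker(job) for every job with at most 'limit' running at once."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        # Each job waits here until one of the 'limit' slots is free.
        async with sem:
            return await worker(job)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

The `limit` value would come from the `.env` configuration, and a safe number on the 4GB VPS is something to measure rather than assume.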

https://www.freelancer.com/projects/data-scraping/Automated-Counterfeit-Detection/reviews Dear client, nice to meet you. I am very pleased to submit my proposal for your scraping and automation project. I have a lot of experience in this field using Python. Recently, I developed an Automated Counterfeit Detection and Reporting System on Amazon. You can check this in my portfolio. I am confident I can do this, and I can start immediately. I will wait for your good news. Thank you.
$1,500 USD in 7 days
5.6

Hello, I’m excited about the opportunity to help scrape those sites. I can complete this project perfectly if you hire me. Best regards, Juan
$1,800 USD in 7 days
5.6

Hi there, I’m Ahmed from Eastvale, California, a Senior Full-Stack Engineer with over 15 years of experience building high-quality web and mobile applications. After reviewing your job posting, I’m confident that my background and skill set make me an excellent fit for your project, 4 Async Web Scrapers (Auto Parts) - Python/Playwright/BrightData. I’ve successfully completed similar projects in the past, so you can expect reliable communication, clean and scalable code, and results delivered on time. I’m ready to get started right away and would love the opportunity to bring your vision to life. Looking forward to working with you. Best regards, Ahmed Hassan
$2,500 USD in 1 day
5.2

Hello! I understand that you're looking for four high-performance web scrapers specifically designed for auto parts websites, to streamline data extraction and build a robust data pipeline on an Ubuntu VPS. I will employ Python with the Playwright async framework to build efficient, maintainable scrapers that can handle concurrent sessions while optimizing bandwidth usage to reduce proxy costs. The code will be thoroughly documented, and I'll ensure resilience through a checkpoint system. For examples of my previous work, please check my profile. Regards, Davide
$1,800 USD in 20 days
5.2

Dear Client, I have carefully reviewed your project requirements for developing 4 high-performance web scrapers for auto parts websites and auction history. With over 8 years of experience in Python and Playwright, I am confident in my ability to deliver a robust solution that meets your specifications. My approach will involve utilizing Python's asyncio along with Playwright to ensure efficient asynchronous scraping. I will implement strategies to handle concurrency, optimize bandwidth usage, maintain data integrity using SQLite databases, and ensure resilience through checkpoint systems and retry logic. I am keen to discuss your project further and share my detailed plan for its successful execution. I am available to start immediately and commit to delivering within the specified timeline. I look forward to the opportunity to collaborate with you on this exciting project. Best regards, Toma K.
$1,500 USD in 7 days
5.4

Hello, I've noticed the intricate challenges posed by the different auto parts websites you need scraped, especially the complex navigation system of RockAuto.com. I will tackle this by implementing advanced browser fingerprinting techniques and efficient handling of dynamic category tree expansions. My experience in developing high-performance web scrapers using Python, Playwright, and SQLite includes a similar project where I successfully extracted data from multiple sources concurrently and stored it securely. Happy to jump on this. – Justin
$2,500 USD in 7 days
4.8

⭐️⭐️⭐️⭐️⭐️ Hello, I appreciate the opportunity to collaborate on this project! I assure you that I have the expertise in Python, Playwright, and web scraping to create the 4 high-performance scrapers you need. My commitment to client satisfaction means I will ensure each scraper meets your requirements for efficiency, resilience, and data integrity. My approach includes developing asynchronous scripts with precise error handling and robust logging on your Ubuntu VPS. I have experience with Bright Data proxies and can efficiently manage concurrency while optimizing bandwidth. With a strong portfolio of async web scrapers, I'm ready to start within 24 hours and deliver within 10 days. Let's create a seamless data pipeline together! Looking forward to your response. Abdulhamid.
$1,500 USD in 7 days
4.9

Hi, I am enthusiastic about your project for building 4 high-performance web scrapers tailored for auto parts websites. With over 7 years of experience in web scraping and a strong command of Python and Playwright, I have successfully implemented async solutions that handle complex data structures efficiently. I specialize in bandwidth optimization, concurrency management, and data integrity to ensure each scraper operates seamlessly on your Ubuntu VPS with Bright Data proxy integration. I can deliver all components within your required timeline of 7-10 days. What specific concerns do you have regarding the implementation of the scrape for RockAuto? Best regards, Andrii
$2,500 USD in 10 days
4.5

Hi, I am eager to assist you with building the four async web scrapers for auto parts websites. With over 10 years of experience in Python and expertise in Playwright and web scraping, I am confident in delivering high-performance solutions tailored to your requirements. I will ensure that my scrapers are production-ready, efficiently handling concurrency while optimizing bandwidth and maintaining data integrity through SQLite. My development process includes comprehensive error handling and a robust checkpoint system to guarantee scrapers can resume execution without data loss. I can start within 24 hours and deliver the complete solution within 10 days. Let’s connect to discuss any specific initial requirements or clarify project details further.
$2,500 USD in 5 days
4.7

Hi there!
The goal of the project: build four production-ready async Playwright web scrapers with strong resilience, bandwidth optimization, and SQLite-based data integrity, running autonomously on an Ubuntu VPS.
I have carefully read the complete project description and understand the non-negotiable async Playwright architecture, Bright Data proxy usage, aggressive bandwidth blocking, checkpoint resume logic, SQLite-first storage, complex RockAuto anti-bot handling, login-based wholesale portals, and strict deployment and documentation requirements within a 7 to 10 day timeline.
I am the best fit because I have deep experience delivering large-scale async scraping systems that are stable, efficient, and production hardened:
- Async Playwright scrapers with concurrency control, media blocking, and proxy rotation
- SQLite-based checkpoint and resume system with crash-safe data persistence
- Ubuntu VPS deployment, logging, testing, and full GitHub-based delivery
I provide UI design where needed, database management, full testing, deployment support, documentation, and full source code delivery at project completion. I bring 9+ years of experience as a full stack developer and have already completed high-volume async Playwright scrapers for ecommerce catalogs, auction data, and authenticated B2B portals using residential proxies.
Looking forward to chatting with you to make a deal.
Best Regards, Elisha Mariam
$2,899 USD in 36 days
4.7

Hi, hope you are doing well. I've read your project description very carefully, and I am confident about your project. I understand that you need 4 production-ready, high-performance web scrapers for auto parts websites and auction history, designed to run autonomously on an Ubuntu VPS using Bright Data residential proxies. I have hands-on experience in developing complex web scrapers utilizing Python 3.10+, Playwright (Async), and SQLite, ensuring they are efficient and resilient against challenges like bot detection and network limitations. This is my approach:
- Implement async/await architecture to optimize performance and resource management.
- Utilize browser fingerprinting techniques to navigate complex sites like RockAuto without triggering soft blocks.
- Develop a robust checkpoint system to ensure data integrity and seamless recovery from interruptions.
I can start immediately and complete the work within a short timeline of 7-10 days. I'm excited about the opportunity to bring your project to life while ensuring it's efficient and scalable. Looking forward to your reply!
$1,500 USD in 7 days
4.8

DORAL, United States
Payment method verified
Member since May 30, 2025