
Open
Ends in 5 days
Paid on delivery
I need a reliable solution that automatically pulls public-facing text from specific social-media websites and delivers the content back to me in a clean, structured format ready for downstream analysis.

The task covers three steps:
• Building or customising a scraper in Python that navigates the chosen platforms, capturing posts, comments and any associated meta-information I specify
• Cleaning the raw output to remove emojis, markup, duplicate lines and other noise
• Packaging the final, deduplicated dataset as a CSV or JSON file

Please write the code so I can rerun it anytime (command-line script or Jupyter notebook is fine) and include concise setup instructions plus brief inline documentation. I expect respectful rate-limit handling and compliance with each platform’s public-data policies.

Acceptance will be based on:
• Accurate capture of the requested text fields from the sample profile list I provide
• Fewer than 1% duplicate rows after cleaning
• Script runs end-to-end on my machine with only standard Python libraries or clearly listed open-source dependencies

If you already have experience scraping social platforms via Selenium, BeautifulSoup, Scrapy or similar tools, that would be ideal.
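To make the cleaning and packaging steps concrete, here is a rough sketch of the kind of output stage I have in mind. It is illustrative only, not a required implementation. The function names (`clean_text`, `clean_and_deduplicate`, `to_csv`) and the `author`/`text` field names are hypothetical placeholders, and the emoji filter is a coarse heuristic based on Unicode categories that also drops symbols such as © and ™. It uses only the standard library.

```python
import csv
import html
import io
import re
import unicodedata

TAG_RE = re.compile(r"<[^>]+>")   # naive markup matcher; a real HTML parser is more robust
WS_RE = re.compile(r"\s+")


def _is_noise_char(ch: str) -> bool:
    """Heuristic emoji test (assumption: dropping every 'Symbol, other' char is acceptable)."""
    cp = ord(ch)
    return (
        unicodedata.category(ch) == "So"   # most emoji and pictographs
        or 0x1F3FB <= cp <= 0x1F3FF        # emoji skin-tone modifiers
        or cp in (0x200D, 0xFE0F)          # zero-width joiner, emoji variation selector
    )


def clean_text(raw: str) -> str:
    """Strip tags, decode HTML entities, drop emoji, collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    text = "".join(ch for ch in text if not _is_noise_char(ch))
    return WS_RE.sub(" ", text).strip()


def clean_and_deduplicate(rows, key="text"):
    """Clean the `key` field of each row and keep the first row per distinct text.

    Rows whose text is empty after cleaning are dropped. Duplicates are detected
    case-insensitively on the cleaned text alone, which is an assumption; keying
    on (author, text) may be preferable for some datasets.
    """
    seen = set()
    unique = []
    for row in rows:
        cleaned = clean_text(row[key])
        fingerprint = cleaned.casefold()
        if fingerprint and fingerprint not in seen:
            seen.add(fingerprint)
            unique.append({**row, key: cleaned})
    return unique


def to_csv(rows, fieldnames) -> str:
    """Serialise rows to CSV text; fields not listed in `fieldnames` are ignored."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A typical call would pass the scraper's raw rows through `clean_and_deduplicate` and write the result of `to_csv` to a file opened with `newline=""` and `encoding="utf-8"`. JSON output could be produced from the same rows with `json.dump`.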
Project ID: 40385875
10 proposals
Open for bidding
Remote project