Closed

Java scrapping improve expert

I developed a Java program to scrape information from a website. The architecture of the solution involves: 1) using Java Selenium to send requests to the webpage via Chrome WebDriver to trigger authentication and authenticated requests; 2) routing the requests from Chrome (headless) through Java BrowserMobProxy to capture three HTTP headers (Authorization, X-CSRF-TOKEN, and Cookie) and one query string (without these, the server starts responding with 512 after a few requests); and 3) using these four elements in HTTPS requests sent from Java directly to the website (i.e., without Selenium, Chrome, or BrowserMobProxy involved) to retrieve the desired information.
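
Step 3 above, reusing the captured values in a direct HTTPS request, could look roughly like the following with the JDK's built-in `java.net.http` client (Java 11+). This is only a sketch: the URL, the token values, and the `buildRequest` helper are placeholders for illustration and are not taken from the actual program.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

class AuthenticatedRequestSketch {

    // Builds a GET request that carries the three captured headers and the
    // captured query string. All argument values are supplied by the caller.
    static HttpRequest buildRequest(String baseUrl, String queryString,
                                    String authorization, String csrfToken, String cookie) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "?" + queryString))
                .timeout(Duration.ofSeconds(30))
                .header("Authorization", authorization)
                .header("X-CSRF-TOKEN", csrfToken)
                .header("Cookie", cookie)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values; the real ones come from the authenticated session.
        HttpRequest request = buildRequest(
                "https://example.com/api/data", "client_id=placeholder",
                "Bearer placeholder-token", "placeholder-csrf", "session=placeholder");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map().keySet());
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Building the request separately from sending it keeps the header handling testable without touching the network.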

The program performs the basic task of extracting the information, but it has a few problems:

It depends on an external non-Java component: Chrome WebDriver

It depends on Java Selenium and Java BrowserMobProxy, two dependencies that I would like to remove

It is not optimized (it refreshes too often and sleeps for too long) relative to the limit at which the website (behind Cloudflare) starts responding with 429 errors. As a result, retrieving the information takes much longer than necessary.
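
One way to cut the excessive refreshing described in the last point is to cache the captured values and regenerate them only when they are stale or the server rejects them. A minimal sketch, assuming a hypothetical `refresher` that performs the real authentication and an illustrative maximum age (the actual token lifetime is not stated here and would have to be measured):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

class CachedTokensSketch<T> {
    private final Supplier<T> refresher; // hypothetical: performs the real authentication
    private final Duration maxAge;       // assumed lifetime; must be measured
    private T value;
    private Instant fetchedAt;

    CachedTokensSketch(Supplier<T> refresher, Duration maxAge) {
        this.refresher = refresher;
        this.maxAge = maxAge;
    }

    // Returns the cached value, refreshing it only if it is missing or too old.
    synchronized T get(Instant now) {
        if (value == null || Duration.between(fetchedAt, now).compareTo(maxAge) >= 0) {
            value = refresher.get();
            fetchedAt = now;
        }
        return value;
    }

    // Call this when the server rejects the tokens, to force a refresh on the next get().
    synchronized void invalidate() {
        value = null;
    }

    public static void main(String[] args) {
        int[] refreshes = {0};
        CachedTokensSketch<String> tokens = new CachedTokensSketch<>(
                () -> "token-" + (++refreshes[0]), Duration.ofMinutes(5));
        Instant start = Instant.now();
        System.out.println(tokens.get(start));                   // first call refreshes
        System.out.println(tokens.get(start.plusSeconds(60)));   // still fresh, reused
        System.out.println(tokens.get(start.plusSeconds(600)));  // older than maxAge, refreshed
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside `get`, makes the expiry logic easy to test.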


You will receive the program's current Java code and will need to solve the problems above. To do so, you will need to:

A. Find out how to authenticate and refresh the three headers and the query string without depending on Selenium, Chrome WebDriver, or BrowserMobProxy. As most of this data is likely generated in JavaScript, you will need to know JavaScript and how to either execute JavaScript from within Java or convert the JavaScript code to Java (the preferred solution).

B. Identify the limit at which the website (behind Cloudflare) starts responding with 429 errors, and tune the header refresh frequency and the sleep periods to that limit. You will need to demonstrate the benefit of your changes by extracting the same information the program currently extracts and measuring how long it takes.
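
The rate limit in task B can be probed empirically. The sketch below shows one possible approach, not necessarily the one the existing program should adopt: shorten the pause between requests slowly while responses succeed, and double it whenever a 429 arrives, so that the delay settles near the server's threshold. The class name and all numeric values are illustrative assumptions.

```java
class AdaptiveDelaySketch {
    private final long minDelayMs;
    private final long maxDelayMs;
    private final long stepMs;
    private long delayMs;

    AdaptiveDelaySketch(long initialDelayMs, long minDelayMs, long maxDelayMs, long stepMs) {
        this.delayMs = initialDelayMs;
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.stepMs = stepMs;
    }

    // Adjusts the delay according to the status code of the last response
    // and returns the pause to apply before the next request.
    long onResponse(int statusCode) {
        if (statusCode == 429) {
            // Rate limited: back off quickly, up to the configured maximum.
            delayMs = Math.min(maxDelayMs, delayMs * 2);
        } else if (statusCode >= 200 && statusCode < 300) {
            // Success: speed up cautiously, down to the configured minimum.
            delayMs = Math.max(minDelayMs, delayMs - stepMs);
        }
        // Other status codes leave the delay unchanged.
        return delayMs;
    }

    public static void main(String[] args) {
        AdaptiveDelaySketch pacing = new AdaptiveDelaySketch(1000, 100, 60_000, 50);
        int[] simulatedStatuses = {200, 200, 429, 200, 429, 429, 200};
        for (int status : simulatedStatuses) {
            System.out.println(status + " -> wait " + pacing.onResponse(status) + " ms");
        }
    }
}
```

Logging the delay at which 429 responses first appear gives an estimate of the limit. If the server includes a `Retry-After` header with a 429, honoring it is usually safer than relying on a computed delay.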

Note: you will need to create your own login and password on the website. Registration has no additional requirements.

Skills: Java, Web Scraping, JavaScript, Python, Software Architecture


About the Employer:
(1 review) Băilești, Romania

Project ID: #26951665

12 freelancers are bidding an average of $179 for this job


Hi, greetings! ✅ Checked your project details: Java scrapping improve expert ✅ Completion time: within the project deadline. We have worked on 650+ projects. I have 6+ years of experience in the same kind of projects. If …

$240 USD in 5 days
(128 reviews)

I can replace the existing scraper with the Python requests library, removing the headache of JARs, Selenium, and ChromeDriver. It will be very lightweight. To resolve the 429 errors, we need to use a proxy or a delay. These are my …

$167 USD in 2 days
(55 reviews)

Hi, Manager! I am a web scraping expert. I have experience with web scraping and automated mail-sending apps, so I think I am a good fit for this project. I can work 24/7. If required, I can work 60 hours per week. If you hire me, y …

$200 USD in 3 days
(29 reviews)

Hi, I am a very experienced statistician, data scientist, and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several comp …

$250 USD in 7 days
(25 reviews)

Hi there, let's have a quick chat to discuss this project. I am an expert in Python, PHP, JavaScript, web scraping, and MySQL, and I have the expertise for this project. You can check my portfolio here: https://www.freelancer.com/ …

$250 USD in 3 days
(18 reviews)

Hello, I am very interested in your project "Java scrapping improve expert". Web scraping is my best skill. I have read the job description and I am interested in this job. I have 8 years of experience in developing produ …

$140 USD in 7 days
(14 reviews)

Hi there, I am bidding on your project, and I am good in this field. I can do this for you honestly and within the due time. I also have a few questions to discuss. Kindly contact me and we will discuss time and budge …

$140 USD in 7 days
(17 reviews)

Hello, I hope your family is safe from Covid-19. I am a Java full-stack developer with more than 5 years of hands-on experience working on various websites and applications. I have an expert development team. All the resou …

$140 USD in 7 days
(10 reviews)

Dear Employer, thanks for posting the project. I have gone through your description, "need to create your own login/password in the webpage. No additional requirements exist to register.", and I believe I'm capable to …

$140 USD in 7 days
(0 reviews)

Hi! I am happy to bid on your project. I have read your requirements carefully and I am confident about this project. I am a skillful and experienced web developer. I have tons of experience with Java/Python/MY …

$140 USD in 7 days
(0 reviews)

Hi there, in order to scrape data from eToro, may I suggest an alternative approach to yours? Selenium has inherent limitations, and that is why your solution is unnecessarily complex. I suggest Puppeteer (https://dev …

$200 USD in 3 days
(0 reviews)

Hi, I am a new freelancer. I have professional experience in [login to view URL], JavaScript, C#, web scraping, data mining, Python, and Tableau. I am also a certified Chartered Accountant. I can deliver high-quality project and …

$140 USD in 5 days
(1 review)