I have an immediate need for .NET developers to collect (scrape) real estate property and tax data from over 130 public websites in the US. We need to add these sites to our existing server environment, described below, right away, and an additional 1,000 sites will be required once this project has been successfully implemented. Beyond that, we anticipate many more contracts based on our prior performance. The goal is to have the first 130+ sites coded within 30 to 45 days, with incremental deliverables along the way. We are inviting multiple individuals or teams to request an opportunity to participate. Each qualified individual or team that is selected will be awarded two sites to collect data from. Upon successful implementation, as judged by the quality of the data returned and the timeliness of completion, we will award additional contracts incrementally or in bulk. The candidates who deliver the best performance will be invited to participate in future contracts.
The company I represent has an existing web application running on Windows Server 2012. The application accepts CSV and Excel (XLS/XLSX) files containing the search parameters and keys that the scraping applications use as input. These input files are uploaded by our client and converted to jobs in a database. When the client starts a saved job, the Job Manager launches as many instances of the scraper applications as are needed, or as the server can support, and tells each one which properties to collect data for. The scraper applications are written as .NET service applications that support multi-threaded execution, so that many instances of the same county scraping application can run at the same time. Once a scraper application has finished collecting its data, it returns the data to the Job Manager as a CSV, XLS(X) or XML file in the prescribed standard format. Another service application then parses the data and writes it into a SQL Server database, where it is stored until the client requests the output in CSV, XLS(X), XML or JSON format, either through the web interface or through our custom API.
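To make the workflow above concrete for prospective bidders, here is a minimal C# sketch of the scraper side of that contract: read keys from an input CSV, scrape them in parallel, and return a CSV. Every name in it (PropertyRequest, PropertyResult, ScrapeJob, the column names) is hypothetical, since the real input layout and the prescribed output format are only shared with qualified respondents, and the per-site scraping logic is left as a delegate the caller supplies.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

// Hypothetical record shapes; the posting does not publish the real formats.
public sealed class PropertyRequest
{
    public string County { get; set; }
    public string ParcelId { get; set; }
}

public sealed class PropertyResult
{
    public string ParcelId { get; set; }
    public string OwnerName { get; set; }
    public decimal AssessedValue { get; set; }
}

public static class ScrapeJob
{
    // Parses "County,ParcelId" rows after a header line.
    // Simplification: input fields are assumed to contain no quoted commas.
    public static List<PropertyRequest> ParseInput(string csv)
    {
        return csv.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                  .Skip(1) // skip the header row
                  .Select(line => line.Split(','))
                  .Where(cols => cols.Length >= 2)
                  .Select(cols => new PropertyRequest
                  {
                      County = cols[0].Trim(),
                      ParcelId = cols[1].Trim()
                  })
                  .ToList();
    }

    // Runs the caller-supplied scrape function over all requests with
    // bounded parallelism, then sorts so the output order is deterministic.
    public static List<PropertyResult> Run(
        IEnumerable<PropertyRequest> requests,
        Func<PropertyRequest, PropertyResult> scrapeOne,
        int maxParallel)
    {
        var results = new ConcurrentBag<PropertyResult>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        Parallel.ForEach(requests, options, request => results.Add(scrapeOne(request)));
        return results.OrderBy(r => r.ParcelId, StringComparer.Ordinal).ToList();
    }

    // Serializes results to CSV, quoting fields that contain
    // commas, double quotes, or line breaks.
    public static string ToCsv(IEnumerable<PropertyResult> results)
    {
        var sb = new StringBuilder("ParcelId,OwnerName,AssessedValue\n");
        foreach (var r in results)
        {
            sb.Append(Escape(r.ParcelId)).Append(',')
              .Append(Escape(r.OwnerName)).Append(',')
              .Append(r.AssessedValue.ToString(CultureInfo.InvariantCulture))
              .Append('\n');
        }
        return sb.ToString();
    }

    private static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }
}
```

In a real scraper the delegate passed to ScrapeJob.Run would issue the HTTP requests for one county site and parse the response. Keeping it as a parameter is only a design choice for this sketch: it lets the same input and output plumbing be reused across sites and tested without network access.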
Our current scraping applications are written in Visual Basic .NET and Visual C# .NET, and these languages are our first preference. However, other languages will be considered if they are compatible with the .NET Framework, can be executed as a service application, and fit within the existing architecture with little or no modification. Candidates with experience in any of the following web technologies will be considered first:
• Captchas
• Flash
• Java
• ASPX
• Flex
• Silverlight
• Proxy IP rotation
• Image download (.jpg, .png, .bmp, etc.)
• Direct API calls
If you are interested in this project, please submit your development proposal along with your relevant experience. Please include the languages you can code in and your years of experience with each, your availability for this effort (starting date and number of hours per week), which of the web technologies listed above you have worked with, and whether you are working as an individual or have a team of developers who can build several of these applications at the same time.
Please note that this is listed as an hourly project. We are required to specify a starting hourly range but are willing to compensate appropriately based on performance on the initial task. We will provide more detail about the sites to be scraped to everyone whose reply establishes their relevant experience and qualifications. We will entertain fixed-price bids on future contracts after a candidate has demonstrated a successful implementation on this initial two-site effort.