Closed

automation app

Here is the outline of how the script will work:

The script takes an input file (see attached).

Some key information is missing for certain records, such as telephone number, company name, URL, email address, and address. The script will attempt to find this data and output it (see attached output file). All input and output files will be in CSV format.
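As a rough illustration of the input handling, here is a minimal Python sketch that reads CSV records and notes which fields each record is missing. The column names in FIELDS are assumptions, since the attached input file is not shown here, and the sample rows are made up.

```python
import csv
import io

# Hypothetical column names; the real input file's header may differ.
FIELDS = ["name", "company", "address", "phone", "url", "email"]


def missing_fields(row):
    """Return the names of the tracked fields that are empty in this record."""
    return [f for f in FIELDS if not (row.get(f) or "").strip()]


def load_records(handle):
    """Read CSV rows as dicts and attach the list of fields still to be found."""
    records = []
    for row in csv.DictReader(handle):
        row = dict(row)
        row["_missing"] = missing_fields(row)
        records.append(row)
    return records


# Made-up sample data standing in for the attached input file.
sample = io.StringIO(
    "name,company,address,phone,url,email\n"
    "Jane Doe,,123 N Main St,,,\n"
    "John Roe,Acme Inc,45 Oak Ave,555-0100,acme.example,john@acme.example\n"
)
records = load_records(sample)
```

Each later step would then run only for records whose "_missing" list contains the field that step is meant to fill.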

The script will use 14 IP addresses to send searches to Google and Bing.

A W9 will be required for the developer who gets selected.

Specifics of the logic:

1. Take any record that does not have a company name and do an address search.

a. Do the search with Google, and take the title tag from each of the top 10 results.

b. Do the same search on Bing and take the title tag from each of the top 10 results.

c. Pattern match the page titles; this should give a fairly unanimous company name.
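One possible reading of the title matching in step 1c, sketched in Python: split each title on common separators and let the titles vote on the fragment they share. The separator list, the min_votes threshold, and the function name are assumptions, not part of the spec.

```python
import re
from collections import Counter


def consensus_name(titles, min_votes=2):
    """Pick the title fragment that appears in the most page titles."""
    votes = Counter()
    display = {}
    for title in titles:
        # Split on common title separators such as " | ", " - " and ": ".
        fragments = re.split(r"\s+[|\-]\s+|:\s+", title)
        seen = set()
        for fragment in fragments:
            # Compare fragments case-insensitively and without punctuation.
            key = re.sub(r"[^a-z0-9 ]", "", fragment.lower()).strip()
            if key and key not in seen:
                seen.add(key)  # one vote per title for a given fragment
                votes[key] += 1
                display.setdefault(key, fragment.strip())
    if not votes:
        return None
    key, count = votes.most_common(1)[0]
    # Require agreement between at least min_votes titles.
    return display[key] if count >= min_votes else None
```

Feeding it the 20 titles collected from Google and Bing would return the most widely shared fragment, or None when no two titles agree.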

2. If the company name was empty, use the pattern-matched company name from step 1; otherwise use the company name we already had. Search Google for the company name plus the full address. Street names are off in many of the examples, so we would clean them up by stripping out 's, directions, and street extensions:
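The street cleanup could look something like the following Python sketch. The posting does not list which directions or street extensions to strip, so both word lists here are assumptions.

```python
import re

# Assumed word lists; the posting does not enumerate them.
DIRECTIONS = {"n", "s", "e", "w", "ne", "nw", "se", "sw",
              "north", "south", "east", "west"}
STREET_EXTENSIONS = {"st", "street", "ave", "avenue", "blvd", "boulevard",
                     "rd", "road", "dr", "drive", "ln", "lane", "ct", "court",
                     "way", "pl", "place"}


def normalize_street(street):
    """Drop possessive 's, compass directions and street-type suffixes."""
    text = re.sub(r"'s\b", "", street.lower())
    words = re.findall(r"[a-z0-9]+", text)
    kept = [w for w in words
            if w not in DIRECTIONS and w not in STREET_EXTENSIONS]
    return " ".join(kept)
```

Note that a plain word filter like this also drops "St" in names such as "St. Mary", so a real implementation would probably need position-aware rules.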

This search returns all kinds of results, so we have to score them:

a. Go to the home page of each site in the top 10 results.

Page analysis:

i. The name of the business should appear on this page.

ii. The name of the business should appear in the title tag.

b. If the name of the business appears both on the page and in the title, we can be fairly sure this is the company's website.

c. If the company name is in neither, go to the next site in the top 10 until we get a match.

d. Once this subroutine is complete, we have the URL of the company. See below for an example of the search. From this PDF in the results, it would find the URL of the business:
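The page check in steps a to c might be sketched in Python as below. Fetching the home pages is left out; the sketch assumes the HTML has already been downloaded and is passed in as (url, html) pairs in rank order. The function and class names are made up for illustration.

```python
import re
from html.parser import HTMLParser


class _PageText(HTMLParser):
    """Collect the <title> text and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.body_parts = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif not self._skip_depth:
            self.body_parts.append(data)


def _norm(text):
    """Lowercase, replace punctuation with spaces, pad for whole-word matching."""
    return " " + re.sub(r"[^a-z0-9]+", " ", text.lower()).strip() + " "


def name_match(html, company):
    """Return (in_title, on_page) for the company name."""
    if not company.strip():
        return (False, False)
    parser = _PageText()
    parser.feed(html)
    name = _norm(company)
    in_title = name in _norm(" ".join(parser.title_parts))
    on_page = name in _norm(" ".join(parser.body_parts))
    return (in_title, on_page)


def pick_company_url(pages, company):
    """Return the first URL whose page has the name in both title and body."""
    for url, html in pages:
        if all(name_match(html, company)):
            return url
    return None
```

This uses exact whole-word matching on the normalized name; a looser comparison (for example ignoring "Inc" or "LLC") may be needed in practice.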

3. Get email

a. Search for name, email, and URL.

b. Use common business email address structures and pattern match for these anywhere in the resulting pages:

i. firstlastname @[url removed, login to view]

ii. [url removed, login to view] @[url removed, login to view]

iii. first_lastname @[url removed, login to view]

iv. First initial last name @[url removed, login to view]

If one of these structures is found, that is a success: move on to #5. If not, move on to #4.
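A minimal Python sketch of the structure matching in step 3b follows. It covers only the three structures that are legible in the list above (firstlastname, first_lastname, and first initial plus last name); the second structure is hidden in the posting and is deliberately left out. The function names and the example domain are hypothetical.

```python
import re


def candidate_emails(first, last, domain):
    """Build addresses for the structures that are legible in the posting."""
    if not first or not last:
        return []
    first, last = first.lower(), last.lower()
    local_parts = [
        f"{first}{last}",     # firstlastname@domain
        f"{first}_{last}",    # first_lastname@domain
        f"{first[0]}{last}",  # first initial + last name@domain
    ]
    return [f"{local}@{domain.lower()}" for local in local_parts]


def find_email(page_text, first, last, domain):
    """Return the first candidate address found in the page text, else None."""
    text = page_text.lower()
    for address in candidate_emails(first, last, domain):
        # Reject hits that are only part of a longer address or domain.
        pattern = (r"(?<![a-z0-9._-])" + re.escape(address)
                   + r"(?![a-z0-9-]|\.[a-z])")
        if re.search(pattern, text):
            return address
    return None
```

For example, find_email(page_text, "Jane", "Doe", "acme.example") would look for janedoe@, jane_doe@, and jdoe@ at that domain, in that order.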

4. Email search:

a. As the regular search for email did not work, now we do the reverse:

b. This produces many searches looking for the right name. If a match is found, it is graded:

i. If the match is on the company website, +10

ii. If the match is in a PDF, +5

iii. If the match is in a PPT, +5

iv. If the match is found by both Google and Bing, +5

The match with the highest grade is the one the script will use.
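The grading could be expressed roughly as in this Python sketch, using the weights listed above. How a match records its source URL and search engines is an assumption (a dict with 'email', 'url', and 'engines' keys), file type is guessed from the URL extension, and treating .pptx as a PPT is also an assumption.

```python
from urllib.parse import urlparse


def grade_match(source_url, company_domain, found_by):
    """Score one email match using the weights listed in the posting."""
    parsed = urlparse(source_url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()
    domain = company_domain.lower()
    score = 0
    if host == domain or host.endswith("." + domain):
        score += 10  # match is on the company website
    if path.endswith(".pdf"):
        score += 5   # match is in a PDF
    if path.endswith((".ppt", ".pptx")):
        score += 5   # match is in a PPT (.pptx is an assumed extension)
    if {"google", "bing"} <= set(found_by):
        score += 5   # match was found by both Google and Bing
    return score


def best_match(matches, company_domain):
    """Return the highest-graded match dict, or None if there are none."""
    if not matches:
        return None
    return max(
        matches,
        key=lambda m: grade_match(m["url"], company_domain, m["engines"]),
    )
```

When two matches tie, max() keeps the first one in the list, so the order in which results are collected acts as the tie-breaker.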

5. Phone search

a. Google search for name, phone, and actual email address.

b. This usually returns some type of result that has “phone:” on the page. From this we will parse the page, pulling back all the digits to the right of the word “phone:”.
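The parsing in step 5b could be sketched in Python as follows. The 20-character window after the label and the 7-digit minimum are assumptions chosen to avoid running into a following fax number or picking up stray digits.

```python
import re

# Capture up to 20 phone-like characters after the label (assumed window).
PHONE_LABEL = re.compile(r"phone:\s*([0-9()+.\-\s]{7,20})", re.IGNORECASE)


def extract_phone(page_text):
    """Return the digits that follow the first "phone:" label, or None."""
    match = PHONE_LABEL.search(page_text)
    if not match:
        return None
    digits = re.sub(r"\D", "", match.group(1))
    # Treat anything shorter than 7 digits as not a phone number.
    return digits if len(digits) >= 7 else None
```

Because the pattern is a substring match, a label such as "Telephone:" is picked up as well.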

Thanks for your time.

Skills: Perl


About the Employer:
(1 review) Sacramento, United States

Project ID: #1680725

1 freelancer is bidding an average of $500 for this job


I can do it.

$500 USD in 5 days
(3 Reviews)

I can do it. Regards.

$750 USD in 15 days
(0 Reviews)