In Progress

134839 Content scraper bot

We would like you to develop scripts that create two tab-delimited text files, where each row represents a record. The records are:

* file 1: quotes

* file 2: people who said the quotes.

Build these tables by extracting content from these sites. They're all large quotation sites:

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

[url removed, login to view]

Where available get these fields:

For quotes (note that most of these fields won't be available on most sites, so it's not as much work as it looks):

* quote (NOT NULL)

* author of the quote

* when it was said

* where it was said (eg Lincoln Memorial, The Moon)

* type of source where it was said (eg Movies, Literature)

* which book it is from

* source site (don't worry about merging data when a quote is featured on more than one site; keep them as separate rows)

* primary source site (where the site you're scraping is citing another place, perhaps even another site). This can be in HTML format if they have a link.

* origin (eg on [url removed, login to view])

* class of quote (proverb, joke, etc)

* Topic (eg computers, fear, faith)

* rating (eg on [url removed, login to view])

* number of times favourited (eg on [url removed, login to view])

* number of votes (eg on [url removed, login to view])

* URL where quote is available on the site
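The quote fields above could be laid out as one tab-delimited row per quote. The following is a minimal sketch only: the column names, the header row, and the choice to replace embedded tabs and line breaks with spaces are assumptions, not part of the brief.

```python
import io

# Hypothetical column names, one per field in the list above.
QUOTE_COLUMNS = [
    "quote", "author", "date_said", "place_said", "source_type", "book",
    "source_site", "primary_source", "origin", "quote_class", "topic",
    "rating", "favourites", "votes", "url",
]


def clean(value):
    """Collapse tabs, line breaks and repeated spaces so a value stays in one cell."""
    if value is None:
        return ""
    return " ".join(str(value).split())


def write_quotes(records, out):
    """Write quote dicts as tab-delimited rows and return the number written."""
    out.write("\t".join(QUOTE_COLUMNS) + "\n")  # header row (an assumption)
    written = 0
    for record in records:
        if not clean(record.get("quote")):
            continue  # the quote field is NOT NULL, so skip empty ones
        row = [clean(record.get(column)) for column in QUOTE_COLUMNS]
        out.write("\t".join(row) + "\n")
        written += 1
    return written


# Usage with an in-memory buffer; a real run would pass an open file instead.
buffer = io.StringIO()
write_quotes(
    [{"quote": "An example\tquote", "author": "Example Author",
      "source_site": "example.org"}],
    buffer,
)
```

Fields a site does not provide are simply left empty, which keeps every row at the same width.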

For people (again, most of these fields will not be available for most sites, so it is simpler than it looks):

* person (NOT NULL) (please use the exact same format as "author of the quote" above so that we can map the two tables)

* their type (celebrity, politician, etc)

* occupation(s)

* birth date

* date of death

* source site (don't worry about merging data when a person is featured on more than one site; keep them as separate rows)

* blurb about them

* URL where description is available on the site
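Because the person field has to match "author of the quote" exactly, one option is to pass every name through a single normalising function before writing either file. This is a sketch under assumptions: the column names and the `author_key` helper are hypothetical, and it only tidies whitespace and Unicode form (it does not try to reconcile variants such as "Twain, Mark" and "Mark Twain").

```python
import unicodedata

# Hypothetical column names, one per field in the people list above.
PERSON_COLUMNS = [
    "person", "person_type", "occupations", "birth_date", "death_date",
    "source_site", "blurb", "url",
]


def author_key(name):
    """Return the one canonical spelling used in both the quotes and people files."""
    normalised = unicodedata.normalize("NFC", name or "")
    return " ".join(normalised.split())


def person_row(record):
    """Build one tab-delimited people row, or None when the person is missing."""
    key = author_key(record.get("person"))
    if not key:
        return None  # person is NOT NULL
    rest = [
        " ".join(str(record.get(column) or "").split())
        for column in PERSON_COLUMNS[1:]
    ]
    return "\t".join([key] + rest)
```

Calling `author_key` on the author field of each quote as well means the two files can be joined on that column without further cleanup.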

I'm not fussed about what technology you use or elegance of the code, provided you submit the code and you document how to install and run it, and it's runnable on a Red Hat machine without much effort. You may use your own modules, provided you submit them and allow us to use them.

Please describe:

* How much you would charge to do this project.

* The latest date it would be delivered.

* What experience you have writing bots.

If all goes well, there are projects you could work on with us.

Skills: Anything Goes, Perl, PHP, Python


About the Employer:
(1 review)

Project ID: #1881011

Awarded to:

xhunter12sl

Check PMB

$60 USD in 4 days
(0 Reviews)
0.0