Closed

Word replacement engine - sophisticated content generator

Dear freelancers,

Summary: I need a system that produces text for a given keyword. The text needs to be unique, targeted to the keyword, and halfway readable.

The good news: I have a concept that will probably work.

The bad news: It's not easy to develop and requires handling large amounts of data.

==============

Goal: A text production machine where I can enter a keyword and it generates a lot of texts for it. Those texts should pass Copyscape and look halfway legitimate at first glance. I know that real "sense" is not possible, and I don't expect it.

Input: $keyword (e.g. "credit card")

Output: $text (a string of ~400 words of unique text about "credit card")

Please read the project description at least twice before you bid. THIS IS A HARD TASK. If you have any questions, please let me know. The project has high priority for me and I'm available 24/7 for the developer.

==============

Project specification:

Step 0 [preparations]: We collect large amounts of human-written content (German texts). I have a list of 1.7 million .de domains; let's crawl them (including subpages) and extract all text into a database/semantic cloud.

If you choose a database, I'd suggest MongoDB, as in my experience it's much faster than MySQL with that amount of data. Our main business is hosting, so we can provide you with custom server technology (like 24 GB of RAM to load parts of the cloud into memory, or SSD RAIDs to speed up access). I also have a good web crawler available, but it's written in Python.
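As a rough illustration of what the "cloud" from Step 0 could store, here is a minimal sketch that counts word sequences (n-grams) in crawled text. The names `tokenize` and `count_ngrams` and the two sample sentences are assumptions for illustration only; an in-memory `Counter` stands in for whatever database is finally chosen.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; in Python 3, \w also matches umlauts and ß.
    return re.findall(r"\w+", text.lower())


def count_ngrams(texts, n):
    # Count every run of n consecutive tokens across all texts.
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample of crawled German text.
corpus = ["Mark studiert Jura.", "Anna studiert Jura in München."]
unigrams = count_ngrams(corpus, 1)
bigrams = count_ngrams(corpus, 2)
```

With 1.7 million domains these counts would not fit into a single Python process, so the same counting would have to run in batches against the database.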

Step 1 [generation process]: The user enters a keyword. Generate a random number between 1 and 30. Let's assume it is 15.

Step 2: Run a Google query for the keyword we want to generate text for. Parse Google result number 15, remove tags and navigation (a script for this already exists), and extract the remaining content.
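Steps 1 and 2 might look roughly like the sketch below. It is only a stand-in: the existing tag-removal script is not shown in this description, so `TextExtractor`, `extract_text` and `pick_rank` are hypothetical names, and the Google query itself is left out. The set of skipped and block-level tags is also an assumption.

```python
import random
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    SKIP = {"script", "style", "nav"}                 # content to drop entirely
    BLOCK = {"p", "div", "br", "li", "h1", "h2", "h3", "td"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")                    # keep blocks apart

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse all runs of whitespace into single spaces.
    return " ".join("".join(parser.parts).split())


def pick_rank(max_rank=30, rng=random):
    # Step 1: random result position between 1 and max_rank, inclusive.
    return rng.randint(1, max_rank)


html = "<html><nav>Home</nav><p>Credit cards are <b>useful</b>.</p><script>x()</script></html>"
text = extract_text(html)
```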

Step 3: We now have a snippet of relevant text for our keyword. But of course, it's only a copy - this is where the rewriting begins.

Step 4 [rewriting]: Split the snippet into single sentences. A new sentence begins after .,:!?;
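A minimal sketch of the split in Step 4. It assumes a boundary only counts when the punctuation mark is followed by whitespace, so that numbers like "3.5" stay intact; that refinement is an assumption, not part of the specification. `split_sentences` is a hypothetical name.

```python
import re

# One of . , : ! ? ; followed by whitespace marks a boundary.
BOUNDARY = re.compile(r"(?<=[.,:!?;])\s+")


def split_sentences(snippet):
    # Split after the punctuation mark, keeping it attached to its part.
    return [part.strip() for part in BOUNDARY.split(snippet) if part.strip()]


parts = split_sentences("Mark studies law, and he likes it. Really!")
```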

Step 5: This is where it gets tricky. We need to find out which words form a block. Given the sentence "Mark studies law at harvard university.", the system needs to detect that "harvard university" is a block: "university" must not be replaced with "school", but "harvard university" may be replaced with "stanford". So the words in a block need to be replaced together. How do we find out which words belong together? We check how "near" they are in our word cloud, i.e. how often they stand next to each other: "Mark | studies | law | at | harvard university."

Step 5 (continued): Okay, we now have the blocks. The next step is to replace as many blocks as possible in order to make the text unique. Here we query our natural content cloud with something like: "$left" * "$right".

In this practical example, the query is "Mark " * "law". As you can see, we took three consecutive blocks and replaced the middle one with a placeholder. Our natural content database should now return legitimate blocks for *, as they were used in natural language, for example: "Mark teaches law", "Mark demands law", "Mark is still searching for law" etc.
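One possible way to measure that "nearness" is sketched below: two neighbouring words are merged into a block when they appear next to each other in most occurrences of the rarer word. The scoring formula, the 0.5 threshold, the name `find_blocks` and all counts are assumptions made up for illustration, not measured data.

```python
def find_blocks(tokens, unigrams, bigrams, threshold=0.5):
    # Merge neighbours whose pair count is a large share of the rarer word's count.
    if not tokens:
        return []
    blocks = [[tokens[0]]]
    for prev, word in zip(tokens, tokens[1:]):
        pair = bigrams.get((prev, word), 0)
        rarer = min(unigrams.get(prev, 0), unigrams.get(word, 0))
        score = pair / rarer if rarer else 0.0
        if score >= threshold:
            blocks[-1].append(word)      # same block as the previous word
        else:
            blocks.append([word])        # start a new block
    return [" ".join(block) for block in blocks]


# Invented toy counts standing in for the word cloud.
unigrams = {"mark": 50, "studies": 40, "law": 60, "at": 500,
            "harvard": 20, "university": 80}
bigrams = {("mark", "studies"): 2, ("studies", "law"): 5, ("law", "at"): 3,
           ("at", "harvard"): 4, ("harvard", "university"): 18}

blocks = find_blocks(["mark", "studies", "law", "at", "harvard", "university"],
                     unigrams, bigrams)
```

Here "harvard university" scores 18/20 = 0.9 and is merged, while every other pair stays below the threshold.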
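The placeholder query could be prototyped as a scan over the stored text, as in this sketch. It assumes the filler is between one and `max_len` words long; `fill_gap`, `max_len` and the sample corpus are hypothetical, and a real system would query an index instead of scanning every text.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens.
    return re.findall(r"\w+", text.lower())


def fill_gap(corpus, left, right, max_len=4):
    # Count every word run of 1..max_len tokens found between left and right.
    left_t, right_t = tokenize(left), tokenize(right)
    fillers = Counter()
    if not left_t or not right_t:
        return fillers
    for text in corpus:
        tokens = tokenize(text)
        for i in range(len(tokens) - len(left_t) + 1):
            if tokens[i:i + len(left_t)] != left_t:
                continue
            start = i + len(left_t)
            for gap in range(1, max_len + 1):
                end = start + gap
                if tokens[end:end + len(right_t)] == right_t:
                    fillers[" ".join(tokens[start:end])] += 1
    return fillers


# Hypothetical sample of stored natural-language text.
corpus = [
    "Mark teaches law at a small college.",
    "Mark demands law and order.",
    "Mark is still searching for law books.",
    "Anna studies law.",
]
fillers = fill_gap(corpus, "Mark", "law")
```

The counts can later serve as a ranking signal, so that fillers seen more often in natural text are preferred.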

Not all will perfectly make sense, but it's a start and far better than working with synonyms, because you can also replace single words with blocks of multiple words and vice versa.

We should use this replacement system multiple times in each sentence. It also works for the beginning and the end of sentences. From my manual tests with Google, this works pretty well and might work even better with our own data pool. The system works identically for all languages, so you don't need to speak German.
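Applying the replacement several times per sentence, including at the beginning and the end, could be organised as in the sketch below. It assumes sentence boundaries are represented by the marker blocks `<s>` and `</s>`, and that `lookup(left, right)` returns candidate blocks from the content cloud; both are hypothetical, as are `rewrite_blocks` and the toy lookup table.

```python
import random

START, END = "<s>", "</s>"   # assumed markers for sentence start and end


def rewrite_blocks(blocks, lookup, offset=0, rng=random):
    # Replace every second block so each replaced block keeps its original neighbours.
    padded = [START] + list(blocks) + [END]
    result = list(blocks)
    for i in range(offset, len(blocks), 2):
        left, right = padded[i], padded[i + 2]    # neighbours of blocks[i]
        candidates = [c for c in lookup(left, right) if c != blocks[i]]
        if candidates:
            result[i] = rng.choice(candidates)
    return result


# Toy lookup table standing in for the content cloud.
table = {
    (START, "studies"): ["Anna"],
    ("studies", "at"): ["medicine"],
    ("at", END): ["stanford"],
}


def lookup(left, right):
    return table.get((left, right), [])


original = ["Mark", "studies", "law", "at", "harvard university"]
rewritten = rewrite_blocks(original, lookup)
```

A second call with `offset=1` on the result would then try the remaining blocks, using the already rewritten neighbours as context.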

Step 6: Output rewritten text.

=====================

You can use any programming language you like.

I can't write more description text here due to [url removed, login to view], so happy bidding & discussion!

Best,

Steve

Skills: C++ Programming, Data Mining, Python, Software Architecture, Web Scraping

About the Employer:
(29 reviews) Fürstenfeldbruck, Germany

Project ID: #1077111

13 freelancers are bidding an average of $2500 for this job

synl0rd

I'm very interested. Check PMB, boss.

$1500 USD in 15 days
(8 Reviews)
5.0
sentromed

Hello. I have a lot of experience in MySQL, Python and C++. I can also use any other database. I can code this project, but I need to know how to do it. I can code all your ideas to see whether they work or not. Regards, Va…

$3000 USD in 40 days
(4 Reviews)
5.0
expertMan

Please check message board.

$3500 USD in 60 days
(4 Reviews)
4.7
LiveConnector

Please check PMB

$3000 USD in 30 days
(5 Reviews)
4.3
priboy

Please check PMB

$1500 USD in 10 days
(4 Reviews)
4.3
Dutchstudent7750

Hi. This seems like a very interesting summer project. I have a lot of experience in programming in Python, Java and C. I have made software that uses the Google search engine to extract data from websites. Also, I can…

$2000 USD in 60 days
(6 Reviews)
3.0
AlexandrP

I am very experienced in writing text-based engines in C.

$2800 USD in 30 days
(1 Review)
2.4
Hangleton

Dear sir, This is a really interesting project, close to computer language compilation challenges. Compilation theory is not only used for computer languages; its applications also extend to natural lan…

$1500 USD in 30 days
(0 Reviews)
0.0
Twirlie

Hello there! This sounds like a fascinating idea, and we would love to work with you on it! Please check your PM for our bid!

$1500 USD in 30 days
(0 Reviews)
0.0
meisel

I'm an expert in the field of natural language theory, and a native speaker of English! Contact me for more details.

$1700 USD in 8 days
(1 Review)
0.0
Jeandasse

Hi, I would like to work on this project. I usually work on this kind of parsing project. Best regards, Jean

$3000 USD in 30 days
(0 Reviews)
0.0
TavishiSystems

We can do it.

$2500 USD in 30 days
(0 Reviews)
0.0
itbscompany

Hi! Please have a look at a private message.

$5000 USD in 30 days
(0 Reviews)
0.0