In Progress

137600 Named Entity Extraction

Given web articles containing textual content, we'd like the following entities extracted:

* who

* where

* when

Extraction will be needed around certain points in the article, which we will specify.

Your module will take English text that we provide (HTML or plain text, whichever you prefer) as input and return the entities found at these points.
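As a rough illustration of the input and output described above, here is a minimal sketch of what such a module's interface might look like. Everything in it is an assumption made for illustration: the "points" are taken to be character offsets, the `window` size is arbitrary, and the regular expressions and the tiny `PLACES` list are toy stand-ins for whatever existing extraction software is actually chosen.

```python
import re
from dataclasses import dataclass, field

# Toy patterns standing in for a real extraction tool (hypothetical, illustration only).
MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")
WHEN_RE = re.compile(rf"\b{MONTHS}\s+\d{{1,2}}(?:,\s*\d{{4}})?\b")
WHO_RE = re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.\s+[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?")
# Illustrative place list; a real system would need a far larger resource or a model.
PLACES = {"London", "Paris", "New York"}


@dataclass
class Entities:
    who: list = field(default_factory=list)
    where: list = field(default_factory=list)
    when: list = field(default_factory=list)


def extract_near(text, point, window=200):
    """Return entities found within `window` characters of offset `point`."""
    start = max(0, point - window)
    end = min(len(text), point + window)
    snippet = text[start:end]
    who = WHO_RE.findall(snippet)
    where = [p for p in sorted(PLACES)
             if re.search(rf"\b{re.escape(p)}\b", snippet)]
    when = WHEN_RE.findall(snippet)
    return Entities(who, where, when)


def extract_entities(text, points, window=200):
    """Map each requested offset to the entities found around it."""
    return {point: extract_near(text, point, window) for point in points}
```

A call such as `extract_entities(article_text, [120, 950])` would return one `Entities` record per requested point. Swapping the toy patterns for a real tool should only require changing `extract_near`.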

You will probably want to use existing software. If so, please specify the name of the software in your bid.

We will provide the hardware for this. You must install the software on our Linux servers.

If you have more questions, please don't hesitate to ask.

Some answers to questions so far:

* Hardware: it would preferably run on a single P4, but may run on a dual Xeon. Max 2 GB memory.

* It absolutely cannot be done manually. There will be millions of articles.

* (See [url removed, login to view] for more info on the process.)

* We will provide the data you need (HTML or text) in a database. You will then write the extracted entities back to this database, where you can also store a flag indicating whether each record has been processed.
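The database workflow in the last answer could be sketched roughly as follows. This is only an illustration under stated assumptions: the `articles` table, its column names (`body`, `processed`, `who`, `place`, `time_ref`), and the batch size are all hypothetical, and Python's built-in `sqlite3` stands in for whichever database is actually provided.

```python
import sqlite3

# Hypothetical schema; the real table layout would come from the employer.
SCHEMA = """
CREATE TABLE IF NOT EXISTS articles (
    id        INTEGER PRIMARY KEY,
    body      TEXT NOT NULL,
    processed INTEGER NOT NULL DEFAULT 0,  -- 0 = pending, 1 = done
    who       TEXT,
    place     TEXT,
    time_ref  TEXT
)
"""


def extract_entities(text):
    """Placeholder for the real extractor; returns (who, where, when) strings."""
    return ("", "", "")


def process_pending(conn, extractor=extract_entities, batch_size=100):
    """Extract entities for unprocessed rows, store them, and set the flag.

    Works in batches so that memory use stays bounded even with
    millions of articles. Returns the number of rows processed.
    """
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, body FROM articles WHERE processed = 0 LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        for article_id, body in rows:
            who, where, when = extractor(body)
            conn.execute(
                "UPDATE articles SET who = ?, place = ?, time_ref = ?, "
                "processed = 1 WHERE id = ?",
                (who, where, when, article_id),
            )
        conn.commit()  # one commit per batch keeps progress durable
        total += len(rows)
```

Because the flag is set in the same statement that stores the results, an interrupted run can simply be restarted and will pick up at the first unprocessed record.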

Skills: Anything Goes, C Programming, ColdFusion, Java, Perl, PHP

About the Employer:
( 1 review )

Project ID: #1883774