Completed

Configure OpenSemanticSearch/Solr


We use OpenSemanticSearch (OSS) for full-text fuzzy search over 3GPP documents (e.g. from ftp://[url removed, login to view]).

The majority of the documents are in MS Word format and contain semi-structured metadata that can be extracted, for example, with regular expressions.

We are looking for an experienced Solr/OSS developer to build the "data enrichment" modules (assuming regex in the OSS data processing flow, tbc) that extract the following information from Word documents (see attached example):

1) Document number (note: this is also the filename), e.g.: R1-1706960

2) Source e.g.: Huawei

3) Title: PUCCH resource allocation for HARQ-ACK and SR

4) Meeting name, e.g.: 3GPP TSG RAN WG1 Meeting #89

5) Meeting place, e.g.: Hangzhou, China

6) Meeting date, e.g.: 15 May 2017
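As a rough illustration of the six fields above, here is a minimal Python sketch of regex-based extraction. It is not an OSS plugin: the header layout in `SAMPLE_HEADER`, the field names and the patterns are all assumptions assembled from the example values in this posting, and real 3GPP documents may well be laid out differently.

```python
import re

# Hypothetical header text, built only from the example values in the posting.
# The real layout of the Word documents must be checked against the attachment.
SAMPLE_HEADER = (
    "3GPP TSG RAN WG1 Meeting #89\tR1-1706960\n"
    "Hangzhou, China, 15 May 2017\n"
    "Source: Huawei\n"
    "Title: PUCCH resource allocation for HARQ-ACK and SR\n"
)

# A date such as "15 May 2017" (day, capitalised month name, four-digit year).
DATE = r"\d{1,2}[ \t]+[A-Z][a-z]+[ \t]+\d{4}"

# One pattern per field; the field names are illustrative, not OSS names.
PATTERNS = {
    "document_number": re.compile(r"\b(R\d-\d{7})\b"),
    "source": re.compile(r"^Source:[ \t]*(.+)$", re.MULTILINE),
    "title": re.compile(r"^Title:[ \t]*(.+)$", re.MULTILINE),
    "meeting_name": re.compile(r"^(3GPP TSG .*?Meeting[ \t]*#\S+)", re.MULTILINE),
    "meeting_place": re.compile(r"^(.+?),[ \t]*" + DATE + r"[ \t]*$", re.MULTILINE),
    "meeting_date": re.compile(r"(" + DATE + r")[ \t]*$", re.MULTILINE),
}


def extract_fields(text):
    """Return a dict with one entry per field; None when a pattern finds nothing."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields


if __name__ == "__main__":
    for name, value in extract_fields(SAMPLE_HEADER).items():
        print(f"{name}: {value}")
```

Returning `None` for a missing field, rather than raising, lets a document with an unusual header still be indexed with whatever could be recovered.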

These 6 fields shall be visible in the Solr search result snippets, and fields 2), 4) and 6) shall be available as filter criteria (facets). Refer to the screenshot for the current results display.
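To make the display and facet requirement concrete, the sketch below builds the query string for a Solr `select` request that returns the six fields and facets on source, meeting name and meeting date. `q`, `fl`, `facet` and `facet.field` are standard Solr request parameters; the field names (with an `_s` string suffix) are hypothetical and would have to match whatever the OSS schema actually defines.

```python
from urllib.parse import urlencode

# Hypothetical Solr field names; the real OSS schema may use different ones.
DISPLAY_FIELDS = [
    "document_number_s",
    "source_s",
    "title_s",
    "meeting_name_s",
    "meeting_place_s",
    "meeting_date_s",
]
# Fields 2), 4) and 6) from the list above are the ones requested as facets.
FACET_FIELDS = ["source_s", "meeting_name_s", "meeting_date_s"]


def build_select_query(user_query):
    """Build the query string for a Solr select request with facets enabled."""
    params = [
        ("q", user_query),
        ("fl", ",".join(DISPLAY_FIELDS)),  # fields returned for each hit
        ("facet", "true"),
    ]
    # Solr accepts facet.field repeated once per faceted field.
    params += [("facet.field", name) for name in FACET_FIELDS]
    return urlencode(params)


if __name__ == "__main__":
    print(build_select_query("PUCCH resource allocation"))
```

The resulting string would be appended to the select handler URL of the Solr core in use; that URL is deployment-specific and is not given here.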

Please provide the required OSS/Solr/Tika configuration files once they have been tested and validated in your environment, and allow a few days for us to reproduce the results before the milestone payment.

Thank you.


Skills: Apache Solr


About the employer:
(2 reviews) Nanterre, France

Project ID: #14820748

Awarded to:


I'm interested. Relevant Skills and Experience: I have been a Java developer for 6 years and have Solr experience. Proposed Milestones: €200 EUR - implementation

€200 EUR in 7 days
(1 review)