These are the requirements for either a Java or a Python developer. The requirements below are written for Java, but we are happy to work with a Python developer using equivalent libraries.
We are looking for a Java developer with some knowledge of relational databases (RDBMS), preferably MySQL. Python is a bonus, but not critical.
1. What are we building?
We want to build a text processing pipeline that extracts specific named entities from news articles.
We already have web scrapers which download news from selected sources and store them. The next component is a text processing engine, which will process those articles and extract named entities according to a predefined set of rules.
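To illustrate what "extracting entities according to a predefined set of rules" could look like, here is a minimal sketch that uses only `java.util.regex` as a stand-in for the real NLP engine. The class name `RuleBasedExtractor`, the labels, and the two example patterns are hypothetical and are not part of our actual rule set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the text processing engine: applies
// label -> regex rules to an article and collects every match.
final class RuleBasedExtractor {

    // One extracted entity: the rule label, the matched text, and its offset.
    static final class Entity {
        final String label;
        final String text;
        final int start;

        Entity(String label, String text, int start) {
            this.label = label;
            this.text = text;
            this.start = start;
        }

        @Override
        public String toString() {
            return label + "[" + text + "@" + start + "]";
        }
    }

    // LinkedHashMap keeps rules in insertion order, so output order is stable.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    RuleBasedExtractor addRule(String label, String regex) {
        rules.put(label, Pattern.compile(regex));
        return this;
    }

    List<Entity> extract(String article) {
        List<Entity> found = new ArrayList<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            Matcher m = rule.getValue().matcher(article);
            while (m.find()) {
                found.add(new Entity(rule.getKey(), m.group(), m.start()));
            }
        }
        return found;
    }

    // Example rules, assumed purely for illustration.
    static RuleBasedExtractor demo() {
        return new RuleBasedExtractor()
            .addRule("ORGANIZATION", "\\b[A-Z][a-z]+ (?:Corp|Inc|Ltd)\\b")
            .addRule("DATE", "\\b\\d{1,2} (?:January|February|March|April|May|June|July"
                + "|August|September|October|November|December) \\d{4}\\b");
    }

    public static void main(String[] args) {
        String article = "Acme Corp opened an office on 3 March 2021.";
        for (Entity e : demo().extract(article)) {
            System.out.println(e);
        }
    }
}
```

In the real pipeline the regex rules would be replaced by the NLP framework's annotators; the sketch only shows the input and output shape we have in mind (article text in, labelled spans out).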
2. Which tools/frameworks are essential for this project?
The core part of this pipeline is the text processing and named entity extraction engine, so some familiarity with Natural Language Processing, or a genuine interest in it, is important. The proposed NLP framework is Stanford NLP.
The essential parts of this framework that will be used in the solution are: Tokenizer, Parser, POS Tagger, Named Entity Recogniser, and RegexNER.
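As a rough sketch of how the components listed above might be wired together, the snippet below builds the `java.util.Properties` object that a Stanford CoreNLP pipeline is typically configured with. The annotator names and the `regexner.mapping` key follow the CoreNLP documentation as we understand it and may differ between versions; `ssplit` and `lemma` are included on the assumption that the NER annotator depends on them. The class name `PipelineConfig` and the mapping file path are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper: assembles CoreNLP-style pipeline settings.
// It only uses the JDK, so it runs without the CoreNLP jars.
final class PipelineConfig {

    static Properties coreNlpProperties(String regexNerMappingPath) {
        Properties props = new Properties();
        // Tokenizer, sentence splitter, POS tagger, lemmatizer,
        // Named Entity Recogniser, RegexNER, and Parser, in that order.
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,regexner,parse");
        // Tab-separated rule file for RegexNER (the path is an assumption).
        props.setProperty("regexner.mapping", regexNerMappingPath);
        return props;
    }

    public static void main(String[] args) {
        Properties props = coreNlpProperties("rules/entities.tsv");
        // With CoreNLP on the classpath, these properties would be passed to
        // the pipeline constructor, e.g. new StanfordCoreNLP(props).
        System.out.println(props.getProperty("annotators"));
    }
}
```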
Frameworks and libraries that might come in handy: Hazelcast (for messaging and distributed computing) or similar.
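The messaging role could look roughly like the sketch below, where scraped articles are handed to the processing engine through a queue. It uses the JDK's in-memory `LinkedBlockingQueue` as a stand-in; Hazelcast's distributed queue also implements `java.util.concurrent.BlockingQueue`, so the consumer code should carry over with little change. The class name `ArticleQueueDemo`, the end-of-stream marker, and the placeholder processing step are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: a scraper thread publishes articles to a queue and
// the processing engine consumes them until an end-of-stream marker arrives.
final class ArticleQueueDemo {

    static final String END_OF_STREAM = "__END__";

    // Consumer side: take articles until the marker is seen.
    static List<String> consume(BlockingQueue<String> queue) throws InterruptedException {
        List<String> processed = new ArrayList<>();
        while (true) {
            String article = queue.take();
            if (END_OF_STREAM.equals(article)) {
                return processed;
            }
            processed.add(article.trim()); // placeholder for entity extraction
        }
    }

    static List<String> run(List<String> articles) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Producer side: stands in for the existing web scrapers.
        Thread producer = new Thread(() -> {
            try {
                for (String a : articles) {
                    queue.put(a);
                }
                queue.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            List<String> result = consume(queue);
            producer.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(" first article ", "second article")));
    }
}
```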
3. Technical skills/knowledge:
RDBMS, preferably MySQL, but could be any DB. Our schema is simple.
Good knowledge of algorithms and data structures
Microservices architecture would be a bonus
19 freelancers are bidding an average of $1973 for this job
Hi, I can do the text processing for your project in Python and work with any SQL or NoSQL database for storing the data. Let's talk in detail if you are interested.