Completed

Cluster word vectors with cosine

I have vector sets, of the format:

wordA|wordB|wordC; 0.8|0.5|0.3

wordA|wordC|wordD; 0.7|0.3|0.1

wordA|wordE|wordF; 0.9|0.2|0.2

wordC|wordE|wordF; 0.3|0.3|0.1

(and 8 more)

The numbers are relevance scores for each word.
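As a minimal sketch of how one line in this format could be read, assuming the words and scores are separated by `;` and the items within each half by `|`, as in the samples above. The function name `parse_line` is hypothetical, and the real file is described as more complex, so this covers only the simplified format shown here.

```python
def parse_line(line):
    """Turn 'wordA|wordB; 0.8|0.5' into {'wordA': 0.8, 'wordB': 0.5}."""
    words_part, scores_part = line.split(";")
    words = [w.strip() for w in words_part.split("|")]
    scores = [float(s) for s in scores_part.split("|")]
    if len(words) != len(scores):
        raise ValueError("word count and score count differ: %r" % line)
    # Sparse vector: each word is a dimension, its relevance score is the weight.
    return dict(zip(words, scores))


vector = parse_line("wordA|wordB|wordC; 0.8|0.5|0.3")
```

Storing each set as a dictionary keeps the vectors sparse, so words that do not occur in a set implicitly have weight 0.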

The goal is to reduce/cluster this into fewer sets. e.g., to 2 sets.

There are existing methods for this, e.g., clustering based on cosine similarity or [url removed, login to view]
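For illustration, cosine similarity between two of the sets can be computed as below. This is a sketch under the assumption that each set is held as a word-to-score dictionary; the function name `cosine` and the two sample dictionaries are illustrative, with values taken from the first two sample lines.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    # Only words present in both vectors contribute to the dot product.
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_u * norm_v)


a = {"wordA": 0.8, "wordB": 0.5, "wordC": 0.3}
b = {"wordA": 0.7, "wordC": 0.3, "wordD": 0.1}
similarity = cosine(a, b)  # roughly 0.85 for these two sets
```

A value near 1 means the two sets weight the same words in similar proportions; a value near 0 means they share little.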

Your input is a text file similar to the above (my actual file format is slightly more complex, and the file is larger). The output should include a measure of similarity, i.e., a way to know when to stop clustering. In some cases we will reduce to 8 clusters; in others we will cluster down to 1-2 clusters.
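One possible way to get a stopping signal is greedy agglomerative clustering that records the similarity at every merge. The sketch below is an assumption about how this could be done, not a required method: the names `cluster`, `min_similarity`, and `min_clusters` are hypothetical, and merging by summing the score vectors is one choice among several.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def merge(u, v):
    """Sum two sparse vectors (assumed way of representing a merged cluster)."""
    out = dict(u)
    for k, w in v.items():
        out[k] = out.get(k, 0.0) + w
    return out


def cluster(vectors, min_similarity=0.5, min_clusters=1):
    """Merge the most similar pair until similarity or count says stop.

    Returns (groups, history): groups are lists of input indices, and
    history holds (similarity, merged_members) for every merge performed.
    """
    clusters = [([i], dict(v)) for i, v in enumerate(vectors)]
    history = []
    while len(clusters) > max(1, min_clusters):
        # Find the pair of clusters with the highest cosine similarity.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: cosine(clusters[p[0]][1], clusters[p[1]][1]))
        sim = cosine(clusters[i][1], clusters[j][1])
        if sim < min_similarity:
            break  # the best remaining merge is too weak, so stop here
        members = clusters[i][0] + clusters[j][0]
        summed = merge(clusters[i][1], clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((members, summed))
        history.append((sim, members))
    return [m for m, _ in clusters], history


sets = [
    {"wordA": 0.8, "wordB": 0.5, "wordC": 0.3},
    {"wordA": 0.7, "wordC": 0.3, "wordD": 0.1},
    {"wordA": 0.9, "wordE": 0.2, "wordF": 0.2},
    {"wordC": 0.3, "wordE": 0.3, "wordF": 0.1},
]
groups, history = cluster(sets, min_similarity=0.5)
```

With the four sample sets and a threshold of 0.5, the first three sets end up together and the fourth stays on its own. The similarities in `history` give the measure asked for: a sharp drop from one merge to the next suggests a natural place to stop, while `min_clusters` forces a stop at a fixed count such as 8 or 2.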

If you are familiar with nltk or similar, this should be a simple task.

Thanks.

Skills: Mathematics, Natural Language, Python

About the Employer:
(83 reviews) Rockville, United States

Project ID: #4735434

Awarded to:

laituan245

Hello Sir. I have a really strong understanding of NLP. In addition, I have worked with nltk for quite a long time. Therefore, I can certainly handle this task.

$55 USD in 3 days
(0 Reviews)
0.0

2 freelancers are bidding an average of $78 for this job

ouyongbin

I am an expert in mathematics and algorithms.

$100 USD in 3 days
(2 Reviews)
4.2