Cancelled

Genome assembly software

I need software for genome assembly based on an algorithm of mine. In genome assembly, the software is given millions of strings (reads) and merges pairs of strings based on a detected overlap between them. For example:

AAGTTAAATGAGA and GTTCCCAAGTTAA will merge into:

GTTCCCAAGTTAAATGAGA because AAGTTAA is an overlapping string between them (the suffix of the second read and the prefix of the first). So basically, for every set of strings with an overlap between them, you are looking for the "shortest common superstring": the shortest string that all of them are substrings of.
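The pairwise merge described above can be sketched as follows. This is a minimal, CPU-only C++ illustration under stated assumptions, not the poster's algorithm (whose variations are not described here) and not an efficient implementation. The names `suffix_prefix_overlap` and `merge_reads` are hypothetical, chosen only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Length of the longest suffix of `left` that is also a prefix of `right`.
// Brute force: tries the longest candidate first, so it is O(n^2) per pair.
std::size_t suffix_prefix_overlap(const std::string& left, const std::string& right) {
    std::size_t max_len = std::min(left.size(), right.size());
    for (std::size_t len = max_len; len > 0; --len) {
        if (left.compare(left.size() - len, len, right, 0, len) == 0) {
            return len;
        }
    }
    return 0;
}

// Shortest common superstring of two reads: handle containment first,
// then try both orders and keep the one with the larger overlap.
std::string merge_reads(const std::string& a, const std::string& b) {
    if (a.find(b) != std::string::npos) return a;  // b lies inside a
    if (b.find(a) != std::string::npos) return b;  // a lies inside b
    std::size_t ab = suffix_prefix_overlap(a, b);  // a followed by b
    std::size_t ba = suffix_prefix_overlap(b, a);  // b followed by a
    if (ab >= ba) return a + b.substr(ab);
    return b + a.substr(ba);
}
```

For the two reads in the example, `suffix_prefix_overlap("GTTCCCAAGTTAA", "AAGTTAAATGAGA")` finds the 7-character overlap AAGTTAA. Comparing every pair of reads this way is quadratic in the number of reads, which is why an indexed approach is needed at the scale of millions of reads.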

I have some variation to this process, which I'll explain in detail later.

I need the software to run in parallel on CUDA. The amount of data here is massive: millions of strings, each of length 40 to 150. There are known algorithms for quickly constructing overlap graphs using hash tables, prefix graphs, etc. (look at the ABySS assembler, for example).
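One way a hash table can avoid comparing every pair of reads is to index each read by its first k characters and then look up the k-mers of every other read in that index. The sketch below is a single-threaded C++ illustration of that idea only. It is not CUDA code, it is not how ABySS works internally, and the names `OverlapEdge`, `build_prefix_index`, `find_overlaps` and the parameter `k` (minimum overlap length) are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// One directed edge of the overlap graph: a suffix of reads[from]
// of the given length equals a prefix of reads[to].
struct OverlapEdge {
    std::size_t from;
    std::size_t to;
    std::size_t length;
};

// Map the first k characters of each read to the indices of the reads
// that start with them. Reads shorter than k are skipped.
std::unordered_map<std::string, std::vector<std::size_t>>
build_prefix_index(const std::vector<std::string>& reads, std::size_t k) {
    std::unordered_map<std::string, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i < reads.size(); ++i) {
        if (reads[i].size() >= k) {
            index[reads[i].substr(0, k)].push_back(i);
        }
    }
    return index;
}

// Report every suffix-prefix overlap of length >= k between distinct reads.
std::vector<OverlapEdge> find_overlaps(const std::vector<std::string>& reads,
                                       std::size_t k) {
    std::vector<OverlapEdge> edges;
    if (k == 0) return edges;
    auto index = build_prefix_index(reads, k);
    for (std::size_t i = 0; i < reads.size(); ++i) {
        const std::string& r = reads[i];
        for (std::size_t p = 0; p + k <= r.size(); ++p) {
            auto it = index.find(r.substr(p, k));
            if (it == index.end()) continue;
            std::size_t len = r.size() - p;  // candidate overlap length
            for (std::size_t j : it->second) {
                if (j == i) continue;
                const std::string& s = reads[j];
                if (len > s.size()) continue;  // containment is ignored in this sketch
                // The k-mer hit is only a seed; verify the full suffix.
                if (r.compare(p, len, s, 0, len) == 0) {
                    edges.push_back({i, j, len});
                }
            }
        }
    }
    return edges;
}
```

With the two example reads and `k = 5`, this yields a single edge from GTTCCCAAGTTAA to AAGTTAAATGAGA with overlap length 7. Each read's lookups are independent of the others, which is the property a CUDA port would presumably exploit, although a GPU version would need a device-side hash table and fixed-width k-mer encoding rather than `std::string` keys.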

There should be at least a basic GUI to load a file, watch the progress and get statistics about the output.

THE MAIN TASK HERE IS: The algorithm must be efficient time-wise. It should be able to process hundreds of millions of strings in a few hours (some genome assemblers, such as SOAPdenovo, are even faster than that, though they use a different approach from simple overlap detection, namely the de Bruijn graph). It is very simple to write the software, but it is not that simple to write it efficiently.

Bid only if you are willing to read about the subject (the biological background). I will provide you with a list of many similar programs, some of which are open source, so you can make use of them.

I am a programmer myself and I have the code written in Perl and in MATLAB (not very efficiently). It is not good enough to help you; I mention it only to let you know that I will be able to understand your questions and answer them.

Skills: Biology, C Programming, CUDA, Data Processing, Scientific Research

See more: genome assembly software, cuda assembler genome, shortest common superstring code, genome assembly, genome, cuda genome assembly, genome assembly matlab, which graph to use for data, use of graph, use of algorithms in programming, use of algorithms, use of algorithm in programming, strings in c programming, string programming, string processing in c, string processing algorithms, string processing algorithm, string prefix, string hash, string algorithms, string algorithm, statistics algorithms, statistics algorithm, software algorithms, simple algorithms in c

About the Employer:
(32 reviews) Haifa, Israel

Project ID: #1605443

20 freelancers are bidding an average of $1270 for this job

managonz

Hello, I am an expert in C/C++ and I am very experienced in optimizing algorithms. Quality job guaranteed.

$750 USD in 15 days
(4 Reviews)
4.3
zeke

Very interested to work on this project. Available to start immediately and finish as soon as possible. Please contact in PMB to discuss details if you are interested in the offer. Best Regards, Zeke

$1500 USD in 10 days
(13 Reviews)
4.1
vrts

Hi there! I am familiar with CUDA. Could you explain in more detail? Thank you.

$1200 USD in 20 days
(4 Reviews)
4.0
rakib062

Hi, please check your private messages. Thanks.

$2000 USD in 30 days
(12 Reviews)
4.1
tulebaev

Here is an excellent example of using CUDA technology in bioinformatics: CUDASW ([url removed, login to view])

$750 USD in 15 days
(11 Reviews)
3.4
ahmedsamieh

I have the experience to finish your project as you request, using OpenMP, MPI or even OpenCL to get the best performance. Please check your PMB.

$1300 USD in 3 days
(1 Review)
2.6
jartieda

I am very interested in the project. I have sent more details to your PMB.

$1100 USD in 30 days
(0 Reviews)
4.4
Vectorecho

Greetings! I am very interested in your project and would gladly do the work you need doing. Check your inbox for details. Sincerely

$750 USD in 14 days
(0 Reviews)
0.0
rathore123

This is also the subject of my thesis work, so you can be confident about it.

$800 USD in 10 days
(0 Reviews)
0.0
Ashunam

I have been reading a lot about computational molecular biology for my MSc research for the past year. Sequence alignment, multiple sequence alignment, protein folding (homology based, ab initio based), motif identific...

$2000 USD in 30 days
(0 Reviews)
0.0
N7j3J2pOS

Pls check PMB.

$1500 USD in 1 day
(0 Reviews)
0.0
v3borg

Your project looks interesting. I have already worked on genomics with a US company.

$1500 USD in 15 days
(0 Reviews)
0.0
felimuno

I'm interested.

$800 USD in 10 days
(0 Reviews)
0.0
VirtualHipster

Hi. My name is Shad Nygren. I am interested in bidding on your project. I have a Master's in Computer Science, and for my thesis I did some similar sequence alignment and wrote code in C++ [url removed, login to view]...

$1500 USD in 30 days
(0 Reviews)
0.0
ssb221

Dear Sir, with my previous GPU (CUDA) development experience in multiple sequence alignment and the Levenshtein and Needleman-Wunsch algorithms, I am sure I can tackle your software very efficiently. Please check your personal e...

$1100 USD in 20 days
(0 Reviews)
0.0
KireSopov

Hi, please check PMB

$850 USD in 20 days
(0 Reviews)
0.0
Tiayyba

PhD in Bioinformatics, excellent programming skills in C/C++, and good knowledge of CUDA.

$1500 USD in 12 days
(0 Reviews)
0.0
Daynix

Hello, I'm happy to bid for this project as it seems to perfectly match the combination of my skills: I have both an MS degree in genetics and 20 years of experience in software development. I'm about to use C/C++ fo...

$1350 USD in 5 days
(0 Reviews)
0.0
HEbOy3D60

Pls check PMB.

$1500 USD in 1 day
(0 Reviews)
0.0
przemluk

Hi, I can propose two solutions here: 1 - creating the whole application from scratch using CUDA, all in C++ (this option will take 7-14 days); 2 - rebuilding your MATLAB application so that it uses CUDA and speed up...

$1000 USD in 14 days
(0 Reviews)
0.0