Closed

Search for headings in pages in a PDF using python

I want to extract the titles from PDF pages and match them against a search query. See the attached file for an example.

In the attached file, if I search for "Balance Sheet", the code should return page 232.

So the input will be a string and the output will be a page number (an integer value).

Note that "balance sheet" may appear in multiple places, but we want to return only the pages where it appears in the title.

If you have previously used pdfminer, this should be easy for you. I'm open to other core languages like Java.

You can also explore the pdftitle library, if that works.

Speed and accuracy are the priorities. We tried doing it with PyPDF, but it was not accurate enough, so keep that in mind.
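
To make the requirement concrete, here is a minimal sketch of one possible approach. It assumes pdfminer.six is installed and that titles are set in a noticeably larger font than the body text on the same page; neither assumption has been checked against the attached file. The names `iter_page_lines` and `find_title_pages` and the `size_ratio` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

Line = Tuple[str, float]  # (line text, largest font size on that line)


def iter_page_lines(pdf_path: str) -> Iterator[List[Line]]:
    """Yield, for each page, its text lines with the largest glyph size per line."""
    # Imported lazily so the matching logic below works without pdfminer.six.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    for page in extract_pages(pdf_path):
        lines: List[Line] = []
        for element in page:
            if not isinstance(element, LTTextContainer):
                continue
            for text_line in element:
                if not isinstance(text_line, LTTextLine):
                    continue
                sizes = [ch.size for ch in text_line if isinstance(ch, LTChar)]
                text = text_line.get_text().strip()
                if text and sizes:
                    lines.append((text, max(sizes)))
        yield lines


def find_title_pages(
    pages: Iterable[List[Line]], query: str, size_ratio: float = 1.2
) -> List[int]:
    """Return 1-based page positions where the query occurs in a title-like line.

    Heuristic (an assumption): a line counts as a title when its font size is
    at least size_ratio times the page's dominant (body) font size.
    """
    needle = " ".join(query.lower().split())
    hits: List[int] = []
    for page_number, lines in enumerate(pages, start=1):
        if not lines:
            continue
        # Dominant size = the size that covers the most characters on the page.
        weights: Counter = Counter()
        for text, size in lines:
            weights[round(size, 1)] += len(text)
        body_size = weights.most_common(1)[0][0]
        for text, size in lines:
            normalized = " ".join(text.lower().split())
            if needle in normalized and size >= size_ratio * body_size:
                hits.append(page_number)
                break
    return hits
```

A call such as `find_title_pages(iter_page_lines("report.pdf"), "Balance Sheet")` (with a hypothetical file name) would return the matching positions as a list. These are positions within the file, which may differ from the page numbers printed in the document, and a page containing only a title line will not match because there is no body text to compare against.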

We can provide many other example documents if needed.

Skills: Python, Data Mining, PDF, Java

About the Client:
(2 reviews) Gurgaon, India

Project ID: #32749279

14 freelancers are bidding an average of ₹24821 for this job

(130 Reviews)
7.0
(37 Reviews)
6.6
(91 Reviews)
5.8
suyashdhoot

Hi, I am a very experienced statistician, data scientist and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several comp…

₹35000 INR in 7 days
(38 Reviews)
6.0
VladProkopchuk

Hello Sir! I think I'm a great fit for this project because I have an interest in your project and can deliver on time, according to your specifications.

₹25000 INR in 7 days
(10 Reviews)
4.4
JaibhanSinghGaur

Hello sir, I can make this for you. I am a Python developer with more than 2 years of experience. I have done many projects in the past. I can work on: 1. Web Scraping / Data Science / ML 2. Django 3. App development 4. …

₹12500 INR in 2 days
(37 Reviews)
4.6
(3 Reviews)
4.5
RomanRut

Hello sir, I've read your job posting carefully. I will successfully find the titles in the PDF. Here are my Python skills: Data Visualization (cryptocurrency trading bot, stock prediction, prediction algorithm for Spo…

₹37500 INR in 3 days
(1 Review)
2.6
yinshu2020

Professional Python & PDF Processing Expert! Best Result in Time! Dear sir, I've read your project description very carefully. I have extensive experience in Python & PDF processing, so I belie…

₹25000 INR in 7 days
(2 Reviews)
2.1
MUKUND12

I want to volunteer for your encoding and decoding project. If you feel I am worth it, you can give it a try. I will share an image of the output for your confidence and only then ask for payment. If you want you can…

₹35000 INR in 7 days
(0 Reviews)
0.0
mldlaids

Hi. I am a data scientist. I am very familiar with deep learning APIs such as TensorFlow, fastai and MXNet. I have good hands-on experience with advanced R and Python, BI tools and technologies, AI and big data. I have qu…

₹25000 INR in 7 days
(0 Reviews)
0.0
nsv91

I am a software developer and will be able to do the above-mentioned task in 7 days.

₹15000 INR in 7 days
(0 Reviews)
0.0
aakaakar

We can build this using Tesseract and OpenCV. Using NLP, we can also use pdfminer. Alternatively, we can use AWS Textract.

₹25000 INR in 7 days
(0 Reviews)
0.0
HafetzAzahari

I am an expert in data entry, typing, editing, etc. If you hire me for this project, I assure you that I will complete it on time. Thank you.

₹25000 INR in 7 days
(0 Reviews)
0.0