
Closed
Posted
Paid on delivery
I have 50 high-resolution JPG/PNG pages that record Viennese street name changes, mapping the historic names to the current ones. I need these pages converted into editable text with publication-grade accuracy. Because the files are already digital, no physical scanning is required; the job is all about precise OCR and meticulous proofreading. You may use any robust engine you prefer (ABBYY FineReader, a finely tuned Tesseract model, or similar), but the end result must be spotless German text, including every umlaut and ß exactly as printed. The layout is simple: two columns per entry, so please preserve that structure. Please report any additional information and any problems you encounter.

Deliverables
• A UTF-8 Excel or CSV file with columns: old_name | new_name | page_number
• A fully searchable PDF that mirrors the original pages for reference

Acceptance criteria
• ≥ 99 % character accuracy when compared to the source images
• All 55 pages accounted for, with no omissions or duplicates
• Correct diacritics and exact spellings for every street name

Once I verify a random sample against the originals and it meets these standards, the project is complete.

The formatting rules are more complex than they seem. The only fixed thing is the 6-column structure: before, after, before, after, before, after (apart from occasional shifts caused by bad alignment of the scan). Beyond that, a curly bracket shows which old names map to which new names, and the bracket can be on either side. In effect, the curly brackets determine deterministically how each page has to be read. Note, however, that any approach that is too eager to rasterize by row height seems to fall apart. The output should therefore be a mapping between old and new names and numbers, that is, multiple values to multiple values. You will have to spot-check regularly to control the AI's output (I had the best results with Gemini, but it was still off occasionally, so I had to babysit it).
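As a rough illustration of the requested output, the sketch below flattens one bracket group into old_name | new_name | page_number rows. It assumes that every old name inside a bracket group pairs with every new name in that group, which is only one possible reading of the brief. The function names and the street names are hypothetical placeholders, not taken from the pages.

```python
import csv
import io
from itertools import product


def expand_group(old_names, new_names, page_number):
    """Return one (old_name, new_name, page_number) row per old/new pair.

    Assumption: a bracket group links every old name to every new name.
    """
    return [(old, new, page_number) for old, new in product(old_names, new_names)]


def write_rows(rows):
    """Serialize rows to CSV text with the header requested in the brief."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["old_name", "new_name", "page_number"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical group: two old names bracketed onto one new name on page 12.
rows = expand_group(["Alte Gasse", "Obere Straße"], ["Neue Gasse"], 12)
csv_text = write_rows(rows)
print(csv_text)
```

When the rows are written to disk instead, opening the file with encoding="utf-8-sig" adds a byte-order mark, which Excel generally uses to recognize UTF-8 and display umlauts and ß correctly.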
Project ID: 40382231
7 proposals
Remote project
Active 26 days ago
7 freelancers are bidding on average ₹21,381 INR for this job

With regard to your project needs, I find that my extensive skills in data entry, data processing, Excel handling, PDF conversion and proofreading match them perfectly. Having dealt with similar OCR projects before, including ones with specific language requirements such as yours, I am confident I can provide quality output that fits the intricate nature of your Viennese street documentation. Accuracy is key here, which is why we follow a stringent quality assurance process in all our tasks. With your acceptance criteria firmly defined, we will focus on delivering beyond the required standards. To further streamline the process, we can employ appropriate collaboration and project management tools to ensure effective real-time communication between us.
₹25,000 INR in 7 days
6.8

Hi there. I am confident in my ability to handle your Viennese OCR project with the utmost precision and attention to detail. Throughout my career, I have consistently applied my skills to deliver accurate results while completing tasks efficiently. My meticulous approach to proofreading and formatting aligns perfectly with what you seek for your project. Having worked on translating scientific and technical texts, I have a disciplined eye that would ensure every name conversion is accurately reflected, preserving the structure of two columns per entry. By hiring me, you are ensuring top-quality work, delivered not only as the UTF-8 Excel or CSV file but also as a fully searchable PDF mirroring the original pages for any future reference. In summary, my skills in proofreading, formatting and data processing, along with my comprehension of German as written in Germany and Austria, make me an ideal fit for this OCR project. Rest assured that I'll adhere strictly to your acceptance criteria, and our project will be deemed complete once you've verified a random sample against the originals and it meets all your standards.
₹16,666.70 INR in 10 days
3.7

Hi, I’m Manish Saini from PrimePixel. I’ll handle precise OCR and manual proofreading of your Viennese street-name pages, carefully preserving complex mappings (including curly-bracket relationships) and ensuring flawless German text with correct umlauts and ß, while structuring everything into a clean, accurate dataset. You’ll receive a well-validated Excel/CSV and a searchable PDF, with consistent spot-checking to guarantee ≥99% accuracy.
₹30,000 INR in 4 days
0.0
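The ≥ 99 % character-accuracy criterion from the brief could be checked on a spot-check sample roughly as follows. This is a minimal sketch that assumes accuracy is defined as 1 minus the edit distance divided by the length of the hand-verified reference text; the brief does not state which definition the client uses. Both strings are normalized to NFC first so that precomposed and decomposed umlauts compare as equal. The function names are hypothetical.

```python
import unicodedata


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def character_accuracy(reference, ocr_text):
    """Assumed definition: 1 - edit_distance / len(reference), floored at 0."""
    reference = unicodedata.normalize("NFC", reference)
    ocr_text = unicodedata.normalize("NFC", ocr_text)
    if not reference:
        return 1.0 if not ocr_text else 0.0
    return max(0.0, 1.0 - edit_distance(reference, ocr_text) / len(reference))


# Placeholder strings: an ß misread as "ss" counts as two character errors.
accuracy = character_accuracy("Straße", "Strasse")
print(f"{accuracy:.3f}")
```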

Before I quote this confidently, one thing I need to know: is the text printed or handwritten? Printed German in high-res scans is a solved problem with Tesseract's deu model and proper pre-processing. Handwritten changes the pipeline entirely and would shift the timeline and price. Assuming printed: I'd start with deskew and contrast normalization, since old street directory scans typically have uneven exposure and slight page curl. Tesseract runs with the German language model, and anything below a confidence threshold gets flagged for spot-check rather than forcing a full proof pass. You get clean output and a short list of uncertain lines, not a wall of corrections to work through. Two things that'd help before I scope the final delivery: what output format do you need (Excel, CSV, JSON), and do you need both the historical street name and modern equivalent, or just the current spelling? INR 28,000, 7 days.
₹28,000 INR in 7 days
0.0
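The confidence-threshold idea in the bid above can be sketched as a simple filter. The example assumes the OCR step has already produced (text, confidence) pairs on a 0 to 100 scale, similar to the per-word confidence values that Tesseract reports in its TSV output; the threshold of 80 and the function name are illustrative assumptions, not values from the bid.

```python
def split_by_confidence(words, threshold=80.0):
    """Split OCR words into accepted and flagged-for-review lists.

    words: iterable of (text, confidence) pairs; empty or whitespace-only
    texts (layout entries without a recognized word) are skipped.
    """
    accepted, flagged = [], []
    for text, conf in words:
        if not text.strip():
            continue
        if conf >= threshold:
            accepted.append((text, conf))
        else:
            flagged.append((text, conf))
    return accepted, flagged


# Placeholder data standing in for real OCR output.
words = [("Straße", 96.0), ("Gasse", 91.5), ("Pl atz", 42.0), ("", -1.0)]
accepted, flagged = split_by_confidence(words)
print(flagged)
```

Only the flagged list would then go to a human proofreader, which is the "short list of uncertain lines" the bid describes.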

Dear Client,

I've carefully reviewed your project requirements, and this is exactly the kind of precision-focused OCR and data structuring work I specialize in. You're absolutely right: this task is not just about OCR, but about controlled interpretation of structured content, especially with complexities like:
• Multi-column (6-column) layout: before / after, repeated
• Curly bracket logic indicating many-to-many mappings
• Potential misalignment from scans
• Strict requirement for German orthographic accuracy (umlauts, ß, spacing, capitalization)

My Approach
I will handle this project in a multi-stage, accuracy-first workflow:

1. OCR Engine Selection & Optimization
• Use a combination of ABBYY FineReader and a tuned Tesseract (German language model)
• Run comparative OCR passes to minimize engine-specific errors
2. Layout-Aware Parsing
• Avoid naive row-based extraction (as you correctly pointed out)
• Reconstruct entries using column positioning, bracket logic detection, and contextual grouping of street names
3. Mapping Logic Implementation
• Interpret curly brackets to correctly map one-to-one, one-to-many, and many-to-many relationships
• Normalize into a clean output format: old_name | new_name | page_number
4. Manual Proofreading (Critical Step)
• Line-by-line verification
• Special attention to umlauts (ä, ö, ü), ß vs ss, and historical vs modern spelling distinctions
5. Quality Control
• Random and systematic spot checks
• Cross-validation between OCR outputs
₹12,500 INR in 7 days
0.0
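The cross-validation step in the bid above (comparing the output of two OCR engines) could look roughly like this. It is a minimal sketch using Python's standard difflib module on whitespace-separated tokens; the two input strings are placeholders, and the bid does not say how the comparison would actually be done.

```python
from difflib import SequenceMatcher


def disagreements(text_a, text_b):
    """Return (tokens_a, tokens_b) pairs where two OCR outputs differ."""
    a, b = text_a.split(), text_b.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Placeholder outputs from two hypothetical engine runs on the same line.
engine_a = "Alte Gasse Obere Straße"
engine_b = "Alte Gasse Obere Strasse"
diffs = disagreements(engine_a, engine_b)
print(diffs)
```

Tokens on which both engines agree can be given lower proofreading priority, while each disagreement, such as an ß versus ss mismatch, points to a spot that needs a look at the source image.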

Hello, I would be a strong fit for this project because I can combine OCR-supported extraction with careful manual proofreading in German to reach publication-grade quality. I understand that this is not just about converting images into text, but about delivering clean, editable, reliable output with correct umlauts, ß, punctuation, formatting consistency, and preserved structure. I can also keep the original two-column logic and flag any unclear passages or damaged sections in a separate notes file. My approach would be: extract the text efficiently, manually proofread page by page, verify names and spelling consistency, preserve layout as closely as possible, deliver a clean final file ready for further use. I work in a very detail-oriented way and know that projects like this are won or lost in the QA step, not in the OCR step. If useful, I can also do 1 sample page first so you can judge the quality before proceeding. Although I am new on Freelancer, I take deadlines and accuracy seriously and would aim to deliver a result that earns your confidence and a strong review. Best regards Sergej W.
₹25,000 INR in 3 days
0.0

Rajasthan, India
Member since Apr 14, 2026