
In Progress
Paid on delivery
Project Type
Fixed Price (Phase 1 Prototype)
Potential long-term work after Phase 1

Project Overview
We are building a commercial prototype (not a research project) that predicts how water quality changes after installing nanobubble water treatment units, and recommends the number of units required before installation. We already have real-world historical lab results (before/after installation) across multiple sites and water types. Most data exists as PDF lab reports with monthly measurements. Your job is to convert these datasets into a usable time-series format, build baseline + physics-informed models, and produce a first working “unit sizing engine”.

Phase 1 Deliverables (Must Deliver)
1) Data Pipeline (PDF → structured dataset)
   a) Extract water quality values from PDF lab reports
   b) Normalize parameters + units (DO, ORP, BOD, COD, turbidity, etc.)
   c) Create a clean time-series dataset per site/month
   d) Tag before vs after installation
   e) Output a reusable dataset schema (CSV/Parquet + clear dictionary)
2) Baseline predictive model (benchmark)
   a) Build a baseline time-series predictor for post-installation evolution
   b) Report validation performance and feature importance
   c) Must generalize to held-out sites (not just fit the same site)
3) Hybrid physics-informed / PINN-style model
   a) Implement physics constraints realistically (guardrails, not overkill)
   b) Compare baseline vs PINN/hybrid generalization
   c) Document assumptions and failure modes clearly
4) Unit Sizing Recommendation Logic (core output)
   a) Translate predicted curves into a recommended number of units
   b) Provide uncertainty handling (confidence bands or reliability rating)
   c) Must be explainable for non-technical stakeholders
5) Handover package
   a) Clean Python codebase + README
   b) Validation report (results, risks, limitations)
   c) “Plain English” summary for business stakeholders

Tech Stack (Preferred)
1. Python 3.10+
2. PyTorch
3. PINN tooling: DeepXDE or custom PINN in PyTorch
4. pandas, numpy, scikit-learn
5. PDF extraction: pdfplumber + (Camelot/Tabula if needed)
Optional: MLflow or W&B for experiment tracking
Optional: FastAPI or Streamlit for demo

Required Experience (Non-Negotiable)
Proven experience with time-series forecasting on messy real-world datasets
Hands-on experience with Physics-Informed ML / PINNs (show work, not theory)
Strong Python engineering (clean modular code, reproducible experiments)
Experience building validation frameworks (train/test split by site, not random rows)
Comfortable working with incomplete/inconsistent reports and documenting assumptions

Nice to Have
1. Environmental / water treatment / hydrodynamics background
2. Experience integrating external signals (weather, tides, seasonality)
3. Uncertainty estimation (prediction intervals, confidence scoring)
4. Packaging models for use by non-technical teams

What Success Looks Like (Go / No-Go)
Phase 1 is successful if:
1. Model can generalize to held-out sites
2. Unit recommendation is directionally correct and explainable
3. Assumptions and failure cases are clearly documented
4. A repeatable pipeline exists for adding future project data

To Apply (Mandatory)
Reply with:
1. One example where you used PINNs / physics-informed ML (link or summary)
2. Your approach to handling PDF lab reports → clean dataset
3. How you would prevent overfitting across sites
4. Estimated time to complete Phase 1
5. Your fixed price for Phase 1 prototype

⚠️ Applications that only include generic AI buzzwords will be ignored. We want a builder who can deliver a working prototype.
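One way the unit sizing logic in deliverable 4 could work is sketched below in Python. Everything here is an assumption for illustration: the names `recommend_units` and `toy_do_band`, the (lower, median, upper) band interface, and the toy saturation curve are hypothetical and not taken from any real model or data. The idea is to pick the smallest unit count whose lower confidence band still meets the target, which keeps the recommendation conservative and easy to explain to non-technical stakeholders.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interface: a predictor maps a unit count to a
# (lower, median, upper) band for one water-quality parameter.
Band = Tuple[float, float, float]


def recommend_units(
    predict_band: Callable[[int], Band],
    target: float,
    max_units: int = 20,
) -> Optional[int]:
    """Return the smallest unit count whose lower band meets the target.

    Using the lower band instead of the median is a conservative choice.
    Returns None when no count up to max_units is sufficient.
    """
    for n_units in range(1, max_units + 1):
        lower, _median, _upper = predict_band(n_units)
        if lower >= target:
            return n_units
    return None


def toy_do_band(n_units: int) -> Band:
    """Made-up placeholder model: dissolved oxygen (mg/L) rises with the
    unit count and saturates. Not derived from any real measurements."""
    median = 2.0 + 6.0 * (1.0 - 0.7 ** n_units)
    return (median - 0.5, median, median + 0.5)


if __name__ == "__main__":
    # Smallest count whose lower band reaches 5.0 mg/L under the toy model.
    print(recommend_units(toy_do_band, target=5.0))
```

A `None` result is itself useful output: it signals that the target is out of reach under the model's assumptions and should be reported as such, not rounded to the maximum count.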
Project ID: 40172180
23 proposals
Remote project
Active 4 mos ago
23 freelancers are bidding on average ₹25,863 INR for this job

Hello, I trust you're doing well. I am well experienced in machine learning algorithms, with nearly a decade of hands-on practice. My expertise lies in developing various artificial intelligence algorithms, including the one you require, using Matlab, Python, and similar tools. I hold a doctorate from Tohoku University and have a number of publications in the same subject. My portfolio, which showcases my past work, is available for your review. Your project piqued my interest, and I would be delighted to be part of it. Let's connect to discuss in detail. Warm regards. Please check my portfolio: https://www.freelancer.com/u/sajjadtaghvaeifr
₹45,000 INR in 7 days
7.3

Hi, I’m a data scientist / ML engineer with hands-on experience building time-series models on messy industrial data and physics-informed ML (PINNs) that generalize to unseen sites. This project closely matches my recent work.

1) Physics-Informed ML Experience
I build hybrid PINN-style models where domain knowledge acts as soft guardrails, not rigid PDE solvers. Typical setup:
• Data-driven backbone (LSTM / Temporal CNN / Neural ODE)
• Physics constraints (bounds, monotonicity, mass-balance–style penalties) added to loss
• Explicit comparison vs strong baselines on held-out sites
Focus is on generalization and explainability, not academic overfitting.

2) PDF Lab Reports → Clean Time-Series
• pdfplumber for text + tables; Camelot/Tabula only where reliable
• Parameter & unit normalization (DO, ORP, BOD, COD, turbidity, etc.)
• Monthly site-level time indexing
• Clear tagging: pre- vs post-installation
• Output: reusable CSV/Parquet schema + data dictionary
All assumptions and gaps are explicitly logged.

3) Preventing Overfitting Across Sites
• Train/validation split strictly by site
• Global models with site-invariant features (+ optional embeddings)
• Leakage detection via strong baselines
• Early stopping on unseen sites + targeted failure analysis

4) Timeline (Phase 1)
2 weeks total (pipeline → baseline → hybrid PINN → unit sizing + docs)

I focus on working commercial prototypes, clean Python code, and explainable outputs for non-technical teams.
₹25,000 INR in 7 days
4.4
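A minimal sketch of the "physics constraints added to loss" idea from this proposal, written in plain Python so it runs without dependencies. The function names, the penalty weight, and the choice of a bounds term plus a monotonicity term are illustrative assumptions, not the bidder's actual code; in a PyTorch model the same terms would be computed on tensors so that gradients flow through them.

```python
from typing import Sequence


def data_loss(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean squared error between predictions and lab measurements."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred)


def bounds_penalty(pred: Sequence[float], lower: float, upper: float) -> float:
    """Mean squared violation of [lower, upper]; zero when all values fit."""
    total = 0.0
    for p in pred:
        total += max(lower - p, 0.0) ** 2 + max(p - upper, 0.0) ** 2
    return total / len(pred)


def monotonic_penalty(pred: Sequence[float]) -> float:
    """Penalize month-to-month decreases.

    Assumes (for illustration only) that the parameter should not fall
    after installation; the right constraint depends on the parameter.
    """
    drops = [max(a - b, 0.0) ** 2 for a, b in zip(pred, pred[1:])]
    return sum(drops) / len(drops) if drops else 0.0


def hybrid_loss(
    pred: Sequence[float],
    obs: Sequence[float],
    lower: float,
    upper: float,
    weight: float = 0.1,  # hypothetical guardrail strength
) -> float:
    """Data fit plus soft physics guardrails, combined as a weighted sum."""
    physics = bounds_penalty(pred, lower, upper) + monotonic_penalty(pred)
    return data_loss(pred, obs) + weight * physics
```

Because the guardrails are soft penalties, a prediction that violates them is discouraged but not forbidden, which is what separates this style from a hard-constrained PDE solver.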

Hi there, This project aligns with work I have already delivered in physics-constrained forecasting and messy industrial time-series pipelines. I recently supported an applied engineering team by turning PDF lab reports into site-level time series and delivering a prototype model that generalized across unseen assets. My experience covers PDF extraction, unit normalization, time-series modeling, and PINN-style constraints using Python, PyTorch, pandas, scikit-learn, and pdfplumber. I handle end-to-end delivery, including data validation, model benchmarking, and a clean handoff with documented assumptions and limits. I reduce risk by enforcing site-level train/test splits, simple physics guardrails, and explicit uncertainty reporting instead of overfitting complex models. I can start immediately and stay accountable through completion. Regards Chirag
₹25,000 INR in 7 days
4.4

Hi there, I’m excited to help build your water quality prototype and unit sizing engine. I specialize in time-series forecasting on messy real-world datasets and have hands-on experience with Physics-Informed Neural Networks (PINNs). I’ll extract data from PDF lab reports using pdfplumber/Camelot, normalize units (DO, BOD, COD, turbidity, etc.), and generate clean time-series datasets per site/month, tagged before vs after installation. I’ll build a baseline PyTorch model and a hybrid PINN to generalize across sites, with uncertainty estimates and explainable outputs. Predictions will translate into recommended unit counts, packaged in clean Python code with a plain-English summary for stakeholders. I prevent overfitting by splitting data by site, applying regularization, and leveraging physics constraints. Estimated Phase 1 delivery: 3-4 weeks; fixed price: 35k INR. I’ve implemented PINNs for environmental systems, improving generalization while respecting real-world physics. I can deliver a working, explainable prototype that meets all success criteria and provides a repeatable pipeline for future data. Regards, Ahmad
₹35,000 INR in 7 days
4.4

As a qualified and experienced data scientist, I would be a perfect fit for your Applied ML Engineering project. Over the years, I've successfully handled various projects similar to yours, applying machine learning techniques to real-world datasets, just as you require. In my portfolio, you'll find evidence of my competency with Physics-Informed Neural Networks (PINNs); I've built multiple robust models that demonstrate their strength in solving complex problems. Overfitting is always a concern when dealing with multiple sites. Given my extensive experience with validation in predictive modeling, specifically train/test splits by site instead of random rows, mitigating such risks comes as second nature to me. Anticipating potential issues in the core phases of your project, such as a dataset full of pitfalls, documenting assumptions, and applying a generalized model, is something we both value. Throughout Phase 1, rest assured you'll be working with a proactive professional dedicated to transparency and accountability at every stage of the process.
₹15,000 INR in 5 days
3.7

Hello there, I reviewed your project Applied ML Engineer (PINN + Time-Series) — Water Quality Prediction + Unit Sizing Prototype and understood the requirements at a high level. I focus on delivering clear, stable, and maintainable solutions aligned with the actual scope, I can work with Python, PDF, Machine Learning (ML) and follow a clean development process with proper structure and error handling. If this aligns with what you’re looking for, please come to chat to discuss further. Best regards
₹12,500 INR in 7 days
3.8

⭐ Hello there, My availability is immediate. I read your project post on Python AI/ML Developer for Water Quality Prediction + Unit Sizing Prototype. We are experienced full-stack Python developers with skill sets in:
• Python, Django, Flask, FastAPI, Jupyter Notebook, Selenium, Data Visualization, ETL
• AI/ML & Data Science: Model development, training & deployment, NLP, Computer Vision, Predictive Analytics, Deep Learning
• React, JavaScript, jQuery, TypeScript, NextJS, React Native
• NodeJS, ExpressJS
• Web App Development, Web/API Scraping
• API Development, Authentication, Authorization
• SQLAlchemy, PostgresDB, MySQL, SQLite, SQLServer, Datasets
• Web hosting, Docker, Azure, AWS, GCP, Digital Ocean, GoDaddy
• Python Libraries: NumPy, pandas, scikit-learn, TensorFlow, PyTorch, etc.
Please send a message so we can quickly discuss your project and proceed further. I am looking forward to hearing from you. Thanks
₹36,200 INR in 10 days
4.2

I understand this is a commercial prototype, not a research exercise. I can build a complete Phase-1 pipeline: extracting messy PDF lab reports into clean site-wise time-series data, developing a robust baseline model, and implementing a physics-informed hybrid model that generalizes to unseen sites. My focus will be on site-level validation (no data leakage), explainable predictions, and a practical unit-sizing recommendation engine with uncertainty handling. I’ll deliver a clean Python/PyTorch codebase, documented assumptions, and a plain-English summary for stakeholders. Estimated timeline: 7 days. Budget we can discuss. Happy to discuss prior PINN-style work and implementation details.
₹35,000 INR in 7 days
2.9

Hello Dear Client, I am not a professional, but I have a good command of Python and have done various ML data analyses with it; my last project was also a Python analysis project working with .py files. I think I can do this if I get clear details, and I am more flexible on budget than others. If you are interested, message me.
₹15,000 INR in 7 days
2.2

Hello, I’ve carefully reviewed your project requirements and clearly understand the tasks involved. I have 13 years of experience and strong expertise in the exact skills this project requires. I have successfully delivered similar projects before and can share relevant samples if needed. I will complete this within your expected timeline while maintaining quality and clear communication. I look forward to working with you and contributing sincerely to your project’s success.
₹25,000 INR in 7 days
2.6

Hi, thanks for sharing the details. I’ve worked with Tesseract OCR on document images where strong preprocessing is critical, especially for non-Latin scripts like Devanagari. Extracting handwritten text from photographs (not scans) requires careful handling of noise, lighting variations, skew, and background artifacts, which is where most OCR pipelines fail. For this task, I would focus first on image preprocessing using techniques like grayscale conversion, adaptive thresholding, denoising, contrast enhancement, and region-of-interest cropping to improve handwritten text recognition in Marathi. I’ve used Tesseract’s Devanagari language models and tuned configurations to get more stable results on real-world photos rather than clean documents. The extracted fields will be structured cleanly and exported to CSV format, with consistent column mapping so the output is easy to review and process further. I also make sure the pipeline is repeatable, so additional ration card images can be processed the same way without manual intervention. I can share sample outputs and screenshots from similar OCR work (handwritten and regional-language documents) during our discussion. Happy to review a few sample images first to confirm accuracy expectations before proceeding.
₹30,000 INR in 3 days
2.0

Hello, I’m Ankur, a freelance developer with a dedicated team of professionals. I read all your requirements for website and I assure you that I will provide high-quality work at the proper time. Additionally, we also provide you 3 months of support from our side. As a Full Stack Developer, I specialize in Web and App Development, boasting a portfolio of stunning projects with top-notch UI/UX design. My expertise spans Flutter (for both Android and iOS), PHP, and WordPress, and I bring over 7 years of experience to the table. Whether it’s websites, applications, or e-commerce platforms, I’ve got you covered. But I’m not limited to just coding. My skill set extends to graphic design and logo creation, offering you a one-stop solution for all your project needs. With a track record of over 500 completed projects, I am committed to delivering nothing short of excellence. My ultimate goal is your complete satisfaction. Thank you for considering me for your project. I’m ready to transform your vision into a reality that stands out in today’s competitive landscape. Best Regards, Ankur Hardiya
₹25,000 INR in 7 days
0.2

Hello, I have delivered physics-informed ML solutions, including a PINN model for fluid dynamics that incorporated physical constraints as soft guardrails to improve generalization beyond training sites. I can share detailed documentation from that project on request. For PDF lab reports, I use pdfplumber to extract text and tables, combined with Camelot for tabular data. I write robust parsers to normalize units and parameters, handling inconsistencies by cross-validating multiple extraction methods and building custom dictionaries for parameter mapping. To avoid overfitting across sites, I design train/test splits by site, not by random rows, and use regularization plus early stopping. I validate performance on held-out sites and compare physics-informed and baseline models to ensure robustness. Phase 1 would take 4-5 weeks, including data pipeline, baseline and PINN model, unit sizing logic, documentation, and validation. My fixed price for the prototype is $7,500. I look forward to discussing specific dataset details and physics constraint assumptions to deliver a practical, explainable unit sizing engine.
₹37,500 INR in 7 days
0.0
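The "custom dictionaries for parameter mapping" and unit normalization described in this proposal could look roughly like the sketch below. The alias table, the conversion factors shown, and the `normalize_reading` name are assumptions chosen for illustration; a real table would be built from the label spellings and units that actually appear in the lab reports.

```python
from typing import Dict, Tuple

# Hypothetical alias table: lab-report label -> canonical parameter name.
PARAMETER_ALIASES: Dict[str, str] = {
    "do": "DO",
    "dissolved oxygen": "DO",
    "orp": "ORP",
    "bod": "BOD",
    "biochemical oxygen demand": "BOD",
    "cod": "COD",
    "chemical oxygen demand": "COD",
    "turbidity": "turbidity",
}

# Hypothetical unit table: reported unit -> (canonical unit, multiplier).
UNIT_CONVERSIONS: Dict[str, Tuple[str, float]] = {
    "mg/l": ("mg/L", 1.0),
    "ug/l": ("mg/L", 0.001),
    "g/l": ("mg/L", 1000.0),
    "mv": ("mV", 1.0),
    "ntu": ("NTU", 1.0),
}


def normalize_reading(name: str, value: float, unit: str) -> Tuple[str, float, str]:
    """Map one extracted reading to (canonical name, value, canonical unit).

    Unknown labels or units raise ValueError so that gaps are logged and
    reviewed instead of being silently guessed.
    """
    key = name.strip().lower()
    unit_key = unit.strip().lower()
    if key not in PARAMETER_ALIASES:
        raise ValueError(f"unknown parameter label: {name!r}")
    if unit_key not in UNIT_CONVERSIONS:
        raise ValueError(f"unknown unit: {unit!r}")
    canonical_unit, factor = UNIT_CONVERSIONS[unit_key]
    return PARAMETER_ALIASES[key], value * factor, canonical_unit


if __name__ == "__main__":
    print(normalize_reading("Dissolved Oxygen", 6500.0, "ug/L"))
```

Failing loudly on unmapped labels fits the job post's requirement to document assumptions: every new spelling has to be added to the dictionary deliberately.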

I’ve reviewed your requirements and clearly understand this is a commercial Phase-1 prototype, not research, focused on turning messy PDF lab data into a generalizable, explainable unit-sizing engine for nanobubble water treatment.

Main focus areas:
• PDF lab report extraction → clean time-series datasets
• Site-aware normalization, tagging (pre/post installation)
• Baseline + physics-informed (PINN-style) modeling
• Cross-site generalization & validation (hold-out sites)
• Explainable unit-sizing recommendations with uncertainty

Process flow:
Quick chat / call → PDF data audit → Extraction & schema design → Baseline model → PINN / hybrid constraints → Unit sizing logic → Validation & documentation → Handover

I’ve worked on messy real-world time-series problems and physics-informed ML systems, building reproducible pipelines, guarding against site leakage, and documenting assumptions and failure modes for non-technical stakeholders. Phase-1 can be completed in 2–4 weeks, with a fixed-price prototype and clean handover suitable for future extensions. I focus on working models, clear validation, and decision-ready outputs, and I am happy to share PINN examples and walk through my approach on a quick call.
₹30,000 INR in 20 days
0.0

Hello, I’m a Python/ML engineer experienced in time-series modeling, messy real-world data pipelines, and physics-informed ML systems. I understand this is a commercial prototype, not a research experiment, and my focus will be on building a working, explainable, and generalizable solution.

1) PINN / Physics-Informed Experience
I’ve built hybrid ML models combining data-driven predictions with physical constraints (bounded dynamics, conservation rules, monotonic trends). These models improved generalization on unseen environments and prevented unrealistic predictions.

2) PDF → Clean Dataset Approach
I’ll build a robust ETL pipeline using pdfplumber + Camelot to extract lab parameters (DO, ORP, BOD, COD, turbidity, etc.), normalize units, tag before/after installation, and produce structured time-series datasets with a reusable schema and data dictionary.

3) Preventing Overfitting Across Sites
• Train/test split by site (not random rows)
• Leave-one-site-out validation
• Regularization + physics-based guardrails
• Cross-site performance reporting

4) Timeline
Phase 1 delivery in 2 weeks.
₹20,000 INR in 14 days
0.0
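The leave-one-site-out validation named in this proposal can be sketched in a few lines of dependency-free Python. The row layout (dicts with a `site` key) and the `leave_one_site_out` name are assumptions for illustration; scikit-learn's `LeaveOneGroupOut` offers the same kind of split for array data.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, object]  # hypothetical record: one site/month measurement


def leave_one_site_out(rows: List[Row]) -> Iterator[Tuple[str, List[Row], List[Row]]]:
    """Yield (held_out_site, train_rows, test_rows), one fold per site.

    Every row of the held-out site goes to the test set, so the model is
    always scored on a site it has never seen during training.
    """
    sites = sorted({str(r["site"]) for r in rows})
    for held_out in sites:
        train = [r for r in rows if str(r["site"]) != held_out]
        test = [r for r in rows if str(r["site"]) == held_out]
        yield held_out, train, test


if __name__ == "__main__":
    # Made-up example rows, not real measurements.
    example = [
        {"site": "A", "month": 1, "DO": 3.1},
        {"site": "A", "month": 2, "DO": 3.4},
        {"site": "B", "month": 1, "DO": 2.8},
        {"site": "C", "month": 1, "DO": 4.0},
    ]
    for site, train, test in leave_one_site_out(example):
        print(site, len(train), len(test))
```

Splitting random rows instead would let months from the same site land in both train and test, which is the leakage the job post explicitly rules out.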

With a proven track record in Python and a wide range of relevant skills, I am confident in delivering the robust and efficient prototype you seek. In previous projects, I have leveraged Physics-Informed ML or PINN to solve scientific problems that mirrored your real-world challenge. Specifically, I tackled a computational fluid dynamics problem by building and deploying a custom PINN model using PyTorch, which led to precise and reliable predictions. This expertise would translate excellently to your task as careful consideration of variables is required for both PINN and nanobubble water treatment units. Regarding the data pipeline from the PDF lab reports, my proficiency with the Python library 'pdfplumber' coupled with my meticulous approach to handling data will allow for accurate extraction, normalization, and creation of structured time-series datasets following a clear schema. Prioritizing transparency, I would extensively document every assumption made in the cleaning process and maintain my documentation accordingly as new data feeds into the pipeline. Minimizing overfitting across sites is key in any generalized model's success. I propose a train/test split strategy based on site locations, not on random rows, to ensure site-specific information doesn't bleed into other sites' validations. As for the deliverables you required within Phase
₹25,000 INR in 7 days
0.0

Hello, I’m a senior ML engineer with real, shipped experience in PINNs, physics‑informed forecasting, and messy time‑series pipelines. This project is exactly in my wheelhouse.

1. PINN Experience
Built PINN models for diffusion–reaction and hydrodynamic systems using PyTorch + DeepXDE, outperforming data‑only baselines on unseen locations.

2. PDF → Dataset Approach
pdfplumber + Camelot → unit normalization → site/month time‑series → install-date tagging → clean CSV/Parquet schema.

3. Preventing Overfitting
Strict site‑level splits, physics constraints as priors, regularization, and cross‑site residual checks.

4. Timeline & Price for Phase 1
4–6 weeks / $300 USD (final after sample PDFs).

I build working prototypes, not academic demos. Ready to start immediately.
₹20,000 INR in 30 days
0.0

Hi there, I'm Zaheer Mahomed and I'm excited about the opportunity to work on your Phase 1 Prototype project focused on predicting water quality changes post-installation of nanobubble water treatment units. I have a strong background in time-series forecasting on real-world datasets, hands-on experience with Physics-Informed ML/PINNs, and expertise in Python engineering for clean, modular code. One strategy I propose is to leverage my experience in handling incomplete and inconsistent reports by implementing a robust data normalization process to ensure accurate dataset creation. I am available to discuss further and look forward to contributing to your project's success. Thank you for your time and consideration. Best, Zaheer Mahomed
₹23,650 INR in 30 days
0.0

As someone who specializes in full-stack and Python development with over 5 years of experience, I have the breadth of expertise required to handle all the technical aspects of this project effectively. While my professional focus has been mainly in frontend, backend, and mobile app development, I've also honed strong skills in data management and processing using technologies like pandas, numpy, pdfplumber (a tool you mention), and machine learning libraries like PyTorch, a preferred stack for this project. My recent involvement in a complex time-series forecasting project involving noisy real-world datasets gave me firsthand experience in combating common issues like incomplete and inconsistent reports, through careful data cleaning and appropriate algorithms. Furthermore, I am well-versed in tackling the very challenge your project poses: implementing PINNs (Physics-Informed Neural Networks) in real-world systems. In fact, I've previously used a custom PINN model for hydrodynamic simulations that yielded strong results, something I'd love to share more about during our conversation. To address your specific concerns, like overfitting across sites and explaining model predictions to non-technical stakeholders, my approach is to develop a rigorous validation framework that explicitly splits the train/test data by site rather than by random rows. This helps ensure that the models generalize better to new samples from unseen sites.
₹20,000 INR in 7 days
0.0

Delhi, India
Payment method verified
Member since Nov 29, 2023