Completed

Python script for data cleaning using Pandas DataFrame

Please read all before bidding.

I need a Python script that cleans data from an Excel file and then saves the cleaned data to another Excel file.

I have written the structure of the script, including exactly what I expect and how it should be written. I also have code for reading and saving the files. What is missing is the data-cleaning part.

Detailed request:

#fill Regex in config file based on Type (manually)

#read confFile into config_DF

#save dataframe to modified_DF

# drop rows based on drop_rows in job file

#check the column names are exactly as in the conf file. order not important

#remove trailing and leading spaces in values

#replace cells with wrong format by empty string (check Regex column in configuration file)

# replace cells with wrong values (below min, over max, is zero ...) by empty string

#Fill missing values based on rules in conf df "Missing Value Fill Method" [using pandas, numpy and scipy only]

#implement methods for 'ma' (moving average of the previous and next available values from the same column; a missing first or last value takes the closest available value), 'lr' (linear regression using sklearn) and 'knn' (using fancyimpute, [login to view URL])

#apply sigma filtering on all columns based on the multiplier in the conf file. This should generate a df of bool, inSigmaDF, where each value is True if it lies within +/- (multiplier times sigma) of its column mean (after calculating mean and std for each column), and False otherwise. Then delete all rows in modified_DF that contain at least one False in inSigmaDF

#save modified_DF to outputFile

#each function above should return a dataframe or zero on success and a non-zero code 1,2,3 on failure/exception.

#if it works with a pipeline, modified_DF should be verified after each call; in case it is an int, return the int

#use try/except. This function should not throw exceptions to the calling function but return a non-zero code (1, 2, 3, ...) for different errors

#USE 'apply' and 'lambda', never loops, to perform operations on columns. data['date'] = data['date'].apply(lambda x: somefct(x))
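A minimal sketch of three of the steps above (stripping spaces, blanking badly formatted cells, blanking out-of-range cells), written with `apply` and `lambda` and returning an integer code on failure. The layout of the configuration DataFrame is not given in this post, so the column names `Column`, `Regex`, `Min` and `Max`, the function names, and the specific error codes are all assumptions made for illustration.

```python
import re

import pandas as pd


def strip_values(df):
    """Remove leading and trailing spaces from every string cell.

    df: input DataFrame.
    Returns the cleaned DataFrame, or error code 1 on any exception.
    """
    try:
        return df.apply(
            lambda col: col.apply(lambda v: v.strip() if isinstance(v, str) else v)
        )
    except Exception:
        return 1


def blank_bad_format(df, config_df):
    """Replace cells that do not fully match their column's regex with ''.

    config_df: assumed to hold one row per data column, with the
    hypothetical columns 'Column' and 'Regex'.
    Returns the cleaned DataFrame, or error code 2 on any exception.
    """
    try:
        rules = config_df.dropna(subset=["Regex"])
        patterns = dict(zip(rules["Column"], rules["Regex"]))
        return df.apply(
            lambda col: (
                col.apply(
                    lambda v: v if re.fullmatch(patterns[col.name], str(v)) else ""
                )
                if col.name in patterns
                else col
            )
        )
    except Exception:
        return 2


def blank_out_of_range(df, config_df):
    """Replace cells below 'Min' or above 'Max' (both assumed names) with ''.

    Columns without both limits in config_df are left untouched.
    Returns the cleaned DataFrame, or error code 3 on any exception.
    """
    try:
        limits = config_df.dropna(subset=["Min", "Max"]).set_index("Column")
        return df.apply(
            lambda col: (
                col.astype(object).where(
                    pd.to_numeric(col, errors="coerce").between(
                        limits.loc[col.name, "Min"], limits.loc[col.name, "Max"]
                    ),
                    "",
                )
                if col.name in limits.index
                else col
            )
        )
    except Exception:
        return 3
```

Each function takes the DataFrame first, so the caller can check the result with `isinstance(result, int)` before passing it to the next step.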

# when calling functions, use pipes (pdpipe, [login to view URL]) in the form:

pipeline = [login to view URL](modified_DF)

pipeline+= [login to view URL](modified_DF, config_DF)

pipeline+=[login to view URL](modified_DF, config_DF)


outDF = pipeline(df)
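The pdpipe calls in the snippet above are elided, so the following is not pdpipe code. It is a plain-Python stand-in that only illustrates the required behaviour: run the steps in order, check the result after each call, and return the integer as soon as a step reports one. The helper `run_steps`, the two example steps, the `Column` config column, and the code values are hypothetical.

```python
import pandas as pd


def check_columns(df, config_df):
    """Verify that df has exactly the column names listed in config_df.

    Order is ignored. 'Column' is an assumed name for the config column
    that lists the expected data columns.
    Returns df unchanged, code 2 on a mismatch, code 3 on any exception.
    """
    try:
        if set(df.columns) != set(config_df["Column"]):
            return 2
        return df
    except Exception:
        return 3


def strip_values(df):
    """Strip leading/trailing spaces from string cells; code 1 on failure."""
    try:
        return df.apply(
            lambda col: col.apply(lambda v: v.strip() if isinstance(v, str) else v)
        )
    except Exception:
        return 1


def run_steps(df, steps):
    """Apply each step in order and stop at the first integer error code.

    steps: list of (function, extra_args) pairs. Every function receives
    the current DataFrame first and returns a DataFrame or a non-zero int.
    Returns the final DataFrame, or the first error code encountered.
    """
    current = df
    # This loop runs over pipeline steps, not over columns or rows.
    for func, extra_args in steps:
        current = func(current, *extra_args)
        if isinstance(current, int):
            return current
    return current
```

A call such as `run_steps(modified_DF, [(check_columns, (config_DF,)), (strip_values, ())])` then yields either the cleaned DataFrame or the code of the first failing step.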

The job mainly consists of writing one main function (clean) and many small functions to clean the data.
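Two of the small functions, the 'ma' fill and the sigma filter, might look like the sketch below. It rests on assumptions: 'ma' is read here as the mean of the nearest available previous and next values, a single multiplier is applied to every column (the post does not say whether the conf file holds one per column), and the function names and the error code are hypothetical.

```python
import pandas as pd


def fill_moving_average(col):
    """Fill missing values in one column with the 'ma' rule.

    col: a Series; empty strings and non-numeric cells count as missing.
    A missing value becomes the mean of the previous and next available
    values. A missing first or last value takes the closest available one.
    Returns a numeric Series.
    """
    values = pd.to_numeric(col, errors="coerce")
    previous = values.ffill()  # nearest available value before each cell
    following = values.bfill()  # nearest available value after each cell
    # At the edges only one neighbour exists, so it is used for both sides.
    average = (previous.fillna(following) + following.fillna(previous)) / 2
    return values.fillna(average)


def sigma_filter(df, multiplier):
    """Drop rows that have at least one value outside mean +/- multiplier * std.

    df: DataFrame whose columns can all be read as numbers.
    multiplier: sigma multiplier (assumed to be the same for every column).
    Returns the filtered DataFrame, or error code 4 on any exception.
    """
    try:
        numeric = df.apply(lambda col: pd.to_numeric(col, errors="coerce"))
        # Boolean frame: True where the value lies inside the sigma band.
        in_sigma_df = numeric.apply(
            lambda col: (col - col.mean()).abs() <= multiplier * col.std()
        )
        return df[in_sigma_df.all(axis=1)]
    except Exception:
        return 4
```

A column would be filled with `data['x'] = fill_moving_average(data['x'])`. Cells that are still missing when `sigma_filter` runs compare as False, so their rows are dropped as well.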


Other details and required information will be discussed as needed.

All code should be documented (functions should have comments explaining all variables and return values, and so should the main part of the code).


Python 3.6+ should be used

Create an environment to run the code in.

All python code should have [login to view URL] using pipreqs

Needed skills: Python, pandas, numpy, SciPy, sklearn

Extra skills: pdpipe, fancyimpute

Skills: Python, Data Processing, NumPy


About the employer:
(3 reviews) Beirut, Lebanon

Project ID: #22637191

Awarded to:


Hi, Mr Alex [login to view URL] you are going well! I checked your project carefully. I have rich experience with Python development; Python is one of my top skills. If you give me all the data for the project, I will start working immediate …

$50 USD in 1 day
(0 reviews)

9 freelancers are bidding an average of $46 for this job


It's a piece of cake for me. Hi, sir. Thank you for your job posting. I have enough experience with Python, VBA, and data processing, so I am very confident I can satisfy you absolutely. Let's achieve success together. Thank …

$50 USD in 1 day
(16 reviews)

Data scientist. I have vast experience in an array of fields and I accept new challenges. I am available for hire to work on projects. Statistics, Machine Learning, Deep Learning, Computer Vision, Natural Language Proce …

$45 USD in 1 day
(4 reviews)

Dear sir! ⭐I am very interested in your project and I am excited. ⭐I read your project details carefully and I think that I am the best-fit developer for your project. ⭐I have rich experience with projects like yours, so I …

$60 USD in 1 day
(1 review)

Hi client. I have read and understood your request. I am interested in your project and can do it very well. I have wide experience in Python development and I look forward to hearing from you. I want to consult …

$45 USD in 7 days
(1 review)

Hi potential client, I have read and understood your job description. I am excited to work with you. Do contact me if you are interested in my service.

$45 USD in 7 days
(0 reviews)

Hi there, I have 6+ years of IT experience in data science, Python, and machine learning concepts. I have experience with the Pandas and NumPy libraries for data analytics. I have experience in Matplotlib, Seaborn, Plotly …

$50 USD in 7 days
(0 reviews)

Hi, I can do this.

$35 USD in 8 days
(0 reviews)

Hello, I have been working with Python and pandas for a long time now and have delivered various projects using them. I believe that I can successfully deliver your project.

$35 USD in 1 day
(0 reviews)