Attribute-based design graphs (ABDG)

This project aims to impute the missing values in given datasets. You have to write code in the programming
language of your choice (e.g., MATLAB, Python, R, FORTRAN, C, or C++) that reads Excel data (step 1),
identifies the missing data (step 2), imputes the missing values using the technique given in the reference
paper for this project (step 3), and finally returns the imputed data and compares it with the complete data
to measure the accuracy and reliability of your results (step 4).

In step 1, do not limit your code to a specific data size or dimension: it must be able to read or load
datasets of different sizes and dimensions. You will receive datasets with numerical and/or categorical
attributes in XLS and/or CSV format, which I will upload later.
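
As a rough illustration of step 1, the sketch below loads a table of any size from a CSV or Excel file. It is only one possible approach under stated assumptions: the first row is taken to be the header, the tokens in MISSING_TOKENS are guesses at how missing cells might be encoded in the datasets, and the names load_table and parse_cell are hypothetical. The Excel branch relies on pandas (with an Excel engine such as openpyxl or xlrd installed); the CSV branch uses only the standard library.

```python
import csv
from pathlib import Path

# Assumption: these tokens mark a missing cell; adjust them to the real datasets.
MISSING_TOKENS = {"", "NA", "NaN", "nan", "?", "NULL"}


def parse_cell(raw):
    """Return None for a missing cell, a float for numeric text, else the stripped string."""
    if raw is None:
        return None
    if isinstance(raw, float) and raw != raw:  # NaN produced by a spreadsheet reader
        return None
    text = str(raw).strip()
    if text in MISSING_TOKENS:
        return None
    try:
        return float(text)
    except ValueError:
        return text  # treated as a categorical value


def load_table(path):
    """Load a CSV or Excel file of any shape and return (header, rows)."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with open(path, newline="", encoding="utf-8") as handle:
            raw_rows = list(csv.reader(handle))
    elif suffix in {".xls", ".xlsx"}:
        import pandas as pd  # optional dependency, needed only for Excel input

        frame = pd.read_excel(path, header=None, dtype=object)
        raw_rows = frame.values.tolist()
    else:
        raise ValueError(f"unsupported file type: {suffix}")
    if not raw_rows:
        return [], []
    header = [str(name).strip() for name in raw_rows[0]]  # assumption: first row is the header
    rows = [[parse_cell(cell) for cell in row] for row in raw_rows[1:]]
    return header, rows
```

Representing every missing cell as None keeps numerical and categorical attributes in one structure, so the later steps do not need to know how the file encoded its gaps.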

In step 2, you determine the number and the location of the missing values. For instance, if you return the
indices of the missing values, you can identify the missing-data pattern (univariate, monotone, or arbitrary).
This not only helps you handle the next step successfully but also earns you more points.
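
One way to approach step 2 is sketched below. It assumes the table is a list of rows in which a missing cell is None, and it uses working definitions that may differ from those in the reference paper: "univariate" means all missing values fall in a single attribute, and "monotone" means the sets of rows missing in each attribute are nested, so the attributes can be reordered into a staircase. The function names are hypothetical.

```python
def missing_indices(rows):
    """Return the (row, column) position of every missing cell (None)."""
    return [
        (r, c)
        for r, row in enumerate(rows)
        for c, cell in enumerate(row)
        if cell is None
    ]


def missing_pattern(rows):
    """Classify the pattern as 'complete', 'univariate', 'monotone', or 'arbitrary'."""
    rows_missing_by_column = {}
    for r, c in missing_indices(rows):
        rows_missing_by_column.setdefault(c, set()).add(r)
    if not rows_missing_by_column:
        return "complete"
    if len(rows_missing_by_column) == 1:
        return "univariate"
    # Monotone: after sorting by size, each set of missing rows must contain the previous one.
    ordered = sorted(rows_missing_by_column.values(), key=len)
    nested = all(small <= large for small, large in zip(ordered, ordered[1:]))
    return "monotone" if nested else "arbitrary"
```

The number of missing values is then simply len(missing_indices(rows)).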

In step 3, you have to read the reference paper for the proposed method, understand the algorithm, and write
code that imputes the missing data (by single or multiple imputation) based on the given approach.

In step 4, your code has to return the imputed values, so that you can compare them with the original
complete data and compute the error (NRMS). You can generate diagrams, automatically or manually, to present
your results and compare them with the original complete datasets.
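
The error computation in step 4 might look like the sketch below. It assumes that NRMS means the normalized root-mean-square error in its common form, the Frobenius norm of the difference between imputed and complete data divided by the Frobenius norm of the complete data; if the reference paper defines it differently, that definition should be used instead. Categorical (string) cells are skipped here because this measure only applies to numerical attributes, and the function name nrms is hypothetical.

```python
import math


def nrms(imputed, complete):
    """Return ||imputed - complete|| / ||complete|| over the numeric cells of two tables."""
    if len(imputed) != len(complete):
        raise ValueError("tables must have the same number of rows")
    squared_error = 0.0
    squared_truth = 0.0
    for imputed_row, complete_row in zip(imputed, complete):
        if len(imputed_row) != len(complete_row):
            raise ValueError("rows must have the same number of columns")
        for estimate, truth in zip(imputed_row, complete_row):
            if isinstance(truth, str):
                continue  # categorical cell: not covered by this measure
            squared_error += (estimate - truth) ** 2
            squared_truth += truth ** 2
    if squared_truth == 0.0:
        raise ValueError("complete data has zero norm; NRMS is undefined")
    return math.sqrt(squared_error / squared_truth)
```

Cells that were never missing contribute zero to the numerator, so passing the whole tables gives the same numerator as comparing only the imputed positions.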

Every step has its own credit, and both successful and unsuccessful projects will be taken into account.
However, I expect clear and (to some extent) commented code that we can execute easily, results that we can
see and check (preferably by means of a visualization technique of your choice), and trustworthy, reliable
results.

Skills: C Programming, C++ Programming, Excel, Python, Software Architecture

