We have data from a household survey and want to identify the major findings with regard to five questions of the type "How comfortable would you feel doing xxxx?", each offering between two and five answer options. The data set should be filtered by country, and we want data mining across a range of variables (age, gender, settlement type, professed subjective well-being, education, social networks, and more) to unearth striking findings. Specifically, we want bivariate correlations between each of those five variables and all the others, plus a large multivariate regression.
Note that the survey, which was conducted to international standards across three countries with more than 6000 respondents, used a complex survey design, which is documented in detail. (Multi-stage cluster sample on nine geographical units: the capital, urban-Northeast, urban-Northwest, urban-Southeast, urban-Southwest, rural-Northeast, rural-Northwest, rural-Southeast and rural-Southwest.) The specialist therefore needs to know how to handle weights, design effects (DEFF), and the related survey-design adjustments. Quality is more important than speed.
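For orientation, one ingredient of DEFF can be illustrated with Kish's approximation for the design effect due to unequal weighting, deff ≈ n·Σw² / (Σw)². The Python sketch below is illustrative only: the function names and the toy weights are hypothetical, it ignores clustering and stratification entirely, and the actual analysis should still be run in survey-aware software.

```python
def kish_deff(weights):
    """Kish's approximate design effect from unequal weighting alone.

    Covers only the loss of precision caused by unequal weights;
    clustering and stratification are NOT accounted for here.
    """
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size divided by the Kish design effect."""
    return len(weights) / kish_deff(weights)


# Hypothetical toy weights, not taken from the survey.
toy_weights = [1.0, 3.0]
print(kish_deff(toy_weights))
print(effective_sample_size(toy_weights))
```

Equal weights give a design effect of 1; the more the weights vary, the smaller the effective sample size becomes relative to the number of respondents.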
We are looking for a professional, and follow-up work is possible. Adding tentative findings would be desirable, but their tentative nature should be clearly flagged.
IMPORTANT TECHNICAL NOTE: you need Stata, R, or the very newest version of SPSS. You absolutely need to include not only the weights in the regressions but the full survey design settings. Otherwise the standard errors will typically be understated, and you risk concluding that independent variables are significant predictors of the dependent variable when they are in fact not.
More commonly used programs, like Excel or older versions of SPSS, CANNOT be used.
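To show why weights alone are not enough, here is a minimal sketch of a Taylor-linearized standard error for a weighted mean under a stratified cluster design, compared with a standard error that treats every respondent as independent. It is written in Python for illustration only; the function names, the with-replacement PSU approximation, and the toy data are assumptions and are not taken from the survey documentation.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean(y, w):
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)


def linearized_se(y, w, strata, psu):
    """Taylor-linearized SE of the weighted mean.

    Assumes PSUs are sampled with replacement within strata
    (a common approximation) and at least two PSUs per stratum.
    """
    total_w = sum(w)
    mean = weighted_mean(y, w)
    # Sum the linearized scores w_i * (y_i - mean) / W within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for yi, wi, h, c in zip(y, w, strata, psu):
        psu_totals[h][c] += wi * (yi - mean) / total_w
    variance = 0.0
    for h, clusters in psu_totals.items():
        totals = list(clusters.values())
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} has fewer than two PSUs")
        centre = sum(totals) / n_h
        variance += n_h / (n_h - 1) * sum((t - centre) ** 2 for t in totals)
    return sqrt(variance)


def naive_se(y, w):
    """Same estimator with weights only: one stratum, each respondent its own PSU."""
    n = len(y)
    return linearized_se(y, w, [0] * n, list(range(n)))


# Hypothetical toy data: two strata, two clusters each, answers identical within a cluster.
y = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
w = [1.0] * 12
strata = ["urban"] * 6 + ["rural"] * 6
psu = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]

design_se = linearized_se(y, w, strata, psu)
simple_se = naive_se(y, w)
print(design_se, simple_se)
```

In this toy example the answers are perfectly homogeneous within clusters, so the design-based standard error comes out well above the one that ignores the design. That gap is what produces spurious significance when only weights are applied.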
Please send evidence that you have done precisely this type of work, and that you can write up the results in an accessible and cogent style.