I am looking for somebody who has existing experience and code for parsing resume data to XML. We don't want the freelancer to start from scratch. Please tell us about your previous work in this area.
The code has to be in C# and the required output is XML.
I do not need any additional work such as e-mail piping or saving to a database.
Details are below:
This parser will be used to parse thousands of UNSTRUCTURED resumes in HTML, Word (doc, docx), RTF, text and PDF formats.
Input: Resume files in the following formats: Word, PDF, text, TIF, HTML
Output: XML files of the resume, in which every word from the resume is placed in the correct XML tag.
The parser needs to be able to extract the following data from the resumes:
. first name
. last name
. zip code
. citizenship/immigration status
. email address
. resume job category
. resume title
. career objective or background
. years of professional experience
. employment history & dates
. education history
. licenses and certifications
. skills keywords
. phone number
. preferred job location
There may be more fields, which will be discussed in chat.
The output of the parser should be an XML-tagged file, one XML file for each parsed resume. The output file name is to be the same as the input file name, with the extension changing from [url removed, login to view] to [url removed, login to view]
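To make the one-file-per-resume output concrete, here is a minimal C# sketch. It is only an illustration under stated assumptions: the class name `ResumeXmlWriter`, the tag names (`Resume`, `FirstName`, and so on) and the `.xml` output extension are hypothetical, since the actual schema and extensions are not specified above.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical helper: names and tag layout are assumptions, not an agreed schema.
public static class ResumeXmlWriter
{
    // Keeps the input file name and swaps only the extension (".xml" is assumed).
    public static string OutputPathFor(string inputPath)
    {
        return Path.ChangeExtension(inputPath, ".xml");
    }

    // Builds one XML document for one parsed resume; only four fields shown.
    public static XDocument Build(string firstName, string lastName,
                                  string email, string zipCode)
    {
        return new XDocument(
            new XElement("Resume",
                new XElement("FirstName", firstName),
                new XElement("LastName", lastName),
                new XElement("Email", email),
                new XElement("ZipCode", zipCode)));
    }

    // Writes the document next to the input file, one XML file per resume.
    public static void Save(string inputPath, XDocument doc)
    {
        doc.Save(OutputPathFor(inputPath));
    }
}
```

For example, `ResumeXmlWriter.OutputPathFor("jane_doe.docx")` gives `jane_doe.xml`; the remaining fields from the list would be added as further elements in the same way.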
We will supply a sample set of resumes, as many as you need to be successful.
Resumes are unstructured, so formats and content vary widely. The ability to score the parsing performance is a must. E.g.: the phrase "2 Years" can be written in many ways, such as "2 Year, 2 Yr, 2 Yrs, 2Y, Two Year, Two Years, Two Yrs" etc. The system should identify all of these possibilities. In fact, we should be able to keep adding matching keywords so that the parser keeps getting more intelligent.
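The "2 Years" requirement could be approached roughly as in the C# sketch below. It is an assumption about one possible design, not a finished parser: the class `DurationMatcher` and its methods are hypothetical names, and the keyword lists are small starter sets that can be extended at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical matcher: recognises "<number> <year-unit>" in its many spellings.
public sealed class DurationMatcher
{
    // Spellings that mean "year"; comparison ignores case.
    private readonly HashSet<string> _yearUnits =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "year", "years", "yr", "yrs", "y" };

    // Number words mapped to their values; a small starter set.
    private readonly Dictionary<string, int> _numberWords =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        { { "one", 1 }, { "two", 2 }, { "three", 3 }, { "four", 4 }, { "five", 5 } };

    // Splits text into digit runs and letter runs, so "2Y" becomes "2" and "Y".
    private static readonly Regex TokenPattern = new Regex(@"\d+|[A-Za-z]+");

    // New keywords can be added later without changing the matching code.
    public void AddYearUnit(string unit) { _yearUnits.Add(unit); }
    public void AddNumberWord(string word, int value) { _numberWords[word] = value; }

    // Returns true and the year count for the first number followed by a year unit.
    public bool TryParseYears(string text, out int years)
    {
        MatchCollection tokens = TokenPattern.Matches(text);
        for (int i = 0; i + 1 < tokens.Count; i++)
        {
            if (_yearUnits.Contains(tokens[i + 1].Value)
                && TryNumber(tokens[i].Value, out years))
            {
                return true;
            }
        }
        years = 0;
        return false;
    }

    // Accepts either digits ("2") or a known number word ("Two").
    private bool TryNumber(string token, out int value)
    {
        return int.TryParse(token, out value)
            || _numberWords.TryGetValue(token, out value);
    }
}
```

With the starter lists, all seven spellings quoted above resolve to 2. A spelling the lists do not know yet is rejected until someone adds it, for example with `AddYearUnit("annum")`.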
It would be helpful to be able to look at a parsing report (i.e. the application should produce a log file) that indicates which resumes the parser thinks it did poorly on, so we can manually revisit the parsed resumes that have the highest probability of containing parsing errors.
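One simple way such a report could work is sketched below in C#. The scoring rule is an assumption chosen for illustration (the share of expected fields that came back non-empty), and `ParseReport`, `Score` and `BuildLines` are hypothetical names. A real parser would probably combine this with per-field confidence.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical report helper: the scoring rule is an assumption, not a requirement.
public static class ParseReport
{
    // Score = fraction of expected fields that were extracted with a non-empty value.
    public static double Score(IReadOnlyDictionary<string, string> extracted,
                               IReadOnlyCollection<string> expectedFields)
    {
        if (expectedFields.Count == 0) return 1.0;
        int found = 0;
        foreach (string field in expectedFields)
        {
            string value;
            if (extracted.TryGetValue(field, out value)
                && !string.IsNullOrWhiteSpace(value))
            {
                found++;
            }
        }
        return (double)found / expectedFields.Count;
    }

    // Produces log lines "score<TAB>file", lowest score first, for manual review.
    public static List<string> BuildLines(
        IEnumerable<KeyValuePair<string, double>> scoresByFile)
    {
        return scoresByFile
            .OrderBy(p => p.Value)
            .Select(p => p.Value.ToString("0.00", CultureInfo.InvariantCulture)
                         + "\t" + p.Key)
            .ToList();
    }
}
```

The lines from `BuildLines` can be written to the log file with `System.IO.File.WriteAllLines`, so the resumes most likely to need a manual check appear at the top.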
We need to be able to integrate the web application parser with our existing C# based application.
Ideally, a person who has done this job before can give us an immediate solution. We are not looking to do any research, nor do we have the time for it.