A Haskell program, with associated lexer and/or parser definition files, that uses a Haskell library such as Alex and/or Happy to parse a simple processing language. The language defines a set of operations that transform a source CSV file into a target CSV file.
Each operation in the processing language creates a column in the target file from columns in the source file, as follows:
1) Concatenation of columns from the source file, each followed by a delimiter, with the option of excluding columns whose values are blank:
Address := concat(True, ", ", [address1, address2, address3, county, eircode])
where True means exclude blank values.
So if address1 is "84 My Lovely Street", address2 is "", address3 is "LovelyVille", county is "Shropenshire" and eircode is " ", then Address is "84 My Lovely Street, LovelyVille, Shropenshire"; the address2 and eircode columns have not been included because they are blank.
If the first parameter is False instead of True, then Address is "84 My Lovely Street, , LovelyVille, Shropenshire,  " with every column included, blank or not.
Note that the final value is not followed by the delimiter.
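The rules above could be sketched in plain Haskell (before any Alex/Happy grammar work) roughly as follows; isBlank and concatCols are illustrative names, not part of any agreed design:

```haskell
import Data.Char (isSpace)
import Data.List (intercalate)

-- A value is blank if it is empty or contains only white space.
isBlank :: String -> Bool
isBlank = all isSpace

-- Join column values with the delimiter; when the first parameter is True,
-- blank values are dropped first. intercalate puts the delimiter only
-- between values, so the final value is not followed by the delimiter.
concatCols :: Bool -> String -> [String] -> String
concatCols excludeBlanks delim vals =
  intercalate delim (if excludeBlanks then filter (not . isBlank) vals else vals)
```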
2) Take a piece of a column from the source file, where the pieces are separated by a delimiter such as white space:
FirstName := takePiece (fullName, 1, " ")
LastName := takePiece (fullName, 2, " ")
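One possible sketch of this operation in Haskell, using only the base library; splitOn' and takePiece are illustrative names, and the behaviour for an out-of-range piece number (returning "") is an assumption, not something the spec states:

```haskell
import Data.List (isPrefixOf)

-- Split a string on a non-empty delimiter substring.
splitOn' :: String -> String -> [String]
splitOn' delim s = go s ""
  where
    go [] acc = [reverse acc]
    go str acc
      | delim `isPrefixOf` str = reverse acc : go (drop (length delim) str) ""
      | otherwise              = go (tail str) (head str : acc)

-- Take the 1-based nth piece of a value; "" if there is no such piece.
takePiece :: String -> Int -> String -> String
takePiece value n delim =
  case drop (n - 1) (splitOn' delim value) of
    (p:_) -> p
    []    -> ""
```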
3) A "case" statement that takes a list of conditions. Each condition checks that a column in the source file "matches" a string value using one of four logical operations: equal, not equal, begins, or contains; the result is a value taken from a column in the source file:
MainNumber := case when MobileNumber <> "" then MobileNumber; when LandlineNumber begins "08" then LandlineNumber; when HomeNumber begins "08" then HomeNumber; when LandlineNumber <> "" then LandlineNumber; else HomeNumber
where "" means blank, i.e. the empty string or just white space such as " ".
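Evaluating such a case statement for one row could be sketched as below; Op, holds and evalCase are illustrative names, and each condition is modelled as the tested column's value, an operation, a target string, and the result value. Treating a "" target as matching any blank value follows the note above:

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf, isInfixOf)

data Op = OpEq | OpNeq | OpBegins | OpContains

-- Does a column value match a target string under an operation?
-- A blank target ("" or white space) matches any blank value.
holds :: Op -> String -> String -> Bool
holds OpEq       v t | all isSpace t = all isSpace v
                     | otherwise     = v == t
holds OpNeq      v t = not (holds OpEq v t)
holds OpBegins   v t = t `isPrefixOf` v
holds OpContains v t = t `isInfixOf` v

-- Conditions are (column value, op, target, result value); the first
-- condition that holds wins, otherwise the else value is used.
evalCase :: [(String, Op, String, String)] -> String -> String
evalCase conds elseVal =
  case [r | (v, op, t, r) <- conds, holds op v t] of
    (r:_) -> r
    []    -> elseVal
```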
A) In all of the above, when matching column names, differences in case should be ignored, and white space and punctuation should also be ignored. So in operation 1 above, if the column headers in the source file were "Address1", "ADDRESS 2", "County" and "Eir-Code", they would match the address1, address2, county and eircode columns named in the operation, respectively.
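The matching rule in note A could be sketched as normalising both names and comparing the normalised forms; normalise and matchesColumn are illustrative names, and treating "punctuation and white space" as "every non-alphanumeric character" is an assumption:

```haskell
import Data.Char (isAlphaNum, toLower)

-- Lower-case a column name and drop white space and punctuation.
normalise :: String -> String
normalise = map toLower . filter isAlphaNum

-- Two column names match if their normalised forms are equal.
matchesColumn :: String -> String -> Bool
matchesColumn header wanted = normalise header == normalise wanted
```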
B) The CSV files will be quite small, fewer than 100 lines.
C) The program should do simple "all or nothing" error handling, e.g. "Processing file is invalid" and "Source CSV file is invalid".
D) Additional error handling (reporting specific processing-file and source-CSV errors in more detail) and additional processing operations (case statements where the value is not just a column but a concatenation of columns or a piece of a column) would be additional work that would be awarded to the original bidder if they do a good job on the original work.
E) In note C above, the source CSV file is invalid if it does not contain a column mentioned in the processing file.
F) In note C above, the source CSV file is also invalid if it has more than one column that could be interpreted as the same column mentioned in the processing file, e.g. in operation 1, if there were two columns, one called "address1" and another called "Address-1", then that would be a source file error.
G) All source CSV files will have a header row with the column names.
H) For all operations, blank means the empty string or a string that contains only white space. So in a row like
John,   ,Dublin
the value in the second column is blank.
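The source-file checks in these notes (a column mentioned in the processing file must match exactly one header) could be sketched as follows; lookupColumn is an illustrative name, and returning the matching header's index is an assumption about how the rest of the program would use it:

```haskell
import Data.Char (isAlphaNum, toLower)

-- Lower-case a column name and drop white space and punctuation.
normalise :: String -> String
normalise = map toLower . filter isAlphaNum

-- A wanted column is valid only if it matches exactly one header after
-- normalisation: zero matches means a missing column, two or more means
-- ambiguous duplicates; both make the source file invalid.
lookupColumn :: [String] -> String -> Either String Int
lookupColumn headers wanted =
  case [i | (i, h) <- zip [0 ..] headers, normalise h == normalise wanted] of
    [i] -> Right i
    _   -> Left "Source CSV file is invalid"
```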
Apologies to all who bid, but my company doesn't want to proceed with doing this in Haskell. Instead they want to use an in-house resource to do it with Lark in Python. My attempt to introduce Haskell into the company has failed this time.