I have a .csv file that my company provides to me on a weekly basis. The first 2 columns in this .csv are the most important to me. There can be anywhere from 500 to 70,000 records at one time.
Column one is always domains.
Column two is always email addresses.
I need scripts that will remove information that is not important to me.
First: There is always a domain in the first column, but the second column is sometimes empty. When this happens, delete the row from the .csv and write the domain from column 1 to a new file.
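To make the first step concrete, here is a rough Python sketch of what I have in mind. It is only an illustration: the function names and file paths are placeholders, and it assumes the .csv has no header row and that a cell containing only spaces counts as empty.

```python
import csv

def split_missing_emails(rows):
    """Separate rows that have an email from rows whose second column is empty."""
    kept, missing_domains = [], []
    for row in rows:
        if not row:
            continue  # skip completely blank lines
        # Assumption: a missing or whitespace-only second column counts as empty.
        email = row[1].strip() if len(row) > 1 else ""
        if email:
            kept.append(row)
        else:
            missing_domains.append(row[0])
    return kept, missing_domains

def run_step_one(in_path, kept_path, missing_path):
    """Read the weekly file, write the cleaned rows and the email-less domains."""
    with open(in_path, newline="", encoding="utf-8") as f:
        kept, missing = split_missing_emails(csv.reader(f))
    with open(kept_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(kept)
    with open(missing_path, "w", encoding="utf-8") as f:
        f.writelines(domain + "\n" for domain in missing)
```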
Second: I will provide a list of words in a file, and I will need the script to compare this list of words to the second column in the .csv (the email address column). If any of these words are found in any of the email addresses, these records should be removed from the .csv. The script should save all of the deleted rows in a new file.
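The second step could work roughly like the sketch below. The names are placeholders, and two things are assumptions on my part: the word file has one word per line, and a word counts as "found" if it appears anywhere in the email cell, ignoring upper/lower case.

```python
def load_words(path):
    """Read the word list. Assumption: one word per line, blank lines ignored."""
    with open(path, encoding="utf-8") as f:
        return [line.strip().lower() for line in f if line.strip()]

def filter_by_words(rows, words):
    """Split rows into (kept, removed) based on words found in the email column."""
    words = [w.lower() for w in words if w]
    kept, removed = [], []
    for row in rows:
        email = row[1].lower() if len(row) > 1 else ""
        # Assumption: case-insensitive substring match anywhere in the cell.
        if any(word in email for word in words):
            removed.append(row)
        else:
            kept.append(row)
    return kept, removed
```

The removed rows can then be written to their own .csv so nothing is lost.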
Third: Often, the second column in the .csv (email addresses) will have more than one email address. I'm only interested in the first email address, so the script should remove all of the extra email addresses.
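For the third step, something along these lines is what I am picturing. This is a sketch only: the function names are made up, and it assumes that multiple addresses in one cell are separated by commas, semicolons, or spaces, which would need to be checked against the real file.

```python
import re

def first_email(cell):
    """Return the first address in a cell, or "" if the cell is empty."""
    # Assumption: addresses are separated by commas, semicolons, or whitespace.
    parts = re.split(r"[,;\s]+", cell.strip())
    return next((part for part in parts if part), "")

def keep_first_email(rows):
    """Return copies of the rows with only the first email left in column two."""
    cleaned = []
    for row in rows:
        if len(row) > 1:
            cleaned.append([row[0], first_email(row[1])] + row[2:])
        else:
            cleaned.append(list(row))
    return cleaned
```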
This can be one script that does all of this, or 3 different scripts. Either way, it's pretty simple and shouldn't take long to write.
Looking forward to your bids, thanks!