Wow! What an amazing project! I'm a computational linguist with over 20 years of experience in NLP and machine learning. I'm an expert at using Perl and Python to automatically parse and analyze data, and I'm accustomed to using statistical features such as word frequency and part-of-speech (POS) tags to classify terms.
I love the idea of building a formal language parser because it's complex and multi-faceted. Even Grammarly (a company I have great respect for) struggles with this. Your list of formal and informal markers is fairly short, and all the items except general slang are easily identifiable via regular expressions.
The first step, obviously, is to finalize the list of formal and informal markers. Of course, the list is entirely up to you, but off the top of my head I'd suggest adding these to the informal list: run-ons/overuse of conjunctions, clichés/overused expressions, imprecise verbs, "waffling" language ("might be"), "filler" language ("at any rate"), subjunctive mood, and opinions.
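To give a rough idea of how phrase-level markers like these could be matched, here is a minimal Python sketch. Only "might be" and "at any rate" come from the list above; the other phrases, the INFORMAL_MARKERS table and the function names are placeholders I made up for illustration, not a finished design.

```python
import re

# Seed lists for illustration only. "might be" and "at any rate" are the
# examples named above; the remaining phrases are assumed placeholders.
INFORMAL_MARKERS = {
    "waffling": ["might be", "could be", "sort of"],
    "filler": ["at any rate", "in any case"],
}

def phrase_pattern(phrase):
    """Escape each word and allow any run of whitespace between words."""
    return r"\s+".join(re.escape(word) for word in phrase.split())

def compile_markers(markers):
    """Build one case-insensitive, word-bounded regex per marker category."""
    compiled = {}
    for label, phrases in markers.items():
        alternation = "|".join(phrase_pattern(p) for p in phrases)
        compiled[label] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return compiled

def find_markers(text, compiled=None):
    """Return (category, matched text) pairs in order of appearance."""
    compiled = compiled or compile_markers(INFORMAL_MARKERS)
    hits = []
    for label, pattern in compiled.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group(0)))
    return [(label, matched) for _, label, matched in sorted(hits)]

if __name__ == "__main__":
    print(find_markers("At any rate, it might be fine."))
```

Keeping one compiled pattern per category means a new marker is just another entry in the table, so the list can keep changing while you finalize it.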
To identify slang terms, I'd start with WordNet. Yes, it's ancient, but it's Unix-based and its database ships as plain-text files, which means it's malleable, and it's better than starting from scratch. For example, any term whose definition contains "slang for" could be tossed in the slang list. Any verb + preposition could be tagged as a phrasal verb. And so on.
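Here is a rough Python sketch of those two heuristics. It assumes the definitions have already been exported from WordNet into a term-to-definition dictionary and that the text has already been POS-tagged with Penn Treebank-style tags (VB*, IN, RP). The function names and the sample definitions are made up for illustration; they are not real WordNet entries.

```python
def slang_terms(glosses):
    """Collect terms whose definition contains the phrase 'slang for'.

    `glosses` is an assumed {term: definition} mapping exported beforehand.
    """
    return sorted(
        term for term, gloss in glosses.items() if "slang for" in gloss.lower()
    )

def phrasal_verb_candidates(tagged):
    """Flag every verb immediately followed by a preposition or particle.

    `tagged` is a list of (word, tag) pairs using Penn Treebank-style tags.
    This over-generates on purpose; candidates still need manual review.
    """
    candidates = []
    for (word, tag), (next_word, next_tag) in zip(tagged, tagged[1:]):
        if tag.startswith("VB") and next_tag in {"IN", "RP"}:
            candidates.append(f"{word} {next_word}")
    return candidates

if __name__ == "__main__":
    # Made-up definitions, for illustration only.
    sample_glosses = {
        "dough": "slang for money",
        "loaf": "a shaped mass of baked bread",
    }
    sample_tagged = [
        ("She", "PRP"), ("gave", "VBD"), ("up", "RP"),
        ("on", "IN"), ("it", "PRP"),
    ]
    print(slang_terms(sample_glosses))
    print(phrasal_verb_candidates(sample_tagged))
```

Both functions are deliberately naive first passes: the verb + preposition rule will also catch ordinary prepositional phrases, so its output is a candidate list to filter, not a final tag.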
I'm out of space! So I'll just say that I'm new to Freelancer and I'll work cheap to build my rep. I'm available immediately and I promise I will knock your socks off. Thank you!