Forget backtracking regular expressions; they don't have the same fancy mathematical properties as their non-backtracking counterparts. Stick to the non-backtracking operators and there are only five operations to remember: concatenation, alternation, parentheses, zero-or-one match, and the Kleene star (looping, *, zero or more matches). Concatenation is implicit.
1. Simpler to understand
2. Faster to execute
3. Weirdly mathy but in a cool way
4. The same across almost all regular expression engines
I give a primer at the end of this article. I taught them to my computer, and trust me - it's not very smart, but then I also taught it C in that article.
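To make the five operations above concrete, here is a minimal sketch in Python. The pattern names and the `matches` helper are made up for illustration. Note that Python's `re` module is itself a backtracking engine; the point is only that patterns restricted to these five operators are written the same way almost everywhere.

```python
import re

# Hypothetical examples, one per operation. Only concatenation (implicit),
# alternation (|), grouping with parentheses, optional (?), and
# Kleene star (*) are used.
PATTERNS = {
    "concatenation": "ab",      # 'a' followed by 'b'
    "alternation": "a|b",       # either 'a' or 'b'
    "grouping": "(ab|cd)e",     # 'ab' or 'cd', then 'e'
    "optional": "ab?",          # 'a', then zero or one 'b'
    "kleene_star": "ab*",       # 'a', then zero or more 'b'
}


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    samples = ["a", "ab", "abbb", "cde", "ba"]
    for name, pattern in PATTERNS.items():
        accepted = [s for s in samples if matches(pattern, s)]
        print(f"{name:14} {pattern:10} accepts {accepted}")
```

Everything else in the non-backtracking subset (character classes, one-or-more, counted repeats) can be seen as shorthand built from these five.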
I enjoyed the series right up until they were able to insert a little card into their existing engine that allowed them to travel anywhere immediately. I know that FTL is imaginary, but I'd think that "blink" might have required a completely different set of physics requiring a new engine or something. My "willing suspension of disbelief" became unwilling at that point.
Outside of a dog, a book is a man's best friend; inside of a dog, it's too dark to read. -- Groucho Marx
I would rather use whatever language I am working with to perform the parse. As you just stated... regex is technically another small programming language.
I am not sure if you know this... but you can take a regular expression and use the Ragel state machine compiler to convert it to C/C++, D, Go, Java, Ruby and even Objective-C. Interestingly... I do not see C# support.
The only regex(-like) syntax I felt somewhat comfortable working with was SNOBOL.
That was 30+ years ago. I first met it as a 200-line version of Eliza, the therapist, which fascinated me immensely. Obviously, that version never passed any Turing test, yet try to write anything comparable in 200 lines of any ordinary, algorithmic language! So I started playing around with it, just for fun - I never used it commercially.
Actually, not too long ago I picked up the source code of an old SNOBOL interpreter, hoping one day to port it. It is currently #43 on my project lists. Tuits are hard to find nowadays, especially round ones.
I like using regex for day-to-day, throwaway things. It's especially good for reformatting text. I'm certainly not intimidated by them.
That said, I don't think I would ever use one in product code with a long life-span. You must admit that regular expressions tend to be write-only, which is a cardinal sin against those who must maintain the code, including your future selves. Code written very concisely, and regular expressions may be the ultimate in concision, requires a lot of mental unpacking during maintenance. Unless you write a ridiculous amount of comments for the expression, it might not be worth it.
Which takes some unpacking as you say, but is certainly readable.
Or I can give you a page long document of requirements around JSON number parsing.
Personally, I can read that quite easily, but that's me.
Let me propose something - there is a meaningful subset of regular expressions which are easy to understand, and can fulfill most simple lexical specifications like the above, or say, like an email address, or a URL, or any number of small, structured text fragments.
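As an illustration of the kind of small lexical specification meant here, this is a sketch of a JSON number pattern written with that simple subset. It follows the number grammar in RFC 8259 and is not necessarily the expression discussed earlier in the thread; the names `JSON_NUMBER` and `is_json_number` are invented for the example.

```python
import re

# Sketch of the JSON number grammar (RFC 8259), built from simple operators:
#   -?                  optional minus sign
#   (0|[1-9][0-9]*)     integer part: a lone zero, or no leading zeros
#   (\.[0-9]+)?         optional fraction: dot plus at least one digit
#   ([eE][+-]?[0-9]+)?  optional exponent with optional sign
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True if `text` is, in its entirety, a valid JSON number."""
    return JSON_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["0", "-12.5e+3", "1e5", "01", "1.", ".5", "+1"]:
        print(f"{candidate!r:12} {is_json_number(candidate)}")
```

The comment block above the pattern is about as long as the pattern itself, which is roughly the trade-off being argued over in this thread.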
To my mind that regex would be okay. It's the thousands of characters, wall-of-text abominations that I object to. I know, that's an example of poor use of regex, but it's the kind of thing you find. Inexperienced folks start using it, and all of a sudden it becomes their favorite toy.