Regular Expressions

Lorenzo Mele - greenkey

What are they?

a sequence of characters that define a search pattern

[wikipedia]

/abc/
abc
abcde
abde
bcde
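
A minimal sketch of the example above in Python (the talk itself only names grep, sed and awk, so the `re` module is our choice here). It checks which of the four strings contain the pattern `abc`.

```python
import re

pattern = re.compile(r"abc")
candidates = ["abc", "abcde", "abde", "bcde"]

# search() looks for the pattern anywhere in the string,
# the same way grep looks for it anywhere in a line.
matches = [s for s in candidates if pattern.search(s)]
print(matches)  # ['abc', 'abcde']
```

Only the first two strings contain the three characters `a`, `b`, `c` in a row.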

Why should I use them?

they're very powerful when you need to quickly find strings in big files

known patterns to analyze

http://xkcd.com/208/

How to use them today?

almost every technology and programming language

grep/egrep

sed

awk/gawk

Let's start!

«Here's a ton of log files, you have to find all the error lines.»
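
A possible first step, sketched in Python. The log lines are made up for illustration (the real files are not part of these slides), and matching "error" in any letter case is an assumption.

```python
import re

# Hypothetical log lines, invented for this sketch.
log_lines = [
    "2016-01-01 10:00:01 INFO service started",
    "2016-01-01 10:00:02 Python error 404: page not found",
    "2016-01-01 10:00:03 Java Error 500: null pointer",
]

ERROR = re.compile(r"error", re.IGNORECASE)


def find_error_lines(lines):
    """Keep the lines that contain 'error', roughly like `grep -i error`."""
    return [line for line in lines if ERROR.search(line)]


error_lines = find_error_lines(log_lines)
print(len(error_lines))  # 2
```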

«That's fine. Now we need just the Python errors...»

«... watch out: sometimes there's a separator.»

«When I say Python, I mean Python and Jython, obviously.»
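
The last three requests can be sketched with one small pattern. We assume the separator sits between the language name and the word "error" and is a single space, dash, underscore or colon; the slides do not say which. The sample lines are invented.

```python
import re

# [PJ]ython  -> "Python" or "Jython"
# [-_: ]?    -> an optional one-character separator (assumed set)
PYTHON_ERROR = re.compile(r"[PJ]ython[-_: ]?error")

# Hypothetical log lines, invented for this sketch.
log_lines = [
    "Python error 404: page not found",
    "Jython-error 500: broken pipe",
    "Java error 500: null pointer",
    "Python warning: slow request",
]

python_errors = [line for line in log_lines if PYTHON_ERROR.search(line)]
print(python_errors)
# ['Python error 404: page not found', 'Jython-error 500: broken pipe']
```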

«Real errors have error codes...»

«... sometimes they begin with a hash...»

«... and they should be of exactly three digits.»
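
A sketch of the error-code rule: an optional hash, then exactly three digits. We assume the code comes right after the word "error" and a space. The lookahead `(?!\d)` is Python syntax; plain `grep -E` has no lookahead, so there you would need another way to say "no fourth digit".

```python
import re

# #?      -> optional hash
# \d{3}   -> three digits
# (?!\d)  -> not followed by another digit, so "1234" is rejected
ERROR_CODE = re.compile(r"error #?\d{3}(?!\d)")

# Hypothetical samples, invented for this sketch.
samples = [
    "Python error 404",
    "Python error #500",
    "Python error 42",
    "Python error #12345",
]

with_code = [s for s in samples if ERROR_CODE.search(s)]
print(with_code)  # ['Python error 404', 'Python error #500']
```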

«Are you kidding me? Some of those lines are NOT errors!»

«This is funny... kind of. Some error lines end with a space and that makes our system explode, don't show them.»
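
One way to drop the lines that end with a space, as a sketch on invented data. The `$` anchor marks the end of the line, so ` $` means "a space as the very last character".

```python
import re

TRAILING_SPACE = re.compile(r" $")

# Hypothetical lines; the second one ends with a space.
lines = [
    "Python error 404",
    "Python error #500 ",
]

safe_lines = [line for line in lines if not TRAILING_SPACE.search(line)]
print(safe_lines)  # ['Python error 404']
```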

«The word "error" scares me, can you replace it with another one?»

«I know they're Python errors, just remove the useless word.»
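
Both requests are substitutions, the kind of job sed does. A sketch with Python's `re.sub`: the replacement word "issue" is our pick, and we assume the "useless word" is the leading Python/Jython plus its separator.

```python
import re


def soften(line):
    """Replace the scary word, then strip the leading language name."""
    line = re.sub(r"error", "issue", line)
    # ^ anchors the match at the start of the line.
    return re.sub(r"^[PJ]ython[-_: ]?", "", line)


print(soften("Python error #404"))  # issue #404
print(soften("Jython-error 500"))   # issue 500
```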

«The output is a mess, add the missing hash.»
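
A sketch of adding the missing hash, assuming every standalone three-digit number in the output is an error code. The lookbehind `(?<!#)` skips codes that already have their hash.

```python
import re

# (?<!#)    -> not already preceded by a hash
# \b\d{3}\b -> exactly three digits standing on their own
MISSING_HASH = re.compile(r"(?<!#)\b(\d{3})\b")


def add_hash(line):
    return MISSING_HASH.sub(r"#\1", line)


print(add_hash("issue 404"))   # issue #404
print(add_hash("issue #500"))  # issue #500
```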

«That's great!»

What we learnt

  • Regexps are powerful
  • Regexps are a real mess

Some people, when confronted with a problem, think

“I know, I'll use regular expressions.”

Now they have two problems.

Some advice

  • Regexes are spicy
  • Try it with grep or get a regex tool
  • Do not try to do everything in one uber-regex
    (just like we are doing now)

https://blog.codinghorror.com/regular-expressions-now-you-have-two-problems/
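
The "no uber-regex" advice, sketched in Python: several small, named patterns combined with ordinary code instead of one giant expression. The patterns and sample lines are assumptions made for this sketch, not the ones from the live demo.

```python
import re

IS_ERROR = re.compile(r"error", re.IGNORECASE)
IS_PYTHON = re.compile(r"[PJ]ython")
HAS_CODE = re.compile(r"(?<!\d)\d{3}(?!\d)")  # exactly three digits
TRAILING_SPACE = re.compile(r" $")


def wanted(line):
    """Each rule is one small regex that can be read and tested alone."""
    required = (IS_ERROR, IS_PYTHON, HAS_CODE)
    return (all(p.search(line) for p in required)
            and not TRAILING_SPACE.search(line))


# Hypothetical lines, invented for this sketch.
lines = [
    "Python error #404",
    "Jython error 500",
    "Java error 500",
    "Python error 42",
    "Python error 404 ",
]
print([line for line in lines if wanted(line)])
# ['Python error #404', 'Jython error 500']
```

When a new request comes in, you add or change one small pattern and leave the others alone.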

Q&A

Can I use regex to parse HTML/XML?

NO

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
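
What to use instead: a real parser. A minimal sketch with Python's standard `html.parser`, collecting open tags and skipping self-closed ones such as `<br/>`. The class name and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class OpenTagCollector(HTMLParser):
    """Collect the names of opening tags, ignoring self-closed tags."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Called for tags written as <tag/>; we deliberately skip them.
        pass


collector = OpenTagCollector()
collector.feed("<div><p>Hello<br/>world</p></div>")
print(collector.open_tags)  # ['div', 'p']
```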
