Regular Expressions 101

Perl, is that you!?

A bit of theory

Formal languages

\begin{aligned} L_\emptyset &= \emptyset \\ L_\varepsilon &= \{\varepsilon\} \\ L_a &= \{a\} \\ L_{RS} &= \{rs \mid r \in L_R, s \in L_S\} \\ L_{R|S} &= L_R \cup L_S \\ L_{R*} &= L_\varepsilon \cup L_R \cup L_{RR} \cup L_{RRR} \cup \dots \end{aligned}

Definition
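The definition can be read as a (very slow) membership test. The sketch below is only an illustration, not how real engines work: the node constructors (`empty`, `eps`, `chr`, `seq`, `alt`, `star`) and the `inLanguage` function are hypothetical names made up for this example, and the star case assumes the usual convention that zero repetitions give the empty string.

```javascript
// Hypothetical AST constructors, one per case of the definition.
const empty = { type: 'empty' };               // L_∅
const eps = { type: 'eps' };                   // L_ε
const chr = (c) => ({ type: 'chr', c });       // L_a
const seq = (r, s) => ({ type: 'seq', r, s }); // L_RS
const alt = (r, s) => ({ type: 'alt', r, s }); // L_R|S
const star = (r) => ({ type: 'star', r });     // L_R*

// Returns true when `text` belongs to the language of `node`.
function inLanguage(node, text) {
  switch (node.type) {
    case 'empty':
      return false;
    case 'eps':
      return text === '';
    case 'chr':
      return text === node.c;
    case 'alt':
      return inLanguage(node.r, text) || inLanguage(node.s, text);
    case 'seq':
      // Try every split text = r + s.
      for (let i = 0; i <= text.length; i++) {
        if (inLanguage(node.r, text.slice(0, i)) && inLanguage(node.s, text.slice(i))) {
          return true;
        }
      }
      return false;
    case 'star':
      // Zero repetitions match the empty string.
      if (text === '') return true;
      // Otherwise peel off a non-empty prefix from L_R and recurse on the rest.
      for (let i = 1; i <= text.length; i++) {
        if (inLanguage(node.r, text.slice(0, i)) && inLanguage(node, text.slice(i))) {
          return true;
        }
      }
      return false;
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// (a|b)*c
const abStarC = seq(star(alt(chr('a'), chr('b'))), chr('c'));
console.log(inLanguage(abStarC, 'abac'));
console.log(inLanguage(abStarC, 'abca'));
```

Trying every split makes this exponential in the worst case, which is why engines compile the expression to an automaton instead.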

Automaton           | Memory        | Time
Deterministic       | O(1)          | O(#input)
Nondeterministic    | O(???)        | O(#input * #regex)
Nondeterministic #2 | O(2 ^ #regex) | O(#input)

Complexity
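To make the first "Nondeterministic" row more concrete, here is a minimal sketch of simulating a nondeterministic automaton by tracking the set of states it could be in. The automaton itself is hand-built for `(a|b)*abb` and is an assumption of this example, as are the names `nfa` and `nfaMatches`. Each input character touches every live state at most once, which is where the `#input * #regex` time bound comes from.

```javascript
// Hypothetical hand-built NFA for (a|b)*abb with states 0..3.
const nfa = {
  start: 0,
  accept: new Set([3]),
  // transitions[state][char] lists all possible next states.
  transitions: {
    0: { a: [0, 1], b: [0] },
    1: { b: [2] },
    2: { b: [3] },
    3: {},
  },
};

// Returns true when the whole input is accepted by the NFA.
function nfaMatches(automaton, input) {
  let current = new Set([automaton.start]);
  for (const ch of input) {
    const next = new Set();
    for (const state of current) {
      // A missing entry means the automaton has no move on this character.
      for (const target of automaton.transitions[state][ch] ?? []) {
        next.add(target);
      }
    }
    current = next;
  }
  for (const state of current) {
    if (automaton.accept.has(state)) return true;
  }
  return false;
}

console.log(nfaMatches(nfa, 'aababb'));
console.log(nfaMatches(nfa, 'abab'));
```

No backtracking is involved: the input is read exactly once, left to right.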

Takeaway

A match can be found in constant memory and linear time, but the compiled expression (the automaton) might be exponential in the size of the pattern. A more compact representation costs more time and memory during matching.

POSIX

  • Basic Regular Expressions
  • Extended Regular Expressions
  • Simple Regular Expressions (deprecated)

Standards

Used in most shell programs, like grep or sed.

Not very interesting.

More information

Perl Compatible Regular Expressions

  • High expressiveness
  • JIT compiler support
  • Extended character classes
  • Lazy matching
  • Multiline matching
  • Named subpatterns
  • Backreferences
  • Look-ahead and look-behinds
  • Atomic grouping
  • Escape sequences for zero-width assertions
  • Recursive patterns
  • Generic callouts
  • Comments

Comparison of regex engines

What about JavaScript?

The form and functionality of regular expressions is modelled after the regular expression facility in the Perl 5 programming language.
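Several of the Perl-style features are available in JavaScript as well. The snippet below is a small, non-exhaustive sketch; the sample strings and variable names are made up for illustration. Named groups and look-behind were added in ES2018, so older runtimes may reject those two patterns.

```javascript
// Lazy matching: `.*?` stops at the first closing tag, not the last.
const lazy = '<b>one</b><b>two</b>'.match(/<b>(.*?)<\/b>/)[1];

// Named subpatterns (ES2018): captures are exposed under `groups`.
const date = '2019-04-12'.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/);

// Backreference: `\1` must repeat whatever group 1 captured.
const hasDoubledWord = /\b(\w+) \1\b/.test('it is is broken');

// Look-behind (ES2018) and look-ahead: assert context without consuming it.
const price = 'cost: $42 USD'.match(/(?<=\$)\d+(?= USD)/)[0];

// Multiline matching: with the `m` flag, `^` also matches after a line break.
const firstChars = 'a1\nb2\nc3'.match(/^\w/gm);

console.log(lazy, date.groups.year, hasDoubledWord, price, firstChars);
```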

Examples

Task 1

Reorder arguments of setTimeout.

// In.
setTimeout(() => tree.update(pending), 100);
setTimeout(() => toast({text: 'Hello'}), 200);
setTimeout(() => refresh(() => new Date(), 100), 999);
notReallysetTimeout(() => alert('Tricky!'), 1337);


// Out.
setTimeout(100, () => tree.update(pending));
setTimeout(200, () => toast({text: 'Hello'}));
setTimeout(999, () => refresh(() => new Date(), 100));
notReallysetTimeout(() => alert('Tricky!'), 1337);

// Solution, built up step by step.
text.replace(
  ???,
  ???
);

text.replace(
  ???,
  'setTimeout($2, $1);'
);

text.replace(
  /\bsetTimeout\((.*),\s*(.+)\);/gm,
  'setTimeout($2, $1);'
);
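
As a quick check, the pattern can be run against the sample input. This is a sketch under the assumption that every call sits on a single line and ends with `);`. The variable names are made up for the example. The greedy `(.*)` backtracks to the last comma on the line, which is why the nested `refresh(..., 100)` stays intact, and `\b` keeps `notReallysetTimeout` from matching.

```javascript
const task1Input = [
  "setTimeout(() => tree.update(pending), 100);",
  "setTimeout(() => toast({text: 'Hello'}), 200);",
  "setTimeout(() => refresh(() => new Date(), 100), 999);",
  "notReallysetTimeout(() => alert('Tricky!'), 1337);",
].join('\n');

// $1 is the callback, $2 is the delay; the replacement swaps them.
const task1Output = task1Input.replace(
  /\bsetTimeout\((.*),\s*(.+)\);/gm,
  'setTimeout($2, $1);'
);

console.log(task1Output);
```

A delay argument that itself contains a comma would be split at the wrong place, so this is a one-off refactoring aid rather than a general parser.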

Task 2

Transpile import().

// In.
const A = () => import('components/A').then(module => module.default);
const B = () => import('components/B').then(module => module.B);
const C = () => import('components/xx:yy/C').then(module => module.C);
const F = () => import('components/D/E/F').then(module => module.F);


// Out.
import {default as A} from 'components/A';
import {B as B} from 'components/B';
import {C as C} from 'components/xx:yy/C';
import {F as F} from 'components/D/E/F';

// Solution, built up step by step.
text.replace(
  ???,
  ???
);

text.replace(
  ???,
  'import {$3 as $1} from $2;'
);

text.replace(
  /const (.*?) = \(\) => import\((.*?)\)\.then\(module => module\.(.*?)\);/gm,
  'import {$3 as $1} from $2;'
);
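
The same kind of check works for this task. The sketch assumes one declaration per line in exactly the shape shown in the input, and it escapes the dot before `then` so that it only matches a literal dot. Variable names are made up for the example. The quotes around the module path are captured as part of `$2`, so the replacement does not add its own.

```javascript
const task2Input = [
  "const A = () => import('components/A').then(module => module.default);",
  "const B = () => import('components/B').then(module => module.B);",
  "const C = () => import('components/xx:yy/C').then(module => module.C);",
  "const F = () => import('components/D/E/F').then(module => module.F);",
].join('\n');

// $1 is the local name, $2 the quoted path, $3 the exported member.
const task2Output = task2Input.replace(
  /const (.*?) = \(\) => import\((.*?)\)\.then\(module => module\.(.*?)\);/gm,
  'import {$3 as $1} from $2;'
);

console.log(task2Output);
```

The lazy `(.*?)` groups stop at the first place where the rest of the pattern can match, which keeps each capture confined to its own part of the line.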

Links

Regular Expressions 101

By Radosław Miernik

Vazco TechMeeting 2019-04-12
