Perl Example #10
More on Pattern Matching
And Regular Expressions

About the Program

This program demonstrates additional examples of pattern matching and substitution operations using regular expressions. Some of the more common regular expression "metacharacters" used for pattern matching are outlined in the charts below.

Code          Meaning
\w            Word characters (letters, digits, and underscore)
\W            Non-word characters
\s            White space
\S            Non-white space
\d            Digits
\D            Non-digits
\b            Word boundary
\B            Non-word boundary
\A  or  ^     At the beginning of a string
\Z  or  $     At the end of a string (or just before a final newline)
.             Match any single character (except a newline, by default)
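
As a small illustration of the anchor codes, the sketch below compares ^ and $ with \A, \Z and \z on a two-line string. The string and the variable names are made up for this example and are not part of the program that follows; \z (lowercase) is an additional Perl anchor that the chart does not list.

```perl
use strict;
use warnings;

# Hypothetical two-line string (not from the original program).
my $text = "first line\nPerl on line two\n";

# Without /m, ^ only matches at the very start of the string.
my $caret_plain = ($text =~ /^Perl/)   ? 1 : 0;
# With /m, ^ also matches just after each embedded newline.
my $caret_multi = ($text =~ /^Perl/m)  ? 1 : 0;
# \A always means the start of the string, even with /m.
my $start_only  = ($text =~ /\APerl/m) ? 1 : 0;
# $ and \Z may match just before a trailing newline; \z may not.
my $dollar_end  = ($text =~ /two$/)    ? 1 : 0;
my $big_z_end   = ($text =~ /two\Z/)   ? 1 : 0;
my $small_z_end = ($text =~ /two\z/)   ? 1 : 0;

print "^Perl: $caret_plain  ^Perl with /m: $caret_multi  \\APerl with /m: $start_only\n";
print "two\$: $dollar_end  two\\Z: $big_z_end  two\\z: $small_z_end\n";
```

For single-line strings without a trailing newline, such as the ones used in the program below, ^ and \A (and $ and \Z) behave the same.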
Code                  Meaning
*                     Zero or more occurrences
?                     Zero or one occurrence
+                     One or more occurrences
{N}                   Exactly N occurrences
{N,M}                 Between N and M occurrences
.*<thingy>            Greedy match, up to the last thingy
.*?<thingy>           Non-greedy match, up to the first thingy
[set_of_things]       Match any one character in the set
[^set_of_things]      Match any one character not in the set
(some_expression)     Tag (capture) an expression
$1..$N                Tagged expressions, used in substitutions
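
The difference between the greedy and non-greedy forms is easiest to see side by side. The following is a minimal sketch, using a made-up sample string and variable names that do not appear in the program below, with a comma playing the role of the "thingy".

```perl
use strict;
use warnings;

# Hypothetical sample string (not from the original program).
my $text = "one, two, three, four";

# Greedy: .* takes as much as it can, so the match runs up to the LAST comma.
my ($greedy) = $text =~ /(.*),/;

# Non-greedy: .*? takes as little as it can, so the match stops at the FIRST comma.
my ($lazy) = $text =~ /(.*?),/;

print "Greedy:     $greedy\n";   # prints "one, two, three"
print "Non-greedy: $lazy\n";     # prints "one"
```

In list context a match returns its tagged sub-expressions, which is why each result can be assigned directly to a variable in parentheses.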

#!/usr/bin/perl -w

### More on Regular Expressions ###
### Pattern Matching  ###


sub print_array		# Print the full contents of the Array
{ for ($i=0; $i<=$#strings;$i++)
  {print $strings[$i], "\n";
  }
print "\n\n";
}
     
sub grep_pattern	# Print strings which contain the pattern
{ foreach (@strings)
    {print "$_\n" if /$pattern/;
     }
print "\n\n";
}
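
The grep_pattern subroutine above reads the global @strings and $pattern. As an alternative sketch only, the same filtering can be written with Perl's built-in grep and a compiled qr// pattern passed in as arguments. The name matching_strings and the sample list are hypothetical and are not used elsewhere in this program.

```perl
use strict;
use warnings;

# Hypothetical helper: return the strings that match the given pattern.
sub matching_strings {
    my ($regex, @candidates) = @_;
    return grep { /$regex/ } @candidates;
}

# Hypothetical sample data, a subset of the strings used in this example.
my @sample = ("Perl is great", "1, Three", "Write in Perl");

my @hits = matching_strings(qr/^Perl/, @sample);
print "$_\n" for @hits;   # prints "Perl is great"
```

Passing the data in as arguments keeps the subroutine independent of global variables, at the cost of a slightly longer call.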

### Setting up the Array of strings

@strings = ("Two, 4, 6, Eight", "Perl is cryptic", "Perl is great"); 

@strings[3..6] = ("1, Three", "Five, 7", "Write in Perl", "Programmer's heaven");
 print_array;

Two, 4, 6, Eight
Perl is cryptic
Perl is great
1, Three
Five, 7
Write in Perl
Programmer's heaven


## Find the word "Perl"
$pattern = 'Perl';
print "Searching for: $pattern\n";
grep_pattern;
     
Searching for: Perl
Perl is cryptic
Perl is great
Write in Perl

## Find "Perl" at the beginning of a line
$pattern = '^Perl';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: ^Perl
Perl is cryptic
Perl is great

     
## Find sentences that contain an "i"
$pattern = 'i';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: i
Two, 4, 6, Eight
Perl is cryptic
Perl is great
Five, 7
Write in Perl


## Find words starting with "i", i.e. a space precedes the letter
$pattern = '\si';
print "Searching for: $pattern\n";
grep_pattern;

Searching for: \si
Perl is cryptic
Perl is great
Write in Perl

## Find strings containing a digit
$pattern = '\d';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: \d
Two, 4, 6, Eight
1, Three
Five, 7

     

## Search for one or more digits followed by at least one more character
$pattern = '\d+.+';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: \d+.+
Two, 4, 6, Eight
1, Three

     
## Find strings with a digit at the end of a line
$pattern = '\d+$';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: \d+$
Five, 7


## Search for a digit, possible stuff in between, and another digit
$pattern = '\d.*\d';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: \d.*\d
Two, 4, 6, Eight

     
## Find four-letter words, i.e. four word characters between word boundaries
$pattern = '\b\w{4}\b';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: \b\w{4}\b
Perl is cryptic
Perl is great
Five, 7
Write in Perl
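
One caveat about the "four-letter word" search: \w also matches digits and the underscore, so \b\w{4}\b accepts things that are not really words. The sketch below, with made-up test strings and variable names, compares it with a character set restricted to ASCII letters.

```perl
use strict;
use warnings;

# Hypothetical test strings (not from the original program).
my @samples = ("Perl is great", "In 2024", "my_x = 1", "Two, 4, 6");

# \w{4} accepts letters, digits, and underscores.
my @word_chars   = grep { /\b\w{4}\b/ }       @samples;
# [A-Za-z]{4} accepts ASCII letters only.
my @letters_only = grep { /\b[A-Za-z]{4}\b/ } @samples;

print "With \\w:        ", join(" | ", @word_chars), "\n";
print "With [A-Za-z]: ", join(" | ", @letters_only), "\n";
```

Here "In 2024" and "my_x = 1" pass the \w test but not the letters-only test, while "Perl is great" passes both.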

     
## Find strings with three words in a row, i.e. three word fields separated by white space
$pattern = '\w+\s+\w+\s+\w+';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: \w+\s+\w+\s+\w+
Perl is cryptic
Perl is great
Write in Perl

     
## Find sentences with two "e" letters, and possible stuff between 
$pattern = 'e.*e';
print "Searching for: $pattern\n";
grep_pattern;
Searching for: e.*e
Perl is great
1, Three
Write in Perl
Programmer's heaven

     

#### Marking Regular Expression Sub-strings and Using Substitution
     
## Substitute "Pascal" for "Perl" words at the beginning of a line
print "Substituting first Perl words.\n";
foreach(@strings)
  {s/^Perl/Pascal/;   # No "g" needed: ^ can match only once per string
  }
print_array;

Substituting first Perl words.
Two, 4, 6, Eight
Pascal is cryptic
Pascal is great
1, Three
Five, 7
Write in Perl
Programmer's heaven


## Find the first five-letter word in each string and replace it with "Amazing"
$pattern = '\b\w{5}\b';
print "Searching for: $pattern\n";
foreach(@strings)
  {s/$pattern/Amazing/;
  }
print_array;

Searching for: \b\w{5}\b
Two, 4, 6, Amazing
Pascal is cryptic
Pascal is Amazing
1, Amazing
Five, 7
Amazing in Perl
Programmer's heaven


## Replace any "Perl" words at the end of a line with "Cobol"
print "Substituting Final Perl \n";
foreach(@strings)
  {s/Perl$/Cobol/;
  }
print_array;
     
Substituting Final Perl
Two, 4, 6, Amazing
Pascal is cryptic
Pascal is Amazing
1, Amazing
Five, 7
Amazing in Cobol
Programmer's heaven


## Delete any apostrophes followed by an "s"
print "Substituting null strings\n";
foreach(@strings)
  {s/\'s//;  # Replace with null string
  }
print_array;
     
Substituting null strings
Two, 4, 6, Amazing
Pascal is cryptic
Pascal is Amazing
1, Amazing
Five, 7
Amazing in Cobol
Programmer heaven


## Search for two digits in the same line, and switch their positions
print "Tagging Parts and Switching Places\n";
foreach(@strings)
  { $pattern = '(\d)(.*)(\d)';
    if (/$pattern/)
     { print "Grabbed pattern: $pattern   \$1 = $1   \$2 = $2   \$3 = $3\n";
       s/$pattern/$3$2$1/;
     }
  }

print "\n";
print_array;
     
Tagging Parts and Switching Places
Grabbed pattern: (\d)(.*)(\d)   $1 = 4   $2 = ,    $3 = 6

Two, 6, 4, Amazing
Pascal is cryptic
Pascal is Amazing
1, Amazing
Five, 7
Amazing in Cobol
Programmer heaven
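
The tagged parts can also be collected into named variables by matching in list context, which some readers find easier to follow than $1, $2 and $3. This is a sketch on a made-up copy of the first string; the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical copy of one string from the example.
my $line = "Two, 4, 6, Eight";

# In list context, a successful match returns the tagged sub-expressions.
my ($first, $between, $last) = $line =~ /(\d)(.*)(\d)/;
print "first = $first   between = '$between'   last = $last\n";

# Copy the string, then swap the two digits in the copy only.
(my $swapped = $line) =~ s/(\d)(.*)(\d)/$3$2$1/;
print "$swapped\n";   # prints "Two, 6, 4, Eight"
```

Because .* is greedy, it first runs to the end of the string and then backs up until the final \d can match, so the last tag holds the last digit on the line.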

## Marking Patterns and using multiple times
print "Expanding Patterns, and apply more than once in the same line\n";
foreach(@strings)
  { $pattern = '(\d)';
    if (/$pattern/)
     {
       s/$pattern/$1$1$1/g;
     }
  }
print "\n";
print_array;

Expanding Patterns, and apply more than once in the same line

Two, 666, 444, Amazing
Pascal is cryptic
Pascal is Amazing
111, Amazing
Five, 777
Amazing in Cobol
Programmer heaven

## Marking things between word boundaries.  Using part of pattern 
print "Replacing words that end with n \n";
foreach(@strings)
  { $pattern = '\b(\w*)n\b';
    if (/$pattern/)
     { print "Grabbed pattern: $pattern   \$1 = $1   \n";
       s/$pattern/$1s/;
     }
  }
print "\n";
print_array;
Replacing words that end with n
Grabbed pattern: \b(\w*)n\b   $1 = i
Grabbed pattern: \b(\w*)n\b   $1 = heave

Two, 666, 444, Amazing
Pascal is cryptic
Pascal is Amazing
111, Amazing
Five, 777
Amazing is Cobol
Programmer heaves


## Sentences with three words, add "n't" after the middle word
$pattern = '(\w+\s+)(\w+)(\s+\w+)';
print "Searching for: $pattern\n";
foreach(@strings)
  { 
       s/$pattern/$1$2n\'t$3/;
  }
print_array;
     
     
Searching for: (\w+\s+)(\w+)(\s+\w+)
Two, 666, 444, Amazing
Pascal isn't cryptic
Pascal isn't Amazing
111, Amazing
Five, 777
Amazing isn't Cobol
Programmer heaves


## Replace every "o" or "e" with an "x"
$pattern = '[oe]';
print "Searching for: $pattern\n";
foreach(@strings)
  { 
       s/$pattern/x/g;   # The "g" modifier means "global", or replace all 
  }                      # occurrences of the "o" or "e" found on that line.
print_array;
print_array;
     
     
Searching for: [oe]
Twx, 666, 444, Amazing
Pascal isn't cryptic
Pascal isn't Amazing
111, Amazing
Fivx, 777
Amazing isn't Cxbxl
Prxgrammxr hxavxs
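
To see what the "g" modifier changes, the sketch below runs the same substitution with and without it on one made-up string. The variable names are hypothetical; the count comes from the fact that s/// returns the number of substitutions it made.

```perl
use strict;
use warnings;

# Hypothetical sample string, taken from an earlier stage of the example.
my $with_g    = "Programmer heaves";
my $without_g = $with_g;

# With /g every "o" or "e" is replaced; s/// returns how many were changed.
my $count = ($with_g =~ s/[oe]/x/g);

# Without /g only the first match is replaced.
$without_g =~ s/[oe]/x/;

print "With g:    $with_g ($count replacements)\n";   # Prxgrammxr hxavxs (4 replacements)
print "Without g: $without_g\n";                      # Prxgrammer heaves
```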

The actual program: exA.pl

The output: exA.out

dhyatt@thor.tjhsst.edu