Commonly used regular expressions

Regular expressions are cumbersome but powerful. Understanding them can make text processing much more efficient, and getting one right brings a real sense of accomplishment.

Here is a brief reference sheet for commonly used regular expression metacharacters:

Character  Definition                                                                    Example
^          The pattern has to appear at the beginning of a string.                       ^cat matches any string that begins with cat
$          The pattern has to appear at the end of a string.                             cat$ matches any string that ends with cat
.          Matches any single character.                                                 ^cat.$ matches catT and cat2 but not catty
[ ]        Bracket expression. Matches any one of the enclosed characters.               gr[ae]y matches gray or grey
[^ ]       Negated bracket expression. Matches any one character EXCEPT those enclosed.  1[^02] matches 13 but not 10 or 12
[-]        Range. Matches any one character within the range.                            [1-9] matches any single digit EXCEPT 0
?          Preceding item must match zero or one time.                                   colou?r matches color or colour but not colouur
+          Preceding item must match one or more times.                                  be+ matches be or bee but not b
*          Preceding item must match zero or more times.                                 be* matches b or be or beeeeeeeeee
( )        Parentheses. Group a subpattern so that metacharacters can be applied to it.  a(bee)?t matches at or abeet but not abet
{n}        Bound. Exact number of times the preceding item must match.                   [0-9]{3} matches any three digits
{n,}       Bound. Minimum number of times the preceding item must match.                 [0-9]{3,} matches any three or more digits
{n,m}      Bound. Minimum and maximum number of times the preceding item must match.     [0-9]{3,5} matches any three, four, or five digits
|          Alternation. One of the alternatives has to match.                            July (first|1st|1) will match July 1st but not July 2
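
The entries in the table can be checked directly. The following is a minimal sketch in PHP using preg_match; the helper name matches() is made up for this example, and ^ and $ anchors are added wherever the whole string is meant to match.

```php
<?php
// Hypothetical helper: true when $pattern matches somewhere in $subject.
function matches($pattern, $subject)
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(matches('/^cat/', 'catalog'));                // begins with cat
var_dump(matches('/cat$/', 'bobcat'));                 // ends with cat
var_dump(matches('/gr[ae]y/', 'grey'));                // bracket expression
var_dump(matches('/^1[^02]$/', '12'));                 // negated bracket rejects 12
var_dump(matches('/^colou?r$/', 'colouur'));           // ? allows at most one u
var_dump(matches('/^a(bee)?t$/', 'abeet'));            // group made optional as a unit
var_dump(matches('/^[0-9]{3,5}$/', '1234'));           // bound: three to five digits
var_dump(matches('/July (first|1st|1)/', 'July 1st')); // alternation
```

Without the anchors a pattern may match anywhere inside a longer string, which is why the table example for the dot is anchored.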

Perl-Style Metacharacters

Character  Definition                                                                Example
//         Default delimiters for a pattern                                          /colou?r/ matches color or colour
i          Append to the pattern for a case-insensitive match                        /colou?r/i matches COLOR or Colour
\b         A word boundary: the spot between word (\w) and non-word (\W) characters  /\bfred\b/i matches Fred but not Alfred or Frederick
\B         A non-word boundary                                                       /fred\B/i matches Frederick but not Fred
\d         A single digit character                                                  /a\db/i matches a2b but not acb
\D         A single non-digit character                                              /a\Db/i matches aCb but not a2b
\n         The newline character (ASCII 10)                                          /\n/ matches a newline
\r         The carriage return character (ASCII 13)                                  /\r/ matches a carriage return
\s         A single whitespace character                                             /a\sb/ matches a b but not ab
\S         A single non-whitespace character                                         /a\Sb/ matches a2b but not a b
\t         The tab character (ASCII 9)                                               /\t/ matches a tab
\w         A single word character: alphanumeric or underscore                       /\w/ matches 1 or _ but not ?
\W         A single non-word character                                               /a\Wb/i matches a!b but not a2b
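
A few of the Perl-style escapes can be tried the same way. This is only a sketch: found() is a made-up helper name, and the patterns are written in single-quoted PHP strings so that the backslashes reach the regex engine unchanged.

```php
<?php
// Hypothetical helper: true when $pattern matches somewhere in $subject.
function found($pattern, $subject)
{
    return preg_match($pattern, $subject) === 1;
}

var_dump(found('/\bfred\b/i', 'Fred'));    // whole word, any case
var_dump(found('/\bfred\b/i', 'Alfred'));  // no boundary before "fred"
var_dump(found('/fred\B/i', 'Frederick')); // "fred" followed by a word character
var_dump(found('/a\db/', 'a2b'));          // digit between a and b
var_dump(found('/a\sb/', "a\tb"));         // a tab counts as whitespace
var_dump(found('/a\Wb/', 'a!b'));          // "!" is a non-word character
```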

The following are some examples of using regular expressions:

Date and Time

Time format (12-hour clock, no seconds): H:MM am/pm. The hour is written without a leading zero, and the space before am/pm is optional.

^([1-9]|1[012]):(0[0-9]|[1-5][0-9])\s?(am|AM|pm|PM)$
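
As a rough illustration, the time pattern can be wrapped in a small PHP function. The function name isTwelveHourTime() is an assumption made for this sketch; the pattern itself is the one shown above, placed between / delimiters.

```php
<?php
// Hypothetical helper wrapping the time pattern above.
function isTwelveHourTime($input)
{
    $pattern = '/^([1-9]|1[012]):(0[0-9]|[1-5][0-9])\s?(am|AM|pm|PM)$/';
    return preg_match($pattern, $input) === 1;
}

var_dump(isTwelveHourTime('9:05 am'));  // true
var_dump(isTwelveHourTime('12:30PM'));  // true
var_dump(isTwelveHourTime('13:00 pm')); // false: hour out of range
var_dump(isTwelveHourTime('09:05 am')); // false: leading zero on the hour
```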

Date in mm/dd/yyyy format; leading zeros are optional (m/d/yyyy). The separator may be a space, slash, period, or hyphen, and the year must be between 1900 and 2099.

^(0?[1-9]|1[012])[ \/.-](0?[1-9]|[12][0-9]|3[01])[ \/.-](19|20)\d\d$

Date in dd/mm/yyyy format; leading zeros are optional (d/m/yyyy). The separator may be a space, slash, period, or hyphen, and the year must be between 1900 and 2099.

^(0?[1-9]|[12][0-9]|3[01])[ \/.-](0?[1-9]|1[012])[ \/.-](19|20)\d\d$
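
These date patterns check only the shape of the input, so a string such as 02/31/2020 still passes. One possible way to close that gap, sketched here for the mm/dd/yyyy variant, is to hand the captured parts to PHP's checkdate(). The function name isValidUsDate() is an assumption, and the year group is widened to capture all four digits, which the original pattern does not do. For dd/mm/yyyy the first two groups would swap.

```php
<?php
// Hypothetical helper: validates mm/dd/yyyy with the pattern above, then
// uses checkdate() to reject impossible dates such as 02/31/2020.
function isValidUsDate($input)
{
    // Assumption: the year group captures all four digits, unlike the original.
    $pattern = '/^(0?[1-9]|1[012])[ \/.-](0?[1-9]|[12][0-9]|3[01])[ \/.-]((?:19|20)\d\d)$/';
    if (preg_match($pattern, $input, $m) !== 1) {
        return false;
    }
    // $m[1] = month, $m[2] = day, $m[3] = four-digit year
    return checkdate((int) $m[1], (int) $m[2], (int) $m[3]);
}

var_dump(isValidUsDate('7/4/1999'));   // true
var_dump(isValidUsDate('02/31/2020')); // false: February has no 31st
```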

Phone number

Parentheses, periods, dashes, underscores, and spaces are allowed in the phone number. All of the following match:
(123)456-7890
(123) 456 - 7890
( 123 )456-7890
1234567890
123.456.7890
123-456-7890
123 456 7890

^[\(\s\._-]*\d{3}[\)\s\._-]*\d{3}[\s\._-]*\d{4}$
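
Once a number matches, it is often convenient to reduce it to its ten digits. The sketch below does that by adding three capture groups to the pattern above; the groups and the function name normalizePhone() are assumptions made for this example.

```php
<?php
// Hypothetical helper: returns the ten digits of a phone number accepted by
// the pattern above, or null when the input does not match.
function normalizePhone($input)
{
    // Same pattern as above, with capture groups added around the digit runs.
    $pattern = '/^[\(\s\._-]*(\d{3})[\)\s\._-]*(\d{3})[\s\._-]*(\d{4})$/';
    if (preg_match($pattern, $input, $m) !== 1) {
        return null;
    }
    return $m[1] . $m[2] . $m[3];
}

foreach (array('(123)456-7890', '( 123 )456-7890', '123.456.7890', '123 456 7890') as $raw) {
    echo normalizePhone($raw), "\n"; // each line prints 1234567890
}
```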

Postal code

Matches Canadian postal code formats with or without spaces (e.g., "T2X 1V4" or "T2X1V4")

^[ABCEGHJKLMNPRSTVXY]\d[A-Z] *\d[A-Z]\d$
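
The pattern accepts upper case letters only. A small sketch of how it might be used to accept lower case input and return one consistent format follows; the function name formatPostalCode() and the upper-casing step are assumptions added for the example.

```php
<?php
// Hypothetical helper: returns the postal code as "A1A 1A1", or null when
// the input does not match. Upper-casing first is an added convenience.
function formatPostalCode($input)
{
    $code = strtoupper(trim($input));
    if (preg_match('/^([ABCEGHJKLMNPRSTVXY]\d[A-Z]) *(\d[A-Z]\d)$/', $code, $m) !== 1) {
        return null;
    }
    return $m[1] . ' ' . $m[2];
}

echo formatPostalCode('t2x1v4'), "\n"; // prints T2X 1V4
```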

Validate domain name

/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i
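
With the slashes, the dot, and \d escaped, the capture groups give the scheme, the host name, and an optional port. The following sketch shows one way to read them; parseSimpleUrl() is a made-up name, and the returned array layout is an assumption for illustration.

```php
<?php
// Hypothetical helper: returns scheme, host and port (null when absent) for
// a URL accepted by the pattern, or null when it does not match.
function parseSimpleUrl($url)
{
    $pattern = '/^(http|https|ftp):\/\/([A-Z0-9][A-Z0-9_-]*(?:\.[A-Z0-9][A-Z0-9_-]*)+):?(\d+)?\/?/i';
    if (preg_match($pattern, $url, $m) !== 1) {
        return null;
    }
    return array(
        'scheme' => $m[1],
        'host'   => $m[2],
        // The port group is missing from $m when the URL has no port.
        'port'   => isset($m[3]) && $m[3] !== '' ? (int) $m[3] : null,
    );
}

print_r(parseSimpleUrl('http://www.example.com:8080/path'));
```

Note that the pattern requires at least one dot in the host name, so a URL such as http://localhost is rejected.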

Validate email format

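Verifies that the email is written in the format xxx@xxx.xxx. Only lower case letters are accepted, and the final label is limited to two or three letters, so longer top-level domains such as .info are rejected.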

^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$
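
A short sketch of the email pattern in use is given below. The function name looksLikeEmail() and the lower-casing of the input are assumptions for this example; the pattern is unchanged apart from the / delimiters.

```php
<?php
// Hypothetical helper: lower-cases the input, then applies the pattern above.
function looksLikeEmail($input)
{
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';
    return preg_match($pattern, strtolower(trim($input))) === 1;
}

var_dump(looksLikeEmail('john.doe@example.com')); // true
var_dump(looksLikeEmail('John@Example.ORG'));     // true after lower-casing
var_dump(looksLikeEmail('john@example'));         // false: no top-level domain
var_dump(looksLikeEmail('john@example.info'));    // false: "info" is longer than 3 letters
```

PHP also ships a built-in check, filter_var($email, FILTER_VALIDATE_EMAIL), which may be a better fit when the goal is general validation rather than this specific format.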

Checking password complexity

This regular expression tests whether the input consists of 6 or more letters, digits, underscores, and hyphens. The input must contain at least one upper case letter, one lower case letter, and one digit.

\A(?=[-_a-zA-Z0-9]*?[A-Z])(?=[-_a-zA-Z0-9]*?[a-z])(?=[-_a-zA-Z0-9]*?[0-9])[-_a-zA-Z0-9]{6,}\z
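
The three lookaheads each scan ahead for one required character class without consuming input, and the final part enforces the allowed characters and the minimum length. A minimal sketch, assuming the anchors are \A and \z and using a made-up function name isComplexPassword():

```php
<?php
// Hypothetical helper applying the password pattern with \A and \z anchors.
function isComplexPassword($input)
{
    $pattern = '/\A(?=[-_a-zA-Z0-9]*?[A-Z])(?=[-_a-zA-Z0-9]*?[a-z])(?=[-_a-zA-Z0-9]*?[0-9])[-_a-zA-Z0-9]{6,}\z/';
    return preg_match($pattern, $input) === 1;
}

var_dump(isComplexPassword('Secret_9')); // true
var_dump(isComplexPassword('secret_9')); // false: no upper case letter
var_dump(isComplexPassword('Ab1'));      // false: shorter than 6 characters
var_dump(isComplexPassword('Secret 9')); // false: space is not allowed
```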

Highlight a word from a text

$text = "Sample text here";
// Replace "text" with the word to highlight; $1 re-inserts the matched word
$text = preg_replace('/\b(text)\b/i', '<span style="background:#f00">$1</span>', $text);
echo $text;

Remove repeated words (case insensitive)

'Keep your your head' becomes 'Keep your head'

// \1 refers back to the word captured by (\w+); any run of repeats collapses to one copy
$text = preg_replace('/\b(\w+)(\s+\1\b)+/i', '$1', $text);

Remove repeated punctuation

'Keep your head......' becomes 'Keep your head.'

// Collapses runs of periods; use /([.!?,])\1+/ with '$1' to cover other marks as well
$text = preg_replace('/\.+/', '.', $text);

Get all image urls from an html document

Puts all the image URLs in an array. $data is assumed to hold the HTML source of the page.

$images = array();
// Capture the value of every src="..." or src='...' attribute
preg_match_all('/src\s*=\s*["\']([^"\'>]+)["\']/i', $data, $media);
foreach ($media[1] as $url)
{
	// Keep only URLs whose file extension is a common image type
	$info = pathinfo($url);
	if (isset($info['extension']) &&
		in_array(strtolower($info['extension']), array('jpg', 'jpeg', 'gif', 'png')))
	{
		$images[] = $url;
	}
}