π A Practical Guide to Regular Expressions (RegEx) In JavaScript
π¨βπ» By Sukhjinder Arora
β Practise
β¦ References
Regular expressions are a way to describe patterns in a string data.
Creating a Regular Expression
There are two ways to create a regular expression in Javascript. It can be either created with RegExp constructor, or by using forward slashes ( / ) to enclose the pattern.
Regular Expression Constructor:
Syntax: new RegExp(pattern[, flags])
Example:
var regexConst = new RegExp("abc");Regular Expression Literal:
Syntax: /pattern/flags
Example:
var regexLiteral = /abc/;Note: Here the flags are optional
Since forward slashes are used to enclose patterns in the above example, you have to escape the forward slash ( / ) with a backslash ( \ ) if you want to use it as a part of the regex.
Regular Expressions Methods
RegExp.prototype.test()
This method is used to test whether a match has been found or not. It accepts a string which we have to test against regular expression and returns true or false depending upon if the match is found or not.
For example:
var regex = /hello/;
var str = "hello world";
var result = regex.test(str);
console.log(result);
// returns trueRegExp.prototype.exec()
This method returns an array containing all the matched groups. It accepts a string that we have to test against a regular expression.
For example:
var regex = /hello/;
var str = "hello world";
var result = regex.exec(str);
console.log(result);
// returns [ 'hello', index: 0, input: 'hello world', groups: undefined ]
// 'hello' -> is the matched pattern.
// index: -> Is where the regular expression starts.
// input: -> Is the actual string passed.Flags:
Regular expressions have five optional flags or modifiers. Letβs discuss the two most important flags:
g β Global search, donβt return after the first matchi β Case-insensitive searchRegular Expression Literal
Syntax /pattern/flags
var regexGlobal = /abc/g;
console.log(regexGlobal.test("abc abc"));
// it will match all the occurence of 'abc', so it won't return
// after first match.
var regexInsensitive = /abc/i;
console.log(regexInsensitive.test("Abc"));
// returns true, because the case of string characters don't matter
// in case-insensitive search.Regular Expression Constructor
Syntax new RegExp('pattern', 'flags')
var regexGlobal = new RegExp("abc", "g");
console.log(regexGlobal.test("abc abc"));
// it will match all the occurence of 'abc', so it won't return // after first match.
var regexInsensitive = new RegExp("abc", "i");
console.log(regexInsensitive.test("Abc"));
// returns true, because the case of string characters don't matter // in case-insensitive search.Character groups:
Character set [xyz]
A character set is a way to match different characters in a single position, it matches any single character in the string from characters present inside the brackets.
For example:
var regex = /[bt]ear/;
console.log(regex.test("tear"));
// returns true
console.log(regex.test("bear"));
// return true
console.log(regex.test("fear"));
// return falseNote β All the special characters except for caret (^) (Which has entirely different meaning inside the character set) lose their special meaning inside the character set.
Negated character set [^xyz]
It matches anything that is not enclosed in the brackets.
For example:
var regex = /[^bt]ear/;
console.log(regex.test("tear"));
// returns false
console.log(regex.test("bear"));
// return false
console.log(regex.test("fear"));
// return trueRanges [a-z]
Suppose we want to match all of the letters of an alphabet in a single position, we could write all the letters inside the brackets, but there is an easier way and that is ranges.
For example:
[a-h] will match all the letters from a to h. Ranges can also be digits like [0-9] or capital letters like [A-Z].
var regex = /[a-z]ear/;
console.log(regex.test("fear"));
// returns true
console.log(regex.test("tear"));
// returns trueMeta-characters
Meta-characters are characters with a special meaning. There are many meta character but I am going to cover the most important ones here.
\d β Match any digit character ( same as [0-9] ).\w β Match any word character. A word character is any letter, digit, and underscore. (Same as [a-zA-Z0β9_] ) i.e alphanumeric character.\s β Match a whitespace character (spaces, tabs etc).\t β Match a tab character only.\b β Find a match at beginning or ending of a word. Also known as word boundary.. β (period) Matches any character except for newline.\D β Match any non digit character (same as 0β9).\W β Match any non word character (Same as a-za-z0β9_ ).\S β Match a non whitespace character.Quantifiers:
Quantifiers are symbols that have a special meaning in a regular expression.
+ β Matches the preceding expression 1 or more times.
var regex = /8\d+/;
console.log(regex.test("8"));
// true
console.log(regex.test("88899"));
// true
console.log(regex.test("8888845"));
// true* - Matches the preceding expression 0 or more times.
var regex = /go\*d/;
console.log(regex.test("gd"));
// true
console.log(regex.test("god"));
// true
console.log(regex.test("good"));
// true
console.log(regex.test("goood"));
// true? β Matches the preceding expression 0 or 1 time, that is preceding pattern is optional.
var regex = /goo?d/;
console.log(regex.test("god"));
// true
console.log(regex.test("good"));
// true
console.log(regex.test("goood"));
// false^ β Matches the beginning of the string, the regular expression that follows it should be at the start of the test string. i.e the caret (^) matches the start of string.
var regex = /^g/;
console.log(regex.test("good"));
// true
console.log(regex.test("bad"));
// false
console.log(regex.test("tag"));
// false$ β Matches the end of the string, that is the regular expression that precedes it should be at the end of the test string. The dollar ($) sign matches the end of the string.
var regex = /.com$/;
console.log(regex.test("test@testmail.com"));
// true
console.log(regex.test("test@testmail"));
// false{N} β Matches exactly N occurrences of the preceding regular expression.
var regex = /go{2}d/;
console.log(regex.test("good"));
// true
console.log(regex.test("god"));
// false{N,} β Matches at least N occurrences of the preceding regular expression.
var regex = /go{2,}d/;
console.log(regex.test("good"));
// true
console.log(regex.test("goood"));
// true
console.log(regex.test("gooood"));
// true{N,M} β Matches at least N occurrences and at most M occurrences of the preceding regular expression (where M > N).
var regex = /go{1,2}d/;
console.log(regex.test("god"));
// true
console.log(regex.test("good"));
// true
console.log(regex.test("goood"));
// falseAlternation X|Y β Matches either X or Y.
var regex = /(green|red) apple/;
console.log(regex.test("green apple"));
// true
console.log(regex.test("red apple"));
// true
console.log(regex.test("blue apple"));
// falseNote β If you want to use any special character as a part of the expression, say for example you want to match literal + or ., then you have to escape them with backslash ( \ ).
For example:
var regex = /a+b/; // This won't work
var regex = /a\+b/; // This will work
console.log(regex.test("a+b")); // trueAdvanced
(x) β Matches x and remembers the match. These are called capturing groups. This is also used to create sub expressions within a regular expression.
var regex = /(foo)bar\1/;
console.log(regex.test("foobarfoo"));
// true
console.log(regex.test("foobar"));
// falseNote: \1 - remembers and uses that match from first subexpression within parentheses.
(?:x) β Matches x and does not remember the match. These are called non capturing groups. Here \1 wonβt work, it will match the literal \1.
var regex = /(?:foo)bar\1/;
console.log(regex.test("foobarfoo"));
// false
console.log(regex.test("foobar"));
// false
console.log(regex.test("foobar\1"));
// truex(?=y) β Matches x only if x is followed by y. Also called positive look ahead.
var regex = /Red(?=Apple)/;
console.log(regex.test("RedApple"));
// true⨠Practicing Regex:
Match any 10 digit number :
var regex = /^\d{10}$/;
console.log(regex.test("9995484545"));
// trueLetβs break that down and see whatβs going on up there.
^ and $. The caret ^ matches the start of the input string, whereas the dollar sign $ matches the end. So it would not match if string contain more than 10 digits.\d matches any digit character.{10} matches the previous expression, in this case \d exactly 10 times. So if the test string contains less than or more than 10 digits, the result will be false.Match a date with following format DD-MM-YYYY or DD-MM-YY
var regex = /^(\d{1,2}-){2}\d{2}(\d{2})?$/;
console.log(regex.test("01-01-1990"));
// true
console.log(regex.test("01-01-90"));
// true
console.log(regex.test("01-01-190"));
// falseLetβs break that down and see whatβs going on up there.
^ and \$, so that the match spans entire string.( start of first subexpression.\d{1,2} matches at least 1 digit and at most 2 digits.- matches the literal hyphen character.) end of first subexpression.{2} match the first subexpression exactly two times.\d{2} matches exactly two digits.(\d{2})? matches exactly two digits. But itβs optional, so either year contains 2 digits or 4 digits.Matching Anything But a Newline
The expression should match any string with a format like abc.def.ghi.jkl where each variable a, b, c, d, e, f, g, h, i, j, k, l can be any character except new line.
var regex = /^(.{3}\.){3}.{3}$/;
console.log(regex.test("123.456.abc.def"));
// true
console.log(regex.test("1243.446.abc.def"));
// false
console.log(regex.test("abc.def.ghi.jkl"));
// trueLetβs break that down and see whatβs going on up there.
^ and \$, so that the match spans entire string.( start of first sub expression.{3} matches any character except new line for exactly 3 times.\. matches the literal . period) end of first sub expression{3} matches the first sub expression exactly 3 times..{3} matches any character except new line for exactly 3 times.