- 20 Mar 2024
Commonly Used Regex
Overview
A regular expression, also called a regex, is a sequence of characters that defines a search pattern. Regex is commonly used to validate values in a data set, such as email addresses or phone numbers.
Regex Rules
Regular expressions are built from a few categories of characters, operators, and constructs. The categories below are the ones used by the .NET regex engine. Combine them to build your regular expression.
Character Classes
A character class matches any one of a set of characters.
Character Class | Description | Example Pattern | Matches |
---|---|---|---|
[character_group] | Matches any single character in character_group. Case-sensitive. | [ah] | "a" in "ball"; "a", "h" in "watch" |
[^character_group] | Matches any single character that is not in character_group. Case-sensitive. | [^ba] | "o", "r", "d" in "board" |
[first-last] | Matches any single character in the range from first to last. | [A-Z] | "A", "B", "C" in "ABC123" |
. | Wildcard that matches any single character except \n (new line). To match a literal period, escape the character ( \. ). | b.e | "bje" in "object"; "ble" in "noble" |
\p{name} | Matches any single character in the Unicode general category or named block specified by name. | \p{Lu} | "S", "A" in "Swiss Alps" |
\P{name} | Matches any single character that is not in the Unicode general category or named block specified by name. | \P{Lu} | "l", "p", "s" in "Alps" |
\w | Matches any word character. | \w | "A", "B", "C", "2", "5" in "ABC 2.5" |
\W | Matches any non-word character. | \W | " ", "." in "ABC 2.5" |
\s | Matches any white-space character. | \w\s | "C " in "ABC 2.5" |
\S | Matches any non-white-space character. | \s\S | " 2" in "ABC 2.5" |
\d | Matches any decimal digit. | \d | "3" in "3 toys" |
\D | Matches any character other than a decimal digit. | \D | " ", "t", "o", "y", "s" in "3 toys" |
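The rows above can be tried out with a short script. This is a minimal sketch that assumes Python's standard `re` module as a stand-in for the .NET engine; the classes shown here behave the same way for these ASCII inputs, but `\p{name}` and `\P{name}` are left out because Python's `re` does not support them.

```python
import re

# Each call returns every non-overlapping match, scanning left to right.
bracket = re.findall(r"[ah]", "watch")          # positive character group
negated = re.findall(r"[^ba]", "board")         # negated character group
ranged = re.findall(r"[A-Z]", "ABC123")         # character range
wildcard = re.findall(r"b.e", "object noble")   # . is any character except \n
words = re.findall(r"\w", "ABC 2.5")            # word characters
non_words = re.findall(r"\W", "ABC 2.5")        # non-word characters
digits = re.findall(r"\d", "3 toys")            # decimal digits

print(bracket)    # ['a', 'h']
print(negated)    # ['o', 'r', 'd']
print(ranged)     # ['A', 'B', 'C']
print(wildcard)   # ['bje', 'ble']
print(words)      # ['A', 'B', 'C', '2', '5']
print(non_words)  # [' ', '.']
print(digits)     # ['3']
```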
Quantifiers
A quantifier specifies how many instances of a previous element (character, group, or character class) must be present in the string for a match.
Quantifier | Description | Example Pattern | Matches |
---|---|---|---|
* | Matches the previous element 0 or more times. | \d*\.\d | ".0", "15.5", "415.5" |
+ | Matches the previous element 1 or more times. | fe+ | "fe" in "fell"; "fee" in "feel" |
? | Matches the previous element 0 or 1 time. | rai?n | "ran", "rain" |
{n} | Matches the previous element exactly n times. | ,\d{3} | ",056" in "1,056.4"; ",435", ",692", ",718" in "9,435,692,718.00" |
{n,} | Matches the previous element at least n times. | \d{2,} | "29", "435", "2031" |
{n,m} | Matches the previous element at least n times but no more than m times. | \d{3,5} | "789", "5347"; "90875" in "908756" |
*? | Matches the previous element 0 or more times, but as few times as possible. | \d*?\.\d | ".0", "15.5", "415.5" |
+? | Matches the previous element 1 or more times, but as few times as possible. | fe+? | "fe" in "feel"; "fe" in "fell" |
?? | Matches the previous element 0 or 1 time, but as few times as possible. | rai??n | "ran", "rain" |
{n}? | Matches the previous element exactly n times. Behaves the same as {n}. | ,\d{3}? | ",056" in "1,056.4"; ",435", ",692", ",718" in "9,435,692,718.00" |
{n,}? | Matches the previous element at least n times, but as few times as possible. | \d{2,}? | "29"; "43" in "435"; "20", "31" in "2031" |
{n,m}? | Matches the previous element between n and m times, but as few times as possible. | \d{3,5}? | "435"; "203" in "2031"; "908", "756" in "908756" |
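The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below assumes Python's standard `re` module rather than the .NET engine; the quantifiers shown should behave the same way in both.

```python
import re

# Greedy: take as many digits as the upper bound allows.
greedy_range = re.findall(r"\d{3,5}", "908756")
# Lazy: stop as soon as the lower bound is satisfied.
lazy_range = re.findall(r"\d{3,5}?", "908756")

greedy_plus = re.findall(r"fe+", "feel")
lazy_plus = re.findall(r"fe+?", "feel")

# A lazy quantifier still expands when the rest of the pattern needs it:
# \d*? must consume "415" before \. can match the period.
lazy_forced = re.findall(r"\d*?\.\d", "415.5")

print(greedy_range)  # ['90875']
print(lazy_range)    # ['908', '756']
print(greedy_plus)   # ['fee']
print(lazy_plus)     # ['fe']
print(lazy_forced)   # ['415.5']
```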
Anchors
Anchors cause a match to pass or fail depending on the current position in the string.
Anchor | Description | Example Pattern | Matches |
---|---|---|---|
^ | By default, the match must start at the beginning of the string. In multiline mode, it must start at the beginning of a line. | ^\d{3} | "913" in "913-333-" |
$ | By default, the match must occur at the end of the string or before the new line at the end of the string. In multiline mode, it must occur at the end of a line or before the new line at the end of a line. | -\d{3}$ | "-333" in "-913-333" |
\A | The match must occur at the start of the string. | \A\d{3} | "913" in "913-333-" |
\Z | The match must occur at the end of the string or before a new line at the end of the string. | -\d{3}\Z | "-333" in "-913-333" |
\z | The match must occur at the end of the string. | -\d{3}\z | "-333" in "-913-333" |
\G | The match must occur at the point where the previous match ended. | \G\(\d\) | "(1)", "(3)", "(5)" in "(1)(3)(5)[7](9)" |
\b | The match must occur on a boundary between a \w (alphanumeric) and a \W (non-alphanumeric) character. | \bscheme\b | "scheme" in "That was a clever scheme" |
\B | The match must not occur on a \b boundary. | \Band\w*\b | "ands", "ander" in "and sands android lander" |
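The following sketch shows how anchors change which positions can match. It assumes Python's standard `re` module as a stand-in for the .NET engine, and it sticks to `^`, `$`, `\b`, and `\B`, which should behave the same way in both. `\Z`, `\z`, and `\G` are omitted because Python treats them differently or does not support them. `re.MULTILINE` plays the role of multiline mode here.

```python
import re

text = "913-333\n415-555"

# Default mode: ^ matches only at the very start of the string.
start_only = re.findall(r"^\d{3}", text)
# Multiline mode: ^ also matches right after each new line.
each_line = re.findall(r"^\d{3}", text, re.MULTILINE)
# Multiline mode: $ matches before each new line and at the end of the string.
line_ends = re.findall(r"-\d{3}$", text, re.MULTILINE)

sentence = "and sands android lander"
# \b on both sides keeps only the standalone word.
whole_word = re.findall(r"\band\b", sentence)
# \B requires that "and" does NOT start at a word boundary.
inside_word = re.findall(r"\Band\w*\b", sentence)

print(start_only)   # ['913']
print(each_line)    # ['913', '415']
print(line_ends)    # ['-333', '-555']
print(whole_word)   # ['and']
print(inside_word)  # ['ands', 'ander']
```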
Escaped Characters
Escaped Character | Description |
---|---|
\a | Matches a bell character |
\t | Matches a tab |
\r | Matches a carriage return |
\v | Matches a vertical tab |
\f | Matches a form feed |
\n | Matches a new line |
\e | Matches an escape |
\ | When followed by a character that isn't recognized as an escaped character, matches that character literally. For example, \* matches a literal asterisk. |
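Forgetting to escape a special character is a common source of false matches. The sketch below uses Python's standard `re` module as an assumed stand-in for the .NET engine to compare an unescaped and an escaped period. `re.escape` is Python's helper for escaping arbitrary text; it is not part of the regex syntax itself.

```python
import re

# An unescaped period is a wildcard, so it also matches the "0" in "205".
unescaped = re.findall(r"2.5", "2.5 and 205")
# Escaping the period restricts the match to a literal ".".
escaped = re.findall(r"2\.5", "2.5 and 205")
# \t in the pattern matches a tab character in the input.
fields = re.split(r"\t", "id\tname\tcity")
# re.escape adds the backslashes needed to match arbitrary text literally.
literal = re.escape("1+1=2")
matches_literally = re.fullmatch(literal, "1+1=2") is not None

print(unescaped)          # ['2.5', '205']
print(escaped)            # ['2.5']
print(fields)             # ['id', 'name', 'city']
print(matches_literally)  # True
```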
Alternation
Alternation modifies a regular expression to enable either/or matching.
Alternation | Description |
---|---|
a\|b | Matches either a or b. |
(?(exp)yes\|no) | Matches yes if exp matches; otherwise matches the optional no part. |
(?(name)yes\|no) | Matches yes if the named group name has a match; otherwise matches the optional no part. |
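As an illustration of both forms, the sketch below uses Python's standard `re` module, which is an assumption and not the .NET engine. Python supports the group-based conditional (by group name or number) but not the `(?(exp)yes|no)` form that tests an arbitrary expression. The `area_code` pattern is a made-up example: it requires a closing parenthesis only when an opening one was matched.

```python
import re

# Simple alternation inside a non-capturing group.
either = re.findall(r"gr(?:a|e)y", "gray grey groy")

# Conditional: if group 1 (the opening parenthesis) matched,
# require a closing parenthesis; otherwise require nothing.
area_code = re.compile(r"^(\()?\d{3}(?(1)\))$")
results = {s: bool(area_code.match(s)) for s in ["(913)", "913", "(913", "913)"]}

print(either)   # ['gray', 'grey']
print(results)  # {'(913)': True, '913': True, '(913': False, '913)': False}
```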
9 Commonly Used Regex
Copy and paste the following regular expressions into Validatar to search for patterns in your data. Learn more about using regex with Validatar here.
1. Digits
- Whole Numbers: ^\d+$
- Decimal Numbers: ^\d*\.\d+$
- Whole + Decimal Numbers with optional + or - sign (also matches an empty string): ^[\-+]?\d*(\.\d+)?$
- Whole + Decimal Numbers with optional - sign (also matches an empty string): ^-?\d*(\.\d+)?$
- Whole + Decimal + Fractions: [-]?[0-9]+[,.]?[0-9]*([\/][0-9]+[,.]?[0-9]*)*
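Before relying on these patterns, it can help to run them against a few known values. The sketch below is a quick check that assumes Python's standard `re` module rather than Validatar or the .NET engine; `is_match` is a hypothetical helper introduced only for this example.

```python
import re

whole = re.compile(r"^\d+$")
decimal = re.compile(r"^\d*\.\d+$")
signed = re.compile(r"^[\-+]?\d*(\.\d+)?$")

def is_match(pattern, value):
    """Hypothetical helper: True if the compiled pattern matches value."""
    return pattern.match(value) is not None

print(is_match(whole, "123"))     # True
print(is_match(whole, "12.5"))    # False
print(is_match(decimal, ".5"))    # True
print(is_match(decimal, "12"))    # False
print(is_match(signed, "-12.5"))  # True
print(is_match(signed, "12."))    # False: a period must be followed by digits
print(is_match(signed, ""))       # True: every part of the pattern is optional
```

If empty values should fail validation, check for them separately or change `\d*` to `\d+` at the cost of rejecting values such as ".5".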
2. Alphanumeric Characters
- Alphanumeric without space: ^[a-zA-Z0-9]+$
- Alphanumeric with space: ^[a-zA-Z0-9 ]*$
3. Email
- Common Email Address (also matches an empty string): ^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$
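The behavior of the email pattern at its edges is worth checking before use. The sketch below assumes Python's standard `re` module as a stand-in for the .NET engine, and the sample addresses are invented for illustration.

```python
import re

email = re.compile(r"^([a-zA-Z0-9._%-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6})*$")

samples = [
    "jane.doe@example.com",     # typical address
    "jane.doe@example",         # no top-level domain
    "user@example.technology",  # top-level domain longer than 6 letters
    "",                         # empty value
]
results = {s: bool(email.match(s)) for s in samples}

print(results["jane.doe@example.com"])     # True
print(results["jane.doe@example"])         # False
print(results["user@example.technology"])  # False
print(results[""])                         # True: the trailing * allows zero addresses
```

The `{2,6}` limit rejects longer top-level domains, and the trailing `*` accepts empty values, so both may need adjusting depending on the data being validated.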
4. Dates
- Format YYYY-MM-dd: ^(19|20)\d\d([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])$
- Format MM-dd-YYYY: ^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d$
- Format dd-MM-YYYY: ^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\d\d$
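The YYYY-MM-dd pattern uses the backreference `\2` to require the same separator in both positions, while the other two patterns accept mixed separators. None of them checks that the day exists in the given month. The sketch below illustrates both points using Python's standard `re` and `datetime` modules, which is an assumption and not the .NET engine; `is_real_date` is a hypothetical helper.

```python
import re
from datetime import datetime

iso_date = re.compile(r"^(19|20)\d\d([- /.])(0[1-9]|1[012])\2(0[1-9]|[12][0-9]|3[01])$")
us_date = re.compile(r"^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)\d\d$")

def is_real_date(value):
    """Hypothetical helper: True only for an actual YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False

same_separator = bool(iso_date.match("2024-03-20"))  # True
mixed_iso = bool(iso_date.match("2024-03/20"))       # False: \2 forces the same separator
mixed_us = bool(us_date.match("03-20/2024"))         # True: separators are independent
shape_only = bool(iso_date.match("2024-02-31"))      # True: the regex checks shape only
calendar_check = is_real_date("2024-02-31")          # False: February has no 31st

print(same_separator, mixed_iso, mixed_us, shape_only, calendar_check)
```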
5. Time
- Format HH:MM 12-hour, optional leading 0: ^(0?[1-9]|1[0-2]):[0-5][0-9]$
- Format HH:MM 12-hour, optional leading 0, AM/PM: ^((1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm]))$
- Format HH:MM 24-hour with leading 0: ^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$
- Format HH:MM 24-hour, optional leading 0: ^([0-9]|0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$
- Format HH:MM:SS 24-hour: ^(?:[01]\d|2[0123]):(?:[012345]\d):(?:[012345]\d)$
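The capture groups in the 12-hour AM/PM pattern can also be used to pull out the hour, minute, and period. The sketch below assumes Python's standard `re` module as a stand-in for the .NET engine; the group numbers follow the order of the opening parentheses in the pattern.

```python
import re

time_12h = re.compile(r"^((1[0-2]|0?[1-9]):([0-5][0-9]) ?([AaPp][Mm]))$")
time_24h = re.compile(r"^(0[0-9]|1[0-9]|2[0-3]):[0-5][0-9]$")

# Group 2 is the hour, group 3 the minutes, group 4 the AM/PM marker.
parts = time_12h.match("9:05 PM").group(2, 3, 4)

valid_12h = {s: bool(time_12h.match(s)) for s in ["09:05pm", "13:05 PM", "0:30 AM"]}
valid_24h = {s: bool(time_24h.match(s)) for s in ["23:59", "24:00", "9:05"]}

print(parts)      # ('9', '05', 'PM')
print(valid_12h)  # {'09:05pm': True, '13:05 PM': False, '0:30 AM': False}
print(valid_24h)  # {'23:59': True, '24:00': False, '9:05': False}
```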
6. Match Duplicates
- Search Duplicates: (\b\w+\b)(?=.*\b\1\b)
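The duplicate pattern captures a word and then uses a lookahead to confirm that the same word appears again later. The sketch below assumes Python's standard `re` module rather than the .NET engine, and the sample sentence is invented for illustration.

```python
import re

duplicates = re.compile(r"(\b\w+\b)(?=.*\b\1\b)")

text = "the cat saw the other cat"
# Each match is a word that occurs again later in the same line.
found = [m.group(1) for m in duplicates.finditer(text)]

print(found)  # ['the', 'cat']
```

Because `.` does not match a new line by default, a repeat is only detected when both occurrences are on the same line. The `\b` anchors also keep "the" from matching inside "other".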
7. Phone Numbers
- US and International Phone Numbers with Separators: ^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$
- US Phone Numbers Only with Separators: ^(\+0?1\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$
- Phone Numbers with or without Separators: ^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$
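The first and last phone patterns differ only in whether the separators are required. The sketch below compares them using Python's standard `re` module as an assumed stand-in for the .NET engine; the phone numbers are made up.

```python
import re

with_separators = re.compile(r"^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}$")
optional_separators = re.compile(r"^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$")

samples = ["913-333-4444", "(913) 333-4444", "+1 913.333.4444", "9133334444", "(913-333-4444"]

strict = {s: bool(with_separators.match(s)) for s in samples}
relaxed = {s: bool(optional_separators.match(s)) for s in samples}

print(strict["+1 913.333.4444"])  # True
print(strict["9133334444"])       # False: separators are required
print(relaxed["9133334444"])      # True: separators are optional
print(strict["(913-333-4444"])    # True: the parentheses are not checked as a pair
```

Because `\(?` and `\)?` are independent, an unbalanced parenthesis still passes. A conditional or an alternation would be needed to enforce matching pairs.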
8. Identification
- Social Security Number with dashes: ^(?!219-09-9999|078-05-1120)(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4}$
- Social Security Number without dashes: ^(?!219099999|078051120)(?!666|000|9\d{2})\d{3}(?!00)\d{2}(?!0{4})\d{4}$
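The negative lookaheads `(?!...)` in these patterns reject specific values before the digits are consumed. The sketch below exercises each rule in the dashed pattern, assuming Python's standard `re` module as a stand-in for the .NET engine; apart from the two numbers the pattern excludes explicitly, the sample values are invented.

```python
import re

ssn = re.compile(
    r"^(?!219-09-9999|078-05-1120)(?!666|000|9\d{2})\d{3}-(?!00)\d{2}-(?!0{4})\d{4}$"
)

samples = {
    "123-45-6789": "well-formed",
    "078-05-1120": "explicitly excluded number",
    "666-12-3456": "area 666",
    "900-12-3456": "area starting with 9",
    "123-00-6789": "group 00",
    "123-45-0000": "serial 0000",
    "123456789": "missing dashes",
}
results = {value: bool(ssn.match(value)) for value in samples}

for value, reason in samples.items():
    print(value, reason, results[value])
# Only "123-45-6789" is accepted; every other sample is rejected.
```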
9. Zip Codes
- US Postal Code: ^\d{5}([\-]?\d{4})?$
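The postal code pattern accepts a five-digit code with an optional four-digit extension, with or without the dash. The sketch below is a quick check that assumes Python's standard `re` module rather than the .NET engine; the sample codes are made up.

```python
import re

zip_code = re.compile(r"^\d{5}([\-]?\d{4})?$")

samples = ["66210", "66210-1234", "662101234", "6621", "66210-123"]
results = {s: bool(zip_code.match(s)) for s in samples}

print(results)
# {'66210': True, '66210-1234': True, '662101234': True, '6621': False, '66210-123': False}
```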