Examples of regular expressions

FieldWorks Language Explorer 9 Help

Use Character	To	Examples
^ (caret)	Match at the beginning of a string.	In "abc" ^a matches the "a," but ^b finds no match.
^(?!root)	Match every item that does not begin with "root"	In a Morph Type column, use this to exclude roots. Use this syntax in any column, replacing 'root' with the appropriate word or characters. ^(?!.*noun.*) with wild cards (.*) excludes any item that contains 'noun'.
[^x]	Match any character except "x"	^[^ ]+$ matches any entry except those that have a space.
^[^x]	Match any string except strings that begin with "x"	^[^-] excludes all entries that begin with a hyphen, such as prefixes.
$ (dollar)	Match at the end of a string. Also used with a capture number in the replacement text.	In "abc" c$ matches the "c," but b$ finds no match. Enter $1 in the Replace with box to insert the first captured content, that is, the text matched by the expression in the first set of parentheses ( ). You can have more than one set of parentheses.
. (dot)	Match any single character, without regard to what that character is.	In "This is a nice island" .is matches the "his" in "This" and the " is" (with its leading space) at the word "is" and at the start of "island." (The dot character can get you in trouble, so use it sparingly and with caution! The dot matches a single Unicode code point, but á may actually be stored as two code points.)
| (bar)	Allow alternation.	oy|ay|ey|aw finds any occurrence of the diphthongs oy, ay, ey, or aw.
* (asterisk)	Match 0 or more times, as many as possible.	.* matches any number of characters, as many as possible.
+ (plus)	Match 1 or more times, as many as possible.
? (question mark)	Match zero or one time, preferably one, or make the preceding item optional.	colou?r will match both "color" and "colour."
\ (backslash)	Quote the following character to make the character literal.	Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ | \ . / Use \. to find a period, or \? to find a question mark.
{n}	Match exactly n times.
[pattern]	Match any one character from the set in brackets.	c[aeiou]t matches cat, cet, cit, cot, and cut.
\b	Perform "whole words only" search, as \bword\b.	Use \bthe\b to find the word "the" in definitions or glosses. In "This is a nice island." \bis\b matches only the word "is," not the "is" in "This" or "island."
\B	Match every position where \b does not.	In "This is a nice island." \Bis matches only the "is" in "This," and is\B matches only the "is" in "island."
\d	Match any digit from 0 to 9 and more (see metacharacters table).	In "1.2" \d matches both the 1 and the 2. (In Find/Replace, both may be replaced.)
\d{0,1}$	Ignore homograph numbers.	When you sort and filter on letters at the end of headwords, \d{0,1}$ helps make sure that the possible presence of a homograph number does not affect the results. Type the letters you want to find ahead of the backslash. Tip: Lexeme form columns do not display the homograph numbers.
\N{unicode character name}	Match named character. (See "Tip" below.)	\N{Hyphen-Minus} matches a single Unicode code point that has the name "Hyphen-Minus." \N{Em Dash} matches a single Unicode code point that has the name "Em Dash."
\p{unicode property name}	Match any character with a specified Unicode property.	\p{L} or \p{Letter} matches a single Unicode code point that has the property "letter." \p{Pd} (Punctuation dash) matches a single dash. \p{Po} (Punctuation other) matches punctuation that is not a dash, bracket, quote or connector. \p{N} or \p{Number} matches numbers.
\P{unicode property name}	Match any character without a specified Unicode property.	\P{M} matches a code point that is not a combining mark. \P{N} or \P{Number} matches characters that are not numbers.
\s	Match a "whitespace character."	Use "black\sboard" as the find string and "blackboard" as the replace criteria. Use \s to find a space in a gloss you want to replace with a period.
\w	Match word characters, as compared to non-word characters such as spaces.	\w matches every letter and digit in the string, but not the spaces.
\W	Match non-word characters, such as spaces.	\W matches the spaces in the string. Use to replace spaces with a period in a gloss.
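
Several of the examples in the table can be tried outside FieldWorks. The sketch below uses Python's re module as a stand-in, which is an assumption: it is a different regular expression engine from the one in FieldWorks, so details may differ (for example, it does not support \p{ }). The word lists and variable names are made up for illustration.

```python
import re
import unicodedata

# Python's re module stands in for the FieldWorks engine here (an assumption).
# The patterns come from the table; the sample data is hypothetical.
sentence = "This is a nice island."

# \bis\b finds only the whole word "is" (character position 5).
whole_word = [m.start() for m in re.finditer(r"\bis\b", sentence)]

# \Bis finds "is" only where it follows another word character (in "This").
inside_word = [m.start() for m in re.finditer(r"\Bis", sentence)]

# The dot matches any single character, including a space.
dot_matches = re.findall(r".is", sentence)

# The dot matches one code point, so a decomposed "á" counts as two matches.
decomposed = unicodedata.normalize("NFD", "\u00e1")
dot_count = len(re.findall(r".", decomposed))

# ^(?!root) keeps only the items that do not begin with "root".
morph_types = ["root", "prefix", "suffix", "bound root"]
not_root = [t for t in morph_types if re.search(r"^(?!root)", t)]

# ^[^ ]+$ keeps the entries that contain no space.
entries = ["blackboard", "black board", "-ing"]
no_space = [e for e in entries if re.search(r"^[^ ]+$", e)]

# ^[^-] drops the entries that begin with a hyphen.
no_hyphen_start = [e for e in entries if re.search(r"^[^-]", e)]

# ot\d{0,1}$ finds headwords ending in "ot", with or without a homograph number.
headwords = ["pot1", "pot2", "dot", "pots"]
ends_in_ot = [h for h in headwords if re.search(r"ot\d{0,1}$", h)]

# Captured content can be reused in the replacement text.
# FieldWorks writes $1 and $2; Python's re.sub writes \1 and \2.
swapped = re.sub(r"(\w+)\s(\w+)", r"\2 \1", "black board")
```

Note that "bound root" passes the ^(?!root) filter because the caret anchors the test to the beginning of the item.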

Tip

Some of the above regular expressions are available for selection from the assistance button in the Filter for dialog box.
FieldWorks automatically adds the forward slashes before and after the regular expression so you do not need to type them manually. For example, if you type \s into the Filter for items containing dialog box, you will see /\s/ in the column (below heading) where you set the filter.
For the \N{ } regular expression, you can use the Character Map to help identify the names of characters. The name typically appears at the bottom of the Character Map dialog box.
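
If the Character Map is not at hand, character names can also be looked up with a short script. The following is a minimal sketch in Python (version 3.8 or later is assumed for \N{ } support in patterns); Python is not part of FieldWorks, and the sample text is invented for illustration.

```python
import re
import unicodedata

# Look up the Unicode names of two characters.
hyphen_name = unicodedata.name("-")
dash_name = unicodedata.name("\u2014")

# Use a character name in a pattern, as in the \N{ } row of the table.
# The sample string contains one em dash and one hyphen-minus.
sample = "one\u2014two-three"
em_dashes = re.findall(r"\N{EM DASH}", sample)
hyphens = re.findall(r"\N{HYPHEN-MINUS}", sample)
```

Here unicodedata.name returns the name in capital letters, such as HYPHEN-MINUS.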
