Use Character | To | Examples |
---|---|---|
^ (caret) | Match at the beginning of a string. | In "abc", ^a matches the "a", but ^b finds no match. |
^(?!root) | Match any entry that does not begin with "root". | In a Morph Type column, use this to exclude roots. Use this syntax in any column, replacing "root" with the appropriate word or characters. |
[^x] | Match any character except "x". | ^[^ ]+$ matches any entry except those that contain a space. |
^[^x] | Match any string except strings that begin with "x". | ^[^-] excludes all entries that begin with a hyphen, such as suffixes. |
$ (dollar) | Match at the end of a string. | |
. (dot) | Match any single character, without regard to what that character is. | In "This is a nice island", .is matches "his" in "This", and " is" (including the preceding space) at the word "is" and at the start of "island". (The dot character can get you in trouble, so use it sparingly and with caution. The dot matches a single Unicode code point, but a character such as á may be stored as two code points: a base letter plus a combining accent.) |
\| (bar) | Allow alternation. | oy\|ay\|ey\|aw finds any occurrence of the diphthongs oy, ay, ey, or aw. |
* (asterisk) | Match 0 or more times, as many as possible. | .* matches any number of characters, as many as possible. |
+ (plus) | Match 1 or more times, as many as possible. | |
? (question mark) | Match zero or one time, preferably one; that is, make the preceding item optional. | colou?r matches both "color" and "colour". |
\ (backslash) | Quote the following character to make it literal. | Characters that must be quoted to be treated as literals are * ? + [ ( ) { } ^ $ \| \ . / Use \. to find a period, or \? to find a question mark. |
{n} | Match exactly n times. | |
[pattern] | Match any one character from the set in brackets. | c[aeiou]t matches cat, cet, cit, cot, and cut. |
\b | Perform a "whole words only" search, as in \bword\b. | |
\B | Match every position where \b does not. | In "This is a nice island." \Bis matches only the "is" in "This", and is\B matches only the "is" in "island". |
\d | Match any digit from 0 to 9 and more (see metacharacters table). | In "1.2", \d matches both the 1 and the 2. (In Find/Replace, both may be replaced.) |
\d{0,1}$ | Ignore homograph numbers. | When you sort and filter on letters at the end of headwords, \d{0,1}$ helps make sure that the possible presence of a homograph number does not affect the results. Type the letters you want to find ahead of the backslash. |
\N{unicode character name} | Match the named character. (See "Tip" below.) | |
\p{unicode property name} | Match any character with the specified Unicode property. | |
\P{unicode property name} | Match any character without the specified Unicode property. | |
\s | Match a "whitespace character". | |
\w | Match word characters, as compared to non-word characters such as spaces. | \w matches every letter and digit in the string, but not the spaces. |
\W | Match non-word characters, such as spaces. | \W matches the spaces in the string. Use it to replace spaces with a period in a gloss. |
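Several of the examples in the table can be tried outside FieldWorks. The following is a minimal sketch that uses Python's built-in re module as a stand-in; it is an assumption that this module behaves like the FieldWorks engine for these simple patterns, and the sample strings other than those quoted in the table are made up for illustration.

```python
import re

sentence = "This is a nice island."

# ^ anchors the match to the beginning of the string.
caret_a = re.search(r"^a", "abc")   # finds the "a"
caret_b = re.search(r"^b", "abc")   # None: "b" is not at the beginning

# . matches any single character, so .is also picks up a preceding space.
dot_hits = re.findall(r".is", sentence)

# | separates alternatives.
diphthongs = re.findall(r"oy|ay|ey|aw", "boy say they saw")

# ? makes the preceding item optional.
spellings = re.findall(r"colou?r", "color and colour")

# [pattern] matches any one character from the set in brackets.
c_vowel_t = re.findall(r"c[aeiou]t", "cat cet cit cot cut cxt")

# \b is a word boundary; \B is every position that is not one.
whole_word = [m.start() for m in re.finditer(r"\bis\b", sentence)]
no_boundary_before = [m.start() for m in re.finditer(r"\Bis", sentence)]
no_boundary_after = [m.start() for m in re.finditer(r"is\B", sentence)]

# \d matches each digit separately, so a replace changes both digits.
digits_replaced = re.sub(r"\d", "#", "1.2")

# \W matches non-word characters; here each space becomes a period.
gloss = re.sub(r"\W", ".", "to be able")

print(dot_hits)            # ['his', ' is', ' is']
print(whole_word)          # [5]   the standalone word "is"
print(no_boundary_before)  # [2]   the "is" inside "This"
print(no_boundary_after)   # [15]  the "is" that starts "island"
print(digits_replaced)     # #.#
print(gloss)               # to.be.able
```

The numbers printed for the \b and \B searches are character positions in the sentence, counted from zero.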
Some of the regular expressions above can be selected from the assistance button in the Filter for dialog box.
FieldWorks automatically adds the forward slashes before and after the regular expression, so you do not need to type them manually. For example, if you type \s into the Filter for items containing dialog box, you will see /\s/ in the column (below the heading) where you set the filter.
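As a rough model of what a regular-expression filter on a column does, the sketch below keeps every entry in which the pattern is found. It uses Python's re module rather than FieldWorks itself. The helper filter_entries, the sample column values, and the assumption that a homograph number appears as a single trailing digit are all hypothetical and serve only to illustrate the exclusion patterns from the table.

```python
import re

def filter_entries(entries, pattern):
    """Keep the entries in which the pattern is found anywhere."""
    compiled = re.compile(pattern)
    return [entry for entry in entries if compiled.search(entry)]

# Hypothetical column values, invented for this illustration.
morph_types = ["root", "prefix", "suffix", "bound root", "stem"]
lexemes = ["-ing", "un-", "hot dog", "cat", "xylem"]
headwords = ["sing1", "ring", "singer", "bring2"]

# ^(?!root) keeps entries that do not begin with "root".
not_root = filter_entries(morph_types, r"^(?!root)")

# ^[^ ]+$ keeps entries that contain no space.
no_space = filter_entries(lexemes, r"^[^ ]+$")

# ^[^-] keeps entries that do not begin with a hyphen.
no_leading_hyphen = filter_entries(lexemes, r"^[^-]")

# ing\d{0,1}$ keeps headwords ending in "ing", with or without
# one trailing digit standing in for a homograph number.
ends_in_ing = filter_entries(headwords, r"ing\d{0,1}$")

print(not_root)           # ['prefix', 'suffix', 'bound root', 'stem']
print(no_space)           # ['-ing', 'un-', 'cat', 'xylem']
print(no_leading_hyphen)  # ['un-', 'hot dog', 'cat', 'xylem']
print(ends_in_ing)        # ['sing1', 'ring', 'bring2']
```

Note that "bound root" passes the first filter: the pattern only rejects entries that begin with "root", not entries that contain it.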
Tip: For the \N{ } regular expression, you can use the Character Map to help identify the names of characters. The name typically appears at the bottom of the Character Map dialog box.
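If the Character Map is not at hand, a character's Unicode name can also be looked up programmatically. The sketch below uses Python's unicodedata and re modules (the \N{ } escape inside a pattern needs Python 3.8 or later) as a stand-in for FieldWorks, and the sample words are invented for illustration. It also shows why the dot needs care with accented letters: the same visible character can be stored as one code point or as two.

```python
import re
import unicodedata

precomposed = "\u00e1"                                  # á as one code point
decomposed = unicodedata.normalize("NFD", precomposed)  # "a" + combining accent

# Look up the official names, which is what \N{...} expects.
name = unicodedata.name(precomposed)
part_names = [unicodedata.name(ch) for ch in decomposed]

# The dot matches exactly one code point.
one_dot_precomposed = re.fullmatch(r".", precomposed)   # a match
one_dot_decomposed = re.fullmatch(r".", decomposed)     # None: two code points
two_dots_decomposed = re.fullmatch(r"..", decomposed)   # a match

# \N{...} matches the named code point only.
named = re.compile(r"\N{LATIN SMALL LETTER A WITH ACUTE}")
found_in_precomposed = named.search("m\u00e1s")         # a match
found_in_decomposed = named.search("ma\u0301s")         # None

print(name)             # LATIN SMALL LETTER A WITH ACUTE
print(part_names)       # ['LATIN SMALL LETTER A', 'COMBINING ACUTE ACCENT']
print(len(decomposed))  # 2
```

Python's built-in re module does not support \p{ } or \P{ }, so those two expressions are left out of this sketch.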
Examples of combinations of regular expressions
Find and Replace using regular expressions (Example)
Regular expressions metacharacters table