Tuesday, July 8, 2008

Regular Expression Groups in .NET

Regular expressions are a great way to search and replace text. A while ago I stumbled upon a feature of regular expressions in the .NET Framework: named groups. At the time of writing, the JavaScript regex engines in browsers do not support named groups, but they can be used in back-end .NET code. Breaking an expression into named groups makes it easier to pull individual fields out of a line of text. Here is a line of text from a standard FTP directory listing:

string directoryLine = "drwxr-xr-- dds grp 0 Feb 23 2002 data";

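Here is a regular expression that we can use to parse the FTP directory line. The link-count column is optional and the timestamp may end in either a time (HH:MM) or a four-digit year, so the pattern matches the line above as well as listings that include a link count:

string mask = @"^(?<dir>[\-d])(?<permission>[\-rwxt]+)\s+(?:\d+\s+)?\w+\s+\w+\s+(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+(?:\d{1,2}:\d{2}|\d{4}))\s+(?<name>.+)";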

A named group is written as (?<name>...): a question mark directly after the opening parenthesis, then the group name in angle brackets, then the sub-pattern to capture.
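
To see the syntax in isolation, here is a minimal sketch that uses a made-up key=value pattern rather than the FTP one. The class name NamedGroupDemo, the field Pair, and the helper Describe are hypothetical names chosen for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical pattern for illustration: a word, an equals sign, then the rest of the line.
    public static readonly Regex Pair = new Regex(@"^(?<key>\w+)=(?<value>.*)$");

    // Returns "key -> value" when the input matches, or null when it does not.
    public static string Describe(string input)
    {
        Match m = Pair.Match(input);
        if (!m.Success)
        {
            return null;
        }

        // Each named group is read through the Groups indexer by its name.
        // Groups[0] would hold the entire matched text.
        return m.Groups["key"].Value + " -> " + m.Groups["value"].Value;
    }
}
```

Calling NamedGroupDemo.Describe("color=blue") should give "color -> blue", and an input without an equals sign gives null.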

Some example code to pull out the groups:

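Regex regEx = new Regex(mask);
Match match = regEx.Match(directoryLine);

if (match.Success)
{
    // Read a captured group by its name.
    string fileName = match.Groups["name"].Value;

    // The "dir" group holds "d" for a directory and "-" for a regular file.
    if (match.Groups["dir"].Value == "d")
    {
        // Do something with the directory
    }
}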
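
The same idea can be wrapped in a small helper that returns typed values instead of raw strings. This is only a sketch under a few assumptions: the class FtpListing and the method TryParse are hypothetical names, and the pattern is a loosened variant of the mask above that treats the link-count column as optional and accepts either HH:MM or a four-digit year at the end of the timestamp.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FtpListing
{
    // Assumption: loosened variant of the mask above (optional link count,
    // timestamp ending in HH:MM or a four-digit year).
    private static readonly Regex LineMask = new Regex(
        @"^(?<dir>[\-d])(?<permission>[\-rwxt]+)\s+(?:\d+\s+)?\w+\s+\w+\s+" +
        @"(?<size>\d+)\s+(?<timestamp>\w+\s+\d+\s+(?:\d{1,2}:\d{2}|\d{4}))\s+(?<name>.+)$");

    // Hypothetical helper: returns false when the line does not look like a listing entry.
    public static bool TryParse(string line, out string name, out long size, out bool isDirectory)
    {
        name = null;
        size = 0;
        isDirectory = false;

        Match match = LineMask.Match(line);
        if (!match.Success)
        {
            return false;
        }

        name = match.Groups["name"].Value;
        // The "size" group only captures digits, so parsing fails only on overflow.
        size = long.Parse(match.Groups["size"].Value);
        isDirectory = match.Groups["dir"].Value == "d";
        return true;
    }
}
```

For the sample line, FtpListing.TryParse(directoryLine, out name, out size, out isDirectory) should return true with name set to "data", size set to 0, and isDirectory set to true. Returning a bool keeps the calling code free of null checks when a line, such as a "total" header, does not match.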
