Cambia Research - Supporting the Microsoft .NET Developer Community

Updated: 03:01 AM CT Jan 10, 2007
Posted: 10:44 PM CT Jan 09, 2007

Parsing URLs with Regular Expressions and the Regex Object

And the Anatomy of a URI (Uniform Resource Identifier)

Author: Steve Lautenschlager

Categories: Snippet, Regular Expressions, Text and Strings, C#, .NET, Browsers, Web

 Example: Regular Expressions for Parsing URIs and URLs

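OK, we're finally here. The following method can be copied into the code-behind file of your aspx page. Make sure the page has a Label named lblOutput and that the code-behind file imports the System.Text.RegularExpressions namespace, then call the TestParseURL method (for example, from Page_Load).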

Example: Parse a URL with C# Regex

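// Requires "using System.Text.RegularExpressions;" at the top of the file.
public void TestParseURL()
{
   string url = "http://www.cambiaresearch.com"
      + "/Cambia3/snippets/csharp/regex/uri_regex.aspx?id=17#authority";

   // Every component is optional. Scheme, authority, query and fragment
   // each get two named groups: the "1" group includes the delimiter,
   // the "0" group does not. The path has a single group, p0.
   string regexPattern = @"^(?<s1>(?<s0>[^:/\?#]+):)?(?<a1>"
      + @"//(?<a0>[^/\?#]*))?(?<p0>[^\?#]*)"
      + @"(?<q1>\?(?<q0>[^#]*))?"
      + @"(?<f1>#(?<f0>.*))?";

   // ExplicitCapture: only the named groups are captured.
   Regex re = new Regex(regexPattern, RegexOptions.ExplicitCapture);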
   Match m = re.Match(url);

   lblOutput.Text = "<b>URL: " + url + "</b><p>";

   lblOutput.Text +=
      m.Groups["s0"].Value + "  (Scheme without colon)<br>"; 
   lblOutput.Text +=
      m.Groups["s1"].Value + "  (Scheme with colon)<br>"; 
   lblOutput.Text +=  
      m.Groups["a0"].Value + "  (Authority without //)<br>"; 
   lblOutput.Text +=  
      m.Groups["a1"].Value + "  (Authority with //)<br>"; 
   lblOutput.Text +=  
      m.Groups["p0"].Value + "  (Path)<br>"; 
   lblOutput.Text +=  
      m.Groups["q0"].Value + "  (Query without ?)<br>"; 
   lblOutput.Text +=  
      m.Groups["q1"].Value + "  (Query with ?)<br>"; 
   lblOutput.Text +=  
      m.Groups["f0"].Value + "  (Fragment without #)<br>"; 
   lblOutput.Text += 
      m.Groups["f1"].Value + "  (Fragment with #)<br>"; 


}
The following is the output you should see on your aspx page when you run the above method.

Example: Output

URL: http://www.cambiaresearch.com/Cambia3/snippets/csharp/regex/uri_regex.aspx?id=17#authority

http (Scheme without colon)
http: (Scheme with colon)
www.cambiaresearch.com (Authority without //)
//www.cambiaresearch.com (Authority with //)
/Cambia3/snippets/csharp/regex/uri_regex.aspx (Path)
id=17 (Query without ?)
?id=17 (Query with ?)
authority (Fragment without #)
#authority (Fragment with #)
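
If you would rather try the pattern outside an aspx page, the following is a minimal console sketch using the same regular expression. The class name UriRegexDemo, the helper Part and the second (relative) test input are assumptions made for illustration; they are not part of the original snippet. Because every component in the pattern is optional, the sketch checks Group.Success to tell a component that is missing from one that is merely empty.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UriRegexDemo
{
    // Same pattern as in TestParseURL above.
    public static readonly Regex UriPattern = new Regex(
        @"^(?<s1>(?<s0>[^:/\?#]+):)?(?<a1>//(?<a0>[^/\?#]*))?(?<p0>[^\?#]*)"
        + @"(?<q1>\?(?<q0>[^#]*))?(?<f1>#(?<f0>.*))?",
        RegexOptions.ExplicitCapture);

    // Hypothetical helper: returns the captured text, or "(absent)" when
    // the optional group did not take part in the match.
    public static string Part(Match m, string groupName)
    {
        Group g = m.Groups[groupName];
        return g.Success ? g.Value : "(absent)";
    }

    public static void Main()
    {
        string[] inputs =
        {
            // The URL used in the article.
            "http://www.cambiaresearch.com/Cambia3/snippets/csharp/regex/uri_regex.aspx?id=17#authority",
            // Assumed extra input: a relative reference with no scheme,
            // authority or fragment.
            "/Cambia3/snippets/csharp/regex/uri_regex.aspx?id=17"
        };

        foreach (string input in inputs)
        {
            Match m = UriPattern.Match(input);
            Console.WriteLine("Input:     " + input);
            Console.WriteLine("Scheme:    " + Part(m, "s0"));
            Console.WriteLine("Authority: " + Part(m, "a0"));
            Console.WriteLine("Path:      " + Part(m, "p0"));
            Console.WriteLine("Query:     " + Part(m, "q0"));
            Console.WriteLine("Fragment:  " + Part(m, "f0"));
            Console.WriteLine();
        }
    }
}
```

For the second input, the scheme, authority and fragment are reported as absent, while the path and query are still captured. Reading m.Groups["s0"].Value directly, as the aspx version does, would return an empty string in that case, which is fine for display but cannot distinguish "no query" from an empty query such as a trailing "?".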
Steve Lautenschlager (steve)
Steve founded Cambia Research to augment his .NET hobby. Developing and maintaining the site combines his interests in technology, writing and education.


 
Copyright © Cambia Research 2002-2007. All Rights Reserved. steve [ at ] cambiaresearch.com