I then reached the point where I had raw HTML in a string that I needed to parse. After a quick search I found CsQuery (available as a NuGet package), an open source jQuery port for .NET. I was able to extract the data I required from the HTML using familiar jQuery-style selectors. The example snippet below shows how easy CsQuery is to use.
// Requires "using System;" and "using System.Text;", plus the CsQuery NuGet package
var html = new StringBuilder();
html.Append("<html><body>");
html.Append("<h1>Hello, world!</h1>");
html.Append("<p class='intro'>This program is using CsQuery.</p>");
html.Append("<p id='author'>CsQuery is a library written by James Treworgy.</p>");
html.Append("</body></html>");
var dom = CsQuery.CQ.Create(html.ToString());
// Get the inner text of an element by element name selector
Console.WriteLine(dom["h1"].Text());
// Get the inner text of an element by class name selector
Console.WriteLine(dom[".intro"].Text());
// Get the inner text of an element by id selector
Console.WriteLine(dom["#author"].Text());
// Add a class to an element
dom["h1"].AddClass("title");
// Update the title text using new class in selector
dom[".title"].Text("CsQuery - a jQuery port for .NET");
// Now retrieve the new title by a class selector
Console.WriteLine(dom[".title"].Text());
// Pause console
Console.ReadLine();
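The snippet above only reads single elements. In practice a selector often matches several elements, so here is a minimal sketch of looping over a selection and reading an attribute from each match. The markup, the example.com URLs and the LinkLister class name are made up for illustration, and I am assuming that the IDomObject members InnerText and GetAttribute behave like their browser DOM counterparts.

```csharp
using System;
using CsQuery;

class LinkLister
{
    static void Main()
    {
        // Hypothetical markup, used only for illustration
        var html = "<ul>" +
                   "<li><a href='https://example.com/one'>First</a></li>" +
                   "<li><a href='https://example.com/two'>Second</a></li>" +
                   "</ul>";

        CQ dom = CQ.Create(html);

        // A selector can match several elements; iterating the result
        // yields one IDomObject per matched element
        foreach (IDomObject link in dom["li a"])
        {
            // Assumption: InnerText returns the element's text and
            // GetAttribute returns the raw attribute value
            Console.WriteLine("{0} -> {1}", link.InnerText, link.GetAttribute("href"));
        }

        // Attr() follows the jQuery convention of reading the attribute
        // from the first matched element
        Console.WriteLine(dom["a"].Attr("href"));
    }
}
```

As in jQuery, methods such as Text() and Attr() act on the whole selection, while the foreach loop lets you work with each matched element individually.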
The example source code is available as a C# console application project on GitHub: https://github.com/rsingh85/CsQueryExample.