HTML manipulation library and parser for .NET
Acrux's Advanced Html Parser is a library for parsing and manipulating real-world, malformed HTML. It supports running XPath queries and XSLT transformations against the loaded document. Whether you want to do web scanning, web scraping, or web data mining, or just alter the content of pages in your content management system, the Advanced Html Parser is what you need.
CURRENT VERSION: 1.0.0.219
The trial version contains complete documentation and code samples in C# and VB.NET for:
- Saving a complete local copy of an HTML page by downloading images and scripts locally and modifying the page content to refer to the local files.
- Enumerating available web forms and their input controls in a web page, setting values of the inputs and submitting the form.
- Using XPath 2.0 expressions.
........................................................................................
// Load a document from the internet and save it as XML
Acrux.Html.HtmlDocument htmlDoc = new Acrux.Html.HtmlDocument();
htmlDoc.Load(new Uri("http://www.acruxsoftware.net/"));
htmlDoc.SaveXml(@"C:\HtmlAsXml.xml");
........................................................................................
// Run complex XPath queries against the loaded HTML document
foreach (HtmlNode node in htmlDoc.SelectNodes("//a[matches(string(./@href), 'mailto')]"))
{
    Console.WriteLine(node.Attributes["href"].Value);
}
........................................................................................
// Iterate through all elements in the HEAD
foreach (HtmlNode node in htmlDoc.HeadElement.ChildNodes)
{
    Console.WriteLine(node.OuterXml);
}
........................................................................................
// Update attribute values and save as HTML without causing any mark-up re-formatting
foreach (HtmlNode imgNode in htmlDoc.SelectNodes("//img"))
{
    // Use the loop variable imgNode; RemapUrlToLocalPath is a user-supplied helper
    imgNode.Attributes["src"].Value = RemapUrlToLocalPath(imgNode.Attributes["src"].Value);
}
htmlDoc.Save(@"C:\altered.html");
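
The RemapUrlToLocalPath helper used above is not defined in the snippet, so here is one possible sketch of it. It relies only on standard .NET classes (System.Uri and System.IO.Path) and is not part of the Acrux library. The class name LocalCopyHelper, the PageUri value (taken from the first sample) and the "images" folder name are assumptions made for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper class; not part of the Acrux library.
static class LocalCopyHelper
{
    // Assumed: the address of the page the document was loaded from.
    static readonly Uri PageUri = new Uri("http://www.acruxsoftware.net/");

    // Assumed: folder, relative to the saved page, that holds downloaded files.
    const string LocalFolder = "images";

    public static string RemapUrlToLocalPath(string url)
    {
        // Resolve relative URLs against the page address; absolute URLs pass through.
        Uri absolute = new Uri(PageUri, url);

        // AbsolutePath excludes the query string and fragment, so only the file name remains.
        string fileName = Path.GetFileName(absolute.AbsolutePath);
        if (fileName.Length == 0)
        {
            fileName = "index"; // URL ended in "/", so fall back to a fixed name
        }

        // Forward slash keeps the result usable as a relative src attribute value.
        return LocalFolder + "/" + fileName;
    }
}
```

With these assumptions, RemapUrlToLocalPath("/img/logo.png?v=2") returns "images/logo.png". Files from different folders that share a name would map to the same local path, so a real implementation would need a collision strategy.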
........................................................................................