This article shows how to use the Windows Forms WebBrowser control to retrieve the content of a web page whose data is loaded dynamically through Ajax.
Straight to the code (it is fairly rough; optimize it as needed):
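Creating a WebBrowser directly, for example from a console program, fails with a threading error, because the control can only be instantiated on a thread that is in a single-threaded apartment (STA). My fix is to put the [STAThread] attribute on the entry point, which sets the thread's apartment model to single-threaded apartment: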
[STAThread] static void Main(string[] args)
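When the entry point cannot be marked with [STAThread] (for instance, when the crawler is called from a worker thread or from a library), one alternative is to run the WebBrowser work on a dedicated thread whose apartment state is set explicitly. The following is a minimal sketch of that idea, not part of the original code: `StaRunner` and `Run` are hypothetical names introduced here for illustration, while `Thread.SetApartmentState` and `ApartmentState.STA` are standard .NET APIs (apartment states are only supported on Windows).

```csharp
using System;
using System.Threading;

public static class StaRunner
{
    // Hypothetical helper (not from the original article): runs func on a
    // dedicated STA thread and hands its result back to the caller.
    public static T Run<T>(Func<T> func)
    {
        T result = default(T);
        Exception error = null;

        var thread = new Thread(() =>
        {
            try { result = func(); }
            catch (Exception ex) { error = ex; }
        });

        // Must be called before Start(); the apartment state cannot be changed later.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
        thread.Join(); // block the caller until the work has finished

        if (error != null)
        {
            throw new InvalidOperationException("STA work item failed.", error);
        }
        return result;
    }
}
```

Under these assumptions, a call such as `string html = StaRunner.Run(() => HttpHelper.DownloadAjaxHtml(url));` would give the control the apartment it needs without touching `Main`.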
using System;
using System.Collections.Generic;
using System.Collections.Specialized;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Windows.Forms;
using HtmlAgilityPack; // NuGet package; provides HtmlNode and HtmlNodeCollection

namespace Crawlertest
{
    public class HttpHelper
    {
        /// <summary>
        /// Downloads the HTML of a page after its Ajax content has loaded.
        /// </summary>
        /// <param name="url">Address of the page to load.</param>
        /// <returns>The outer HTML of the body, or null on failure.</returns>
        public static string DownloadAjaxHtml(string url)
        {
            string htmlstr = null;
            try
            {
                WebBrowser wb = new WebBrowser();
                wb.AllowNavigation = true;
                wb.ScriptErrorsSuppressed = true;
                int hitCount = 1;
                wb.Navigating += (sender, e) => { hitCount++; };
                wb.DocumentCompleted += (sender, e) => { hitCount++; };
                wb.Navigate(url);
                DateTime dtime = DateTime.Now;
                double timespan = 0;
                // Pump messages for at least 3 seconds and until the document is complete.
                // Note: there is no upper time limit, so a page that never completes blocks here.
                while (timespan <= 3 || wb.ReadyState != WebBrowserReadyState.Complete)
                {
                    Application.DoEvents();
                    DateTime time2 = DateTime.Now;
                    timespan = (time2 - dtime).TotalSeconds;
                }
                if (wb.ReadyState == WebBrowserReadyState.Complete)
                {
                    htmlstr = wb.Document.Body.OuterHtml;
                    // Decode; requires a reference to System.Web
                    htmlstr = System.Web.HttpUtility.UrlDecode(htmlstr);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine($"DownloadAjaxHtml-Error:{ex}");
            }
            return htmlstr;
        }

        // Fetch the HTML, then extract the wanted content from it.
        // NewsHotTitle, RemoveHtml and ToJson are not shown in this article.
        public static List<NewsHotTitle> GetHotTitle(Encoding encoding)
        {
            var list = new List<NewsHotTitle>();
            var url = "http://www.news.cn/2021homepro/rsznb/";
            string strHtml = HttpHelper.DownloadAjaxHtml(url);
            if (string.IsNullOrEmpty(strHtml))
            {
                Console.WriteLine("Failed to fetch data");
                return list; // nothing to parse
            }
            // Fully qualified: System.Windows.Forms also defines an HtmlDocument type.
            HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
            doc.LoadHtml(strHtml);
            HtmlNode rootnode = doc.DocumentNode;
            HtmlNodeCollection hotlist = rootnode.SelectNodes("//ul[@class='htList']//li");
            if (hotlist == null || !hotlist.Any())
            {
                Console.WriteLine("Failed to fetch data");
                return list; // avoid iterating a null collection
            }
            foreach (HtmlNode item in hotlist)
            {
                NewsHotTitle model = new NewsHotTitle();
                model.Title = HttpHelper.RemoveHtml(item.InnerHtml);
                model.PublishTime = DateTime.Now;
                Console.WriteLine($"{model.ToJson()}");
                list.Add(model); // the original never added the item to the result
            }
            return list;
        }
    }
}
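The wait loop in `DownloadAjaxHtml` keeps pumping messages until the control reports `WebBrowserReadyState.Complete`, with no upper bound, so a page that never finishes loading would block forever. One possible refinement is to bound the wait. The sketch below is an assumption-laden illustration rather than the author's method: `BrowserWait`, `PumpUntil`, and the `minWait`/`maxWait` parameters are hypothetical names, and the 10 ms sleep is an arbitrary choice to avoid a busy loop.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;

public static class BrowserWait
{
    // Hypothetical helper: pumps the message queue until isReady() returns true
    // and at least minWait has elapsed, or gives up once maxWait has elapsed.
    // Returns true when the ready condition was met, false on timeout.
    public static bool PumpUntil(Func<bool> isReady, TimeSpan minWait, TimeSpan maxWait)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (clock.Elapsed < maxWait)
        {
            Application.DoEvents(); // let the control process navigation events

            if (clock.Elapsed >= minWait && isReady())
            {
                return true;
            }

            Thread.Sleep(10); // assumption: a short pause to avoid spinning the CPU
        }
        return false;
    }
}
```

Inside `DownloadAjaxHtml`, the `while` loop could then be replaced by something like `bool ok = BrowserWait.PumpUntil(() => wb.ReadyState == WebBrowserReadyState.Complete, TimeSpan.FromSeconds(3), TimeSpan.FromSeconds(30));`, where the 3-second minimum mirrors the original loop and the 30-second ceiling is an arbitrary value chosen for this example.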