Use HTTPWebRequest to get remote page's title

I have a web service that acts as an interface between a farm of websites and some analytics software. Part of the analytics tracking requires harvesting the page title. Rather than passing it from the webpage to the web service, I would like to use HttpWebRequest to request the page.

I have code that will get the entire page and parse the HTML to grab the title tag, but I don't want to download the entire page just to get information that's in the head.

I've started with

HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("url");  
request.Method = "HEAD";

Tags: c# asp.net http httpwebrequest

4 answers

等我变得足够好

#2 · 2019-05-31 18:58

Great idea, but a HEAD request only returns the document's HTTP headers. This does not include the title element, which is part of the HTTP message body.


我想做一个坏孩纸

#3 · 2019-05-31 19:01

So I would have to go with something like...

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URL);
// using blocks dispose the response, stream, and reader even if an exception is thrown.
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
using (Stream st = resp.GetResponseStream())
using (StreamReader sr = new StreamReader(st))
{
    string buffer = sr.ReadToEnd();
    // Assumes a bare <title> tag with no attributes.
    int startPos = buffer.IndexOf("<title>", StringComparison.OrdinalIgnoreCase);
    int endPos = buffer.IndexOf("</title>", StringComparison.OrdinalIgnoreCase);
    // IndexOf returns -1 when the tag is missing, so guard before slicing.
    string title = (startPos >= 0 && endPos > startPos)
        ? buffer.Substring(startPos + 7, endPos - startPos - 7)
        : string.Empty;
    Console.WriteLine("Response code from {0}: {1}", URL, resp.StatusCode);
    Console.WriteLine("Page title: {0}", title);
}


贪生不怕死

#4 · 2019-05-31 19:04

Try this:

using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            string page = @"http://stackoverflow.com/";
            HttpWebRequest req = (HttpWebRequest)WebRequest.Create(page);

            using (WebResponse resp = req.GetResponse())
            using (StreamReader SR = new StreamReader(resp.GetResponseStream()))
            {
                // Accumulate chunks so a title split across two reads still matches.
                StringBuilder seen = new StringBuilder();
                Char[] buf = new Char[256];
                int count;
                while ((count = SR.Read(buf, 0, buf.Length)) > 0)
                {
                    seen.Append(buf, 0, count);
                    Match match = Regex.Match(seen.ToString(), @"<title\b[^>]*>([^<]+)</title>", RegexOptions.IgnoreCase);
                    if (match.Success)
                    {
                        Console.WriteLine(match.Groups[1].Value.Trim());
                        break; // stop reading once the title is found
                    }
                }
            }
        }
    }
}


Juvenile、少年°

#5 · 2019-05-31 19:06

If you don't want to request the entire page, you can request it in pieces. The HTTP specification defines a request header called Range. You would use it like this:

Range: bytes=0-100

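You can look through the returned content and find the title. If it is not there, then request Range: bytes=101-200 and so on until you get what you need.

Obviously, the web server needs to support Range, so this may be hit or miss. A server that does not support it is allowed to ignore the header and send the whole page with a 200 status.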
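
As a rough sketch of the chunked approach (not a definitive implementation), something like the following could work. The class name RangedTitleFetcher, the method names, and the chunk sizes are made up for illustration; the only framework call it relies on for the header is HttpWebRequest.AddRange. It assumes the page is UTF-8 and that the title element is well formed.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

public static class RangedTitleFetcher
{
    // Matches a complete title element; Singleline lets the title span line breaks.
    private static readonly Regex TitlePattern = new Regex(
        @"<title\b[^>]*>(.*?)</title>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Requests one inclusive byte range of the page.
    // Assumption: the body is UTF-8 text.
    public static string FetchRange(string url, int first, int last)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
        req.AddRange(first, last); // sends "Range: bytes=first-last"
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        using (StreamReader sr = new StreamReader(resp.GetResponseStream(), Encoding.UTF8))
        {
            return sr.ReadToEnd();
        }
    }

    // Pulls chunks through the supplied fetch delegate until the title is found,
    // a chunk comes back empty, or maxBytes have been requested.
    // Returns null when no title is found.
    public static string FindTitle(Func<int, int, string> fetch, int chunkSize, int maxBytes)
    {
        StringBuilder seen = new StringBuilder();
        for (int offset = 0; offset < maxBytes; offset += chunkSize)
        {
            string chunk = fetch(offset, offset + chunkSize - 1);
            if (string.IsNullOrEmpty(chunk)) break;
            seen.Append(chunk); // keep earlier chunks so a split title still matches
            Match m = TitlePattern.Match(seen.ToString());
            if (m.Success) return m.Groups[1].Value.Trim();
        }
        return null;
    }
}
```

It could be called as RangedTitleFetcher.FindTitle((first, last) => RangedTitleFetcher.FetchRange(url, first, last), 512, 16384). Passing the fetch step in as a delegate keeps the loop testable without a network. Two caveats: a request for a range past the end of the document typically fails with a 416 status, which GetResponse surfaces as a WebException, and a multi-byte character split across two ranges would decode incorrectly, so real code would want to handle both.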
