Make a “Web-Proxy” - Step By Step

Posted 2019-09-21 07:39

Question:

Is there any way to show another page inside my own page? I cannot use frames, because a frame opens that page directly. I want to copy the whole page, save it into a new file, and then show that file to the user. I think it is better to do this with simple URL encryption, because I don't want to show the real page address. For example, I want to use the URL below instead of yahoo.com: www.myDomain.com/Open.aspx?url=zbipp_dpn ... I know how to read, encrypt and decrypt the URL, but I don't know how to copy that page into a new page or how to show it.

EDIT: I don't even know how to start researching this, or what I should be looking for. That is why I am asking the experts here. I need a keyword to start my research!

Answer 1:

It sounds like you are trying to set up a proxy.

You could do the following:

  • Listen for requests using an HTTP handler. This can be an MVC controller, a web form (ASPX), an instance of IHttpHandler, even a raw TCP server.

  • Use the encrypted URL to determine the target website.

  • Make a request from your website to the other website. There are multiple ways to do this in .NET, including the HttpClient class.

  • Convert the response to a string.

  • (Optional) parse links in the content to point to your proxy website address instead of the real address.

  • Return the string to the caller as the body of the response. As far as their browser knows, it is the page they requested.
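
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ProxyFlowSketch`, `DecodeToken` and `HandleRequest` are hypothetical names, the "encrypted URL" is stood in for by plain Base64 (which is not encryption), and the network call is injected as a delegate so the flow can be read without a live site.

```csharp
using System;
using System.Text;

public static class ProxyFlowSketch
{
    // Hypothetical stand-in for "decrypt the URL": plain Base64, NOT real encryption.
    public static string DecodeToken(string token)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(token));
    }

    // Walks through the steps of the list. fetchPage is injected by the caller;
    // in a real handler it could wrap HttpClient.GetStringAsync.
    public static string HandleRequest(string token, Func<string, string> fetchPage)
    {
        string targetUrl = DecodeToken(token);  // use the token to determine the target website
        string html = fetchPage(targetUrl);     // request the page and read it as a string
        // (Optional) rewriting links so they point back at the proxy would go here.
        return html;                            // return the string as the body of our response
    }
}
```

In an ASP.NET page or handler, the returned string would then be written to the response, which is what the second answer does with Response.Write.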

Disclaimer: While proxies are commonly used, there are potential implications (beyond my non-legal knowledge and advice) to presenting someone else's content under a different URL. In addition, there may be other (perhaps serious) ramifications to circumventing filtered content. Proxied content even with the modified URL may still trigger a filter.



Answer 2:

Well, finally I started creating a web proxy.

I decided to explain my work here for two reasons: 1) For everyone who wants to start a similar project. 2) Most of this code is copied from Stack pages; I've just collected it. ;)

I need experts to correct my mistakes and help me to continue.

Here is what I did:


ASP (Default.aspx):

I added a textbox named "txtURL" where the user enters the web address.

I added a button named "btnRun" to start processing.

For now, these components are enough!


C#:

Clicking "btnRun" redirects the page to "www.domain.com/default.aspx?URL=(xxx)", where xxx is the web page address encrypted by a function.

This is the code for btnRun_Click:

protected void btnRun_Click(object sender, EventArgs e)
    {
        if (txtURL.Text.Length == 0) return;
        // Assume plain HTTP when the user typed no scheme.
        if (!(txtURL.Text.ToLower().StartsWith("http://") || txtURL.Text.ToLower().StartsWith("https://")))
            txtURL.Text = "http://" + txtURL.Text;

        try
        {
            // Reload this page with the encrypted address between parentheses.
            Response.Redirect("Default.aspx?URL=(" + Encrypt(txtURL.Text, mainKey) + ")", false);
        }
        catch (Exception ex)
        {
            ShowPopUpMsg(ex.Message);
        }
    }

I'll explain "Encrypt" and "ShowPopUpMsg" functions later.

By clicking on "btnRun", this page will be refreshed and the encrypted URL will be included in the address.
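
One thing to watch when a Base64 string travels inside an address: it can contain "+", "/" and "=", and query-string parsers commonly decode "+" as a space. A minimal sketch of a URL-safe variant is shown here; `UrlSafeBase64` is a hypothetical helper, not part of the code in this answer, and it would wrap the bytes produced by the encryption step.

```csharp
using System;

public static class UrlSafeBase64
{
    // Encodes bytes as Base64, then swaps the characters that are awkward in URLs.
    public static string Encode(byte[] data)
    {
        return Convert.ToBase64String(data)
            .TrimEnd('=')        // padding can be restored from the length
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses Encode: restores the standard alphabet and the padding.
    public static byte[] Decode(string token)
    {
        string s = token.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }
}
```

With this, the token placed in the address contains only letters, digits, "-" and "_", so it survives both raw reads of the address and normal query-string decoding.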

Now, in "Page_Load", we read the encrypted URL from the address and return early if there is none or if this is a postback:

protected void Page_Load(object sender, EventArgs e)
    {
        string url = Regex.Match(HttpContext.Current.Request.Url.AbsoluteUri, @"\(([^)]*)\)").Groups[1].Value;
        if (url.Length == 0 || Page.IsPostBack) return;

From here on, each snippet is appended to "Page_Load", one after another.
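
To make the regular expression in "Page_Load" easier to follow, here is the same pattern isolated in a small helper. `TokenExtractor` is a hypothetical name used only for this sketch; the pattern itself is the one from the code above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenExtractor
{
    // Returns the text between the first "(" and the next ")".
    // When the address has no parentheses, the match fails and the result is "".
    public static string Extract(string absoluteUri)
    {
        return Regex.Match(absoluteUri, @"\(([^)]*)\)").Groups[1].Value;
    }
}
```

Because a failed match yields an empty string rather than null, the `url.Length == 0` check in "Page_Load" is enough to detect a request without an encrypted address.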

Decrypt the URL and read the remote web page source-code:

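try
        {
            // Unescape the token taken from the address, then decrypt it. The decrypted
            // address is not URL-decoded, because that would corrupt escapes such as %26 in it.
            txtURL.Text = Decrypt(Uri.UnescapeDataString(url), mainKey);
            string TheUrl = txtURL.Text;
            string response = GetHtmlPage(TheUrl);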

I'll explain "Decrypt" and "GetHtmlPage" later.

Now, we have the source-code in "response".

The next step is to find the links in this source code. A link looks like href="xxx", where xxx is the address. We must replace each one with a link that goes through the proxy:

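            // Normalise whitespace before "=" so the replacements below can match.
            response = response.Replace("href =", "href=");
            response = response.Replace("href\n=", "href=");
            response = response.Replace("href\t=", "href=");

            // Parse the HTML we already downloaded instead of fetching the page a second time.
            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(response);
            // SelectNodes returns null when nothing matches, so fall back to an empty list (needs System.Linq).
            foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]") ?? Enumerable.Empty<HtmlNode>())
            {
                char[] c = { ' ', '\"' };
                string s = link.OuterHtml;
                int from = s.IndexOf("href=");
                int to = SearchString(s, from, '\"');
                if (from < 0 || to < 0) continue; // no double-quoted href in this tag

                s = s.Substring(from + 5, to - from - 5);
                s = s.TrimStart(c); // strings are immutable, so the result must be assigned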

"SearchString" is a function that returns the position of the closing quotation mark of "href". I'll explain it later.

There are two kinds of links:

  1. Links that refer to another domain name. These links begin with "http://" or "https://". We find them and replace the address:

                string corrected = "href=\"" + "Default.aspx?URL=(" + Encrypt(s, mainKey) + ")" + "\"";
                if ((s.ToLower().StartsWith("http://") || s.ToLower().StartsWith("https://")))
                    response = response.Replace("href=\"" + s + "\"", corrected);
    
  2. Links that refer to the current domain name. These links usually begin with "/". To replace them, we resolve the link against the address of the current page to get the whole address:

                else
                {
                    // Resolve "/path", "page.html" or "../x" against the page address (this keeps the port too).
                    Uri absolute;
                    if (Uri.TryCreate(new Uri(txtURL.Text), s, out absolute)
                        && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps))
                    {
                        corrected = "href=\"" + "Default.aspx?URL=(" + Encrypt(absolute.AbsoluteUri, mainKey) + ")" + "\"";
                        response = response.Replace("href=\"" + s + "\"", corrected);
                    }
                    // Anything else (mailto:, javascript:, ...) is left untouched.
                }
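
Links on the current site are not always written with a leading "/"; they can also be relative ("page.html", "../x") or use another scheme such as "mailto:". As a hedged sketch, the System.Uri class can resolve all of these against the page address. `LinkResolver` and `Resolve` are hypothetical names for this illustration.

```csharp
using System;

public static class LinkResolver
{
    // Returns the absolute http(s) address that href points to when it appears
    // on pageUrl, or null when the link is not something a web proxy can fetch.
    public static string Resolve(string pageUrl, string href)
    {
        Uri absolute;
        if (!Uri.TryCreate(new Uri(pageUrl), href, out absolute))
            return null; // href could not be parsed at all

        if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
            return null; // e.g. mailto: or javascript: links

        return absolute.AbsoluteUri;
    }
}
```

The returned address is what would be passed to "Encrypt" before building the proxied href; links for which the helper returns null would simply be left as they are.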
    

Now everything is done (as far as I currently know), and we should show the page with the new links and finish "Page_Load":

            }
            Response.Write(response);                
        }
        catch (Exception ex)
        {
            ShowPopUpMsg(ex.Message);
        }
    }

Function to search in a string:

private int SearchString(string mainString, int startLocation, char charToFind)
    {
        // Returns the index of the SECOND occurrence of charToFind at or after
        // startLocation (the closing quote of href="..."), or -1 if there is none.
        if (startLocation < 0 || startLocation >= mainString.Length) return -1;
        int first = mainString.IndexOf(charToFind, startLocation);
        if (first < 0) return -1;
        return mainString.IndexOf(charToFind, first + 1);
    }

Function to read source-code:

private string GetHtmlPage(string URL)
        {
            WebRequest objRequest = WebRequest.Create(URL);
            // Dispose the response as well as the reader so the connection is released.
            using (WebResponse objResponse = objRequest.GetResponse())
            using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
            {
                // StreamReader decodes as UTF-8 by default; pages in other encodings may be garbled.
                return sr.ReadToEnd();
            }
        }

Function to show a popup message:

private void ShowPopUpMsg(string msg)
        {
            StringBuilder sb = new StringBuilder();
            sb.Append("alert('");
            // Escape backslashes first (e.g. file paths in exception messages), then newlines and quotes.
            sb.Append(msg.Replace("\\", "\\\\").Replace("\n", "\\n").Replace("\r", "").Replace("'", "\\'"));
            sb.Append("');");
            ScriptManager.RegisterStartupScript(this.Page, this.GetType(), "showalert", sb.ToString(), true);
        }

Function to decrypt a string:

private string Decrypt(string s, string key)
        {
            try
            {
                byte[] toDecryptArray = Convert.FromBase64String(s);
                // The 16-byte MD5 hash of the key string is used as the TripleDES key.
                byte[] keyArray;
                using (MD5CryptoServiceProvider hashmd5 = new MD5CryptoServiceProvider())
                    keyArray = hashmd5.ComputeHash(Encoding.UTF8.GetBytes(key));
                using (TripleDESCryptoServiceProvider tdes = new TripleDESCryptoServiceProvider())
                {
                    tdes.Key = keyArray;
                    tdes.Mode = CipherMode.ECB;
                    tdes.Padding = PaddingMode.PKCS7;
                    using (ICryptoTransform cTransform = tdes.CreateDecryptor())
                    {
                        byte[] resultArray = cTransform.TransformFinalBlock(toDecryptArray, 0, toDecryptArray.Length);
                        return Encoding.UTF8.GetString(resultArray);
                    }
                }
            }
            catch { return null; } // invalid Base64, wrong key, or corrupted input
        }

Function to encrypt a string:

private string Encrypt(string s, string key)
    {
        try
        {
            byte[] encryptArray = Encoding.UTF8.GetBytes(s);
            // The 16-byte MD5 hash of the key string is used as the TripleDES key.
            byte[] keyArray;
            using (MD5CryptoServiceProvider hashmd5 = new MD5CryptoServiceProvider())
                keyArray = hashmd5.ComputeHash(Encoding.UTF8.GetBytes(key));
            using (TripleDESCryptoServiceProvider tdes = new TripleDESCryptoServiceProvider())
            {
                tdes.Key = keyArray;
                tdes.Mode = CipherMode.ECB;
                tdes.Padding = PaddingMode.PKCS7;
                using (ICryptoTransform cTransform = tdes.CreateEncryptor())
                {
                    byte[] resultArray = cTransform.TransformFinalBlock(encryptArray, 0, encryptArray.Length);
                    return Convert.ToBase64String(resultArray, 0, resultArray.Length);
                }
            }
        }
        catch { return null; }
    }
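
TripleDES in ECB mode with an MD5-derived key works for hiding an address, but it is dated, and ECB always produces the same output for the same input. For anyone who wants to experiment with an alternative, here is a minimal sketch using AES with a random IV. It is a swapped-in technique, not the method used above: `AesTokenSketch` is a hypothetical name, the key is derived by hashing a passphrase with SHA-256 (an assumption that mirrors the hash-the-key idea above, and weaker than a real key-derivation function), and the token is not authenticated.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class AesTokenSketch
{
    // Assumption: a 32-byte AES key obtained by hashing the passphrase.
    private static byte[] DeriveKey(string passphrase)
    {
        using (SHA256 sha = SHA256.Create())
            return sha.ComputeHash(Encoding.UTF8.GetBytes(passphrase));
    }

    // Output layout: Base64( IV || ciphertext ).
    public static string Encrypt(string plainText, string passphrase)
    {
        using (Aes aes = Aes.Create()) // defaults: CBC mode, PKCS7 padding
        {
            aes.Key = DeriveKey(passphrase);
            aes.GenerateIV(); // fresh random IV for every message
            byte[] iv = aes.IV;
            using (ICryptoTransform enc = aes.CreateEncryptor())
            {
                byte[] plain = Encoding.UTF8.GetBytes(plainText);
                byte[] cipher = enc.TransformFinalBlock(plain, 0, plain.Length);
                byte[] output = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, output, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, output, iv.Length, cipher.Length);
                return Convert.ToBase64String(output);
            }
        }
    }

    public static string Decrypt(string token, string passphrase)
    {
        byte[] input = Convert.FromBase64String(token);
        using (Aes aes = Aes.Create())
        {
            aes.Key = DeriveKey(passphrase);
            byte[] iv = new byte[aes.BlockSize / 8]; // 16 bytes for AES
            Buffer.BlockCopy(input, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform dec = aes.CreateDecryptor())
            {
                byte[] plain = dec.TransformFinalBlock(input, iv.Length, input.Length - iv.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }
}
```

Because of the random IV, encrypting the same address twice gives two different tokens, and both decrypt to the same address. Unlike the functions above, this sketch lets exceptions propagate instead of returning null.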