Puppeteer Crawler - Error: net::ERR_TUNNEL_CONNECTION_FAILED

Posted 2019-06-13 03:35

Question:

I am currently running Puppeteer behind a proxy on Heroku. Locally the proxy relay works fine, but on Heroku I get Error: net::ERR_TUNNEL_CONNECTION_FAILED. I have set all the .env values in the Heroku config vars, so they are all available.

Any idea how I can fix this error?

This is my current launch code:

const browser = await puppeteer.launch({
  args: [
    "--proxy-server=https=myproxy:myproxyport",
    "--no-sandbox",
    "--disable-gpu",
    "--disable-setuid-sandbox",
  ],
  timeout: 0,
  headless: true,
});

Answer 1:

page.authenticate

The --proxy-server argument should take this form:

--proxy-server=HOSTNAME:PORT
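
As a minimal sketch, the argument could be built from config vars so that a missing or malformed value fails early instead of surfacing later as a tunnel error. The helper name proxyServerArg and the variable names PROXY_HOST and PROXY_PORT are assumptions for illustration; they do not come from the question.

```javascript
// Hypothetical helper: builds the --proxy-server flag from environment-style
// settings. PROXY_HOST and PROXY_PORT are assumed names, not from the question.
function proxyServerArg(env) {
  const host = env.PROXY_HOST;
  const port = Number(env.PROXY_PORT);
  if (!host) {
    throw new Error('PROXY_HOST is not set');
  }
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error('PROXY_PORT must be an integer between 1 and 65535');
  }
  // Uses the HOSTNAME:PORT form given in the answer.
  return `--proxy-server=${host}:${port}`;
}

// Example with placeholder values.
console.log(proxyServerArg({ PROXY_HOST: 'proxy.example.com', PROXY_PORT: '8080' }));
// prints "--proxy-server=proxy.example.com:8080"
```

On Heroku this would typically be called as proxyServerArg(process.env), with the result placed in the args array passed to puppeteer.launch.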

If the proxy (for example an HTTPS proxy) requires a username and password, supply them with page.authenticate before any navigation:

await page.authenticate({username: 'user', password: 'password'});

The complete code would look like this:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    ignoreHTTPSErrors: true,
    args: ['--no-sandbox', '--proxy-server=HOSTNAME:PORT'],
  });
  const page = await browser.newPage();

  // Authenticate before the first navigation; replace the placeholder credentials
  await page.authenticate({ username: 'user', password: 'password' });
  await page.goto('https://www.example.com/');

  await browser.close();
})();
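
If the proxy is supplied as a single URL with embedded credentials, one way to get the two pieces used above is to split it with Node's built-in URL class. This is only a sketch; the helper name splitProxyUrl and the example URL are hypothetical.

```javascript
// Hypothetical helper: splits a proxy URL with embedded credentials into the
// value for --proxy-server and the object expected by page.authenticate.
function splitProxyUrl(proxyUrl) {
  const url = new URL(proxyUrl);
  return {
    // url.host is "hostname:port"; the URL parser omits the port when it
    // equals the scheme default (for example 80 for http).
    serverArg: `--proxy-server=${url.host}`,
    credentials: {
      // Credentials are percent-encoded inside a URL, so decode them.
      username: decodeURIComponent(url.username),
      password: decodeURIComponent(url.password),
    },
  };
}

// Example with a placeholder URL; "%40" is an encoded "@" in the password.
const parts = splitProxyUrl('http://user:p%40ss@proxy.example.com:8000');
console.log(parts.serverArg);   // prints "--proxy-server=proxy.example.com:8000"
console.log(parts.credentials);
```

The result could then be used as args: [parts.serverArg] in puppeteer.launch and as page.authenticate(parts.credentials).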

Proxy Chain

If authentication does not work with the method above, you may want to handle it outside the browser.

Several packages can do that. One is proxy-chain, which takes an existing proxy and exposes it as a new local proxy server.

proxyChain.anonymizeProxy(proxyUrl) takes a proxy URL that includes a username and password and starts a new local proxy that needs no credentials, which you can then use in your script.

const puppeteer = require('puppeteer');
const proxyChain = require('proxy-chain');

(async () => {
    const oldProxyUrl = 'http://username:password@hostname:8000';
    const newProxyUrl = await proxyChain.anonymizeProxy(oldProxyUrl);

    // Prints something like "http://127.0.0.1:12345"
    console.log(newProxyUrl);

    const browser = await puppeteer.launch({
        args: [`--proxy-server=${newProxyUrl}`],
    });

    // Do your magic here...
    const page = await browser.newPage();
    await page.goto('https://www.example.com');

    // Clean up: close the browser, then shut down the local proxy
    await browser.close();
    await proxyChain.closeAnonymizedProxy(newProxyUrl, true);
})();