How to use a proxy in Puppeteer and headless Chrome?

Posted 2020-05-26 10:12

Please tell me how to properly use a proxy with Puppeteer and headless Chrome. My attempt below does not work.

const puppeteer = require('puppeteer');
(async () => {
  const argv = require('minimist')(process.argv.slice(2));

  const browser = await puppeteer.launch({args: ["--proxy-server =${argv.proxy}","--no-sandbox", "--disable-setuid-sandbox"]});
  const page = await browser.newPage();

  await page.setJavaScriptEnabled(false);
  await page.setUserAgent(argv.agent);
  await page.setDefaultNavigationTimeout(20000);
  try {
    await page.goto(argv.page);

    // Serialize the whole document and strip line breaks before printing.
    const bodyHTML = await page.evaluate(() => new XMLSerializer().serializeToString(document));
    const body = bodyHTML.replace(/\r|\n/g, '');
    console.log(body);
  } catch (e) {
    console.log(e);
  }
  await browser.close();
})();

5 answers
Bombasti
#2 · 2020-05-26 10:36

Here is an example of launching Chromium with a proxy:

'use strict';

const puppeteer = require('puppeteer');

(async() => {
  const browser = await puppeteer.launch({
    // Launch chromium using a proxy server on port 9876.
    // More on proxying:
    //    https://www.chromium.org/developers/design-documents/network-settings
    args: [ '--proxy-server=127.0.0.1:9876' ]
  });
  const page = await browser.newPage();
  await page.goto('https://google.com');
  await browser.close();
})();
Fickle 薄情
#3 · 2020-05-26 10:40

If you want to use a different proxy for each page, try https-proxy-agent or http-proxy-agent to proxy the requests of each page.

SAY GOODBYE
#4 · 2020-05-26 10:41

You can use https://github.com/gajus/puppeteer-proxy to set a proxy either for an entire page or for specific requests only, e.g.

import puppeteer from 'puppeteer';
import {
  createPageProxy,
} from 'puppeteer-proxy';

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const pageProxy = createPageProxy({
    page,
    proxyUrl: 'http://127.0.0.1:3000',
  });

  await page.setRequestInterception(true);

  page.once('request', async (request) => {
    await pageProxy.proxyRequest(request);
  });

  await page.goto('https://example.com');
})();

To skip the proxy, simply call request.continue() conditionally.

With puppeteer-proxy, a Page can have multiple proxies.
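
As a rough sketch of what "conditionally" could look like: the helper below decides per request whether to go through the proxy or call request.continue(). The names shouldProxy, makeRequestHandler and PROXIED_HOSTS, and the host list itself, are hypothetical and not part of puppeteer-proxy; pageProxy is assumed to be the object returned by createPageProxy in the example above.

```javascript
// Hypothetical allow-list of hosts that should go through the proxy.
// The entries are placeholders for illustration only.
const PROXIED_HOSTS = new Set(['example.com', 'www.example.com']);

// Return true when the request URL's hostname is on the allow-list.
function shouldProxy(requestUrl) {
  try {
    return PROXIED_HOSTS.has(new URL(requestUrl).hostname);
  } catch (e) {
    // Not a parseable absolute URL: do not proxy it.
    return false;
  }
}

// Build a 'request' event handler around an existing pageProxy object.
function makeRequestHandler(pageProxy) {
  return async (request) => {
    if (shouldProxy(request.url())) {
      await pageProxy.proxyRequest(request);
    } else {
      // Skip the proxy and let Chrome handle the request directly.
      await request.continue();
    }
  };
}

// Usage, after page.setRequestInterception(true):
//   page.on('request', makeRequestHandler(pageProxy));
```

Keeping the decision in a small pure function makes it easy to check the routing rules without starting a browser.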

手持菜刀,她持情操
#5 · 2020-05-26 10:50

It's possible with puppeteer-page-proxy. It supports setting a proxy for an entire page, or if you like, it can set a different proxy for each request. And yes, it works both in headless and headful Chrome.

First install it:

npm i puppeteer-page-proxy

Then require it:

const useProxy = require('puppeteer-page-proxy');

Using it is easy. To set a proxy for an entire page:

await useProxy(page, 'http://127.0.0.1:8000');

If you want a different proxy for each request, you can simply do this:

await page.setRequestInterception(true);
page.on('request', req => {
    useProxy(req, 'socks5://127.0.0.1:9000');
});

Then, if you want to be sure that your page's IP has changed, you can look it up:

const data = await useProxy.lookup(page);
console.log(data.ip);

It supports http, https, socks4 and socks5 proxies, and it also supports authentication if that is needed:

const proxy = 'http://login:pass@127.0.0.1:8000'

Repository: https://github.com/Cuadrix/puppeteer-page-proxy
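
For comparison, if you stay with the plain --proxy-server flag, Chrome does not, as far as I know, take a login and password from the flag itself. A common workaround is to pass only the server address in the flag and supply the credentials with page.authenticate(). The sketch below assumes a proxy URL in the same form as above; splitProxyUrl and launchWithAuthProxy are hypothetical helper names, and the puppeteer module is passed in as a parameter.

```javascript
// Split 'http://login:pass@127.0.0.1:8000' into a credential-free server
// address (for --proxy-server) and the credentials (for page.authenticate).
function splitProxyUrl(proxyUrl) {
  const url = new URL(proxyUrl);
  return {
    server: `${url.protocol}//${url.host}`,
    username: decodeURIComponent(url.username),
    password: decodeURIComponent(url.password),
  };
}

// Launch Chrome behind the proxy and answer its authentication challenge.
async function launchWithAuthProxy(puppeteer, proxyUrl) {
  const { server, username, password } = splitProxyUrl(proxyUrl);
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${server}`],
  });
  const page = await browser.newPage();
  // Credentials must be set before the first navigation.
  await page.authenticate({ username, password });
  return { browser, page };
}

// Usage:
//   const puppeteer = require('puppeteer');
//   const { browser, page } =
//     await launchWithAuthProxy(puppeteer, 'http://login:pass@127.0.0.1:8000');
```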

可以哭但决不认输i
#6 · 2020-05-26 10:52

Do not use

"--proxy-server =${argv.proxy}"

This is a normal string, not a template literal, so ${argv.proxy} is never replaced with its value. Use ` instead of ", and remove the space before =, since Chrome expects the form --proxy-server=value:

`--proxy-server=${argv.proxy}`

Check this string before you pass it to the launch function to make sure it is correct. You may also want to visit http://api.ipify.org/ in that browser to confirm that the proxy is working.
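
A minimal sketch of both checks, assuming the proxy address arrives as a plain host:port string such as the one parsed by minimist in the question. buildProxyArg and printProxyIp are hypothetical helper names; the puppeteer module is passed in as a parameter so the string-building part can be checked without launching a browser.

```javascript
// Build the Chrome flag and fail early if the proxy value is missing.
// minimist yields `true` for a bare "--proxy", which this check also rejects.
function buildProxyArg(proxy) {
  if (typeof proxy !== 'string' || proxy.trim() === '') {
    throw new Error('Missing or empty proxy value');
  }
  // Backticks make this a template literal, so ${...} is interpolated.
  // There is no space before "=".
  return `--proxy-server=${proxy.trim()}`;
}

// Open api.ipify.org through the proxy and print the IP it reports.
async function printProxyIp(puppeteer, proxy) {
  const browser = await puppeteer.launch({
    args: [buildProxyArg(proxy), '--no-sandbox', '--disable-setuid-sandbox'],
  });
  try {
    const page = await browser.newPage();
    await page.goto('http://api.ipify.org/');
    // The service answers with the caller's IP address as plain text.
    console.log(await page.evaluate(() => document.body.innerText));
  } finally {
    await browser.close();
  }
}

// Usage:
//   const argv = require('minimist')(process.argv.slice(2));
//   printProxyIp(require('puppeteer'), argv.proxy).catch(console.error);
```

If the printed address is your own IP rather than the proxy's, the flag most likely did not reach Chrome in the expected form.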
