I'm having trouble connecting to the Internet using Python.
I am on a corporate network that uses a PAC file to set proxies. That would be fine if I could find and parse the PAC file to get what I need, but I cannot.
The oddity:
R can connect to the Internet to download files through wininet and .External(C_download,...), so I know it is possible. When I run:
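import ctypes
from ctypes import wintypes  # submodule; "import ctypes" alone does not load it

wininet = ctypes.windll.wininet
flags = wintypes.DWORD()
# The second argument is reserved and must be 0.
connected = wininet.InternetGetConnectedState(ctypes.byref(flags), 0)
print(connected, hex(flags.value))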
I get 1 0x12, so a connection is available. But whenever I try to use other functions from wininet, I get errors like:
AttributeError: function 'InternetCheckConnection' not found
The same goes for pretty much every other wininet function. That doesn't surprise me, since the only named function in dir(wininet) is InternetGetConnectedState.
The wininet approach can clearly work, but I have no idea how to proceed with it (especially given that I only use Windows at work).
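A note on the AttributeError: most wininet functions that take strings are exported only with an A (ANSI) or W (Unicode) suffix, for example InternetCheckConnectionW. The unsuffixed name exists only as a macro in the C headers, and ctypes looks exports up by their exact name, so the unsuffixed lookup fails. ctypes also resolves functions lazily, which is why dir(wininet) lists only the ones already accessed. Below is a minimal sketch of working around that; resolve_export is a hypothetical helper name, the URL is a placeholder, and the Windows-only part is guarded so the file still imports elsewhere.

```python
import ctypes
import sys


def resolve_export(dll, name):
    """Return (function, exported_name), trying name, then name+'W', then name+'A'."""
    for candidate in (name, name + "W", name + "A"):
        try:
            return getattr(dll, candidate), candidate
        except AttributeError:
            continue
    raise AttributeError(f"no export named {name!r} (also tried W and A suffixes)")


if sys.platform == "win32":
    from ctypes import wintypes

    wininet = ctypes.WinDLL("wininet", use_last_error=True)
    # The unsuffixed name is not exported, so this resolves to the W variant.
    check, exported_name = resolve_export(wininet, "InternetCheckConnection")
    # Signature of the W variant: (LPCWSTR lpszUrl, DWORD dwFlags, DWORD dwReserved) -> BOOL
    check.argtypes = [wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD]
    check.restype = wintypes.BOOL

    FLAG_ICC_FORCE_CONNECTION = 0x00000001
    ok = check("https://www.python.org", FLAG_ICC_FORCE_CONNECTION, 0)
    print(exported_name, bool(ok), ctypes.get_last_error())
```

The same pattern should apply to other wininet calls such as InternetOpenW, provided argtypes and restype are set to match the documented signature of each function.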
First, I would strongly suggest installing the requests module. Doing HTTP without it in Python is pretty painful.

According to this answer, you need to download wpad.dat from the host wpad. That is a text file that contains the proxy address.

Once you know the proxy settings, you can configure requests to use them by passing them in its proxies argument.

"ok, so poor wording - let's just change that to: open a connection to a web page and obtain its content using python"
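For the proxy step described in the answer above, here is a rough sketch. It assumes the PAC text has already been saved (for instance from http://wpad/wpad.dat, if that host resolves on your network) and that the proxy needs no authentication. A PAC file is really JavaScript, so the regular expression only lists candidate PROXY entries and does not evaluate the file's logic. The function names and the sample hosts are made up for illustration.

```python
import re

# Matches directives such as "PROXY proxy.example.com:8080" in PAC return strings.
PROXY_DIRECTIVE = re.compile(r"\bPROXY\s+([A-Za-z0-9.\-]+):(\d+)")


def proxies_from_pac(pac_text):
    """Return the distinct host:port pairs named in PROXY directives, in file order."""
    seen = []
    for host, port in PROXY_DIRECTIVE.findall(pac_text):
        entry = f"{host}:{port}"
        if entry not in seen:
            seen.append(entry)
    return seen


def requests_proxies(host_port):
    """Build the mapping that requests accepts through its proxies argument."""
    url = f"http://{host_port}"
    return {"http": url, "https": url}


def fetch(url, host_port, timeout=10):
    """Download url through the given proxy and return the response body as text."""
    # Imported lazily so the helpers above work without requests installed.
    import requests

    response = requests.get(url, proxies=requests_proxies(host_port), timeout=timeout)
    response.raise_for_status()
    return response.text


# Made-up PAC content, for illustration only.
sample_pac = """
function FindProxyForURL(url, host) {
    if (isPlainHostName(host)) return "DIRECT";
    return "PROXY proxy.example.com:8080; PROXY backup.example.com:3128; DIRECT";
}
"""

candidates = proxies_from_pac(sample_pac)
print(candidates)
print(requests_proxies(candidates[0]))
```

With a real proxy address in hand, fetch("https://example.com", candidates[0]) would route the request through it. If the PAC file picks different proxies for different hosts, you would still need to read its conditions by hand to choose the right entry.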
It sounds like you actually need BeautifulSoup and Requests. Here's a quick example of them being used to explore a web page:
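A minimal sketch of that kind of exploration, assuming the requests and beautifulsoup4 packages are installed. The function names are invented for illustration, and the sample HTML stands in for a real page so the parsing part can be tried without network access.

```python
import requests
from bs4 import BeautifulSoup


def summarize(html):
    """Return the page title (or None) and every href found on an <a> tag."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else None
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return title, links


def explore(url, proxies=None, timeout=10):
    """Fetch url (optionally through a proxies mapping) and summarize the page."""
    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return summarize(response.text)


# Stand-in HTML, for illustration only.
sample_html = (
    "<html><head><title>Demo</title></head><body>"
    "<a href='/a'>A</a><a>no href</a><a href='https://example.com/b'>B</a>"
    "</body></html>"
)

title, links = summarize(sample_html)
print(title, links)
```

On a network that needs a proxy, explore("https://example.com", proxies={"http": "http://host:port", "https": "http://host:port"}) would be the call to try, with host and port replaced by your real proxy settings.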