Python project little issue I can't seem to fi

Posted 2019-08-22 01:00

Question:

So, I've recently been experimenting with Python, and I've been trying to learn by combining code I find into something I could end up using in the future. I've almost completed the project, but when I print out the links, it prints

https://v3rmillion.net/showthread.php

instead of what I would prefer:

https://v3rmillion.net/showthread.php?tid=393794

import requests, os, urllib, sys, webbrowser, bs4

from bs4 import BeautifulSoup

def startup():
    os.system('cls')
    print('Discord To Profile')
    user = raw_input('Discord Tag: ')
    r = requests.get('https://www.google.ca/search?source=hp&q=' + user + ' site:v3rmillion.net')
    soup = BeautifulSoup(r.text, "html.parser")
    print soup.find('div',{'id':'resultStats'}).text

    #This part below is where I'm having the issue.
    content=r.content.decode('UTF-8','replace')
    links=[]
    while '<h3 class="r">' in content:
        content=content.split('<h3 class="r">', 1)[1]
        split_content=content.split('</h3>', 1)
        link='http'+split_content[1].split(':http',1)[1].split('%',1)[0]
        links.append(link)
        #content=split_content[1]
    for link in links[:5]:
        print(link)

startup()

Answer 1:

I looked at the results coming back from your request, and I think you can shorten your code substantially by looking for the <cite> tags instead of splitting the raw HTML:

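def startup():
    os.system('cls')
    print('Discord To Profile')
    user = raw_input('Discord Tag: ')
    # Pass the query via params so requests URL-encodes it; a raw '#' in a
    # Discord tag would otherwise be read as the start of a URL fragment.
    r = requests.get('https://www.google.ca/search',
                     params={'source': 'hp', 'q': user + ' site:v3rmillion.net'})
    soup = BeautifulSoup(r.text, "html.parser")
    # Collect the text of every <cite> tag in the results page.
    links = []
    for link in soup.find_all('cite'):
        # get_text() also works when the tag has nested markup, where .string is None
        links.append(link.get_text())
    for link in links[:5]:
        print(link)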
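
As for why the original loop printed only https://v3rmillion.net/showthread.php: the most likely cause is that the link in the page source is percent-encoded, so `?` appears as `%3F` and `=` as `%3D`, and `.split('%',1)[0]` throws away everything from the first `%` onward. Below is a minimal sketch of decoding the fragment instead of cutting it. It is written for Python 3 (`urllib.parse`), and the `encoded` string is a made-up sample based on the URL in the question, not actual output from Google.

```python
from urllib.parse import parse_qs, unquote, urlsplit

# Hypothetical sample of the kind of percent-encoded fragment the original
# loop was cutting at the first "%".
encoded = "https://v3rmillion.net/showthread.php%3Ftid%3D393794"

# What the original code effectively did: keep everything before the first "%".
truncated = encoded.split("%", 1)[0]

# Decoding instead of splitting restores "?" (%3F) and "=" (%3D).
decoded = unquote(encoded)

# Once decoded, the thread id can be read from the query string.
tid = parse_qs(urlsplit(decoded).query)["tid"][0]

print(truncated)  # https://v3rmillion.net/showthread.php
print(decoded)    # https://v3rmillion.net/showthread.php?tid=393794
print(tid)        # 393794
```

On Python 2, which the code in this thread targets, the same `unquote` function lives in `urllib`, and `urlsplit` and `parse_qs` live in `urlparse`.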