Python Praw skipping sticky in subreddits

Posted 2019-09-13 04:41

Question:

I am trying to loop through subreddits, but want to ignore the sticky posts at the top. I am able to print the first 5 posts, unfortunately including the stickies. Various pythonic methods of trying to skip these have failed. Two different examples of my code below.

            subreddit = reddit.subreddit(sub)
            for submission in subreddit.hot(limit=5):

                # If we haven't replied to this post before
                if submission.id not in posts_replied_to:
                    ##FOOD

                    if subreddit == 'food':

                        if 'pLEASE SEE' in submission.title:
                            pass
                        if "please vote" in submission.title:
                            pass
                        else:
                            print(submission.title)
                        if re.search("please vote", submission.title, re.IGNORECASE):
                            pass
                        else:

                            print(submission.title)

I noticed a sticky tag in the documents but not sure exactly how to use it. Any help is appreciated.

Answer 1:

According to the docs, you can look up a stickied post, and therefore its id, with the subreddit's sticky method. Its 'number' parameter selects the first, second, or third stickied post, so you can call it once per position to collect the ids of all stickied posts. Then, for each submission you pull, check its id against the stickied ids first.

Example:

# assuming there are no more than three stickies...
stickies = [reddit.subreddit("chicago").sticky(i).id for i in range(1,4)]

and then when you want to make sure a given post isn't stickied, use:

if post.id not in stickies:
    print(post.title)  # replace with whatever you want to do with the post

If there are fewer than three stickies, this appears to give you a list with duplicate ids, which is harmless for a membership check.



Answer 2:

Submissions that are stickied have a stickied attribute that evaluates to True. Add the following to your loop, and you should be good to go.

if submission.stickied:
    continue
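Placed in the loop from the question, the check could look roughly like the sketch below. The helper name `titles_without_stickies` and the `SimpleNamespace` stand-ins are assumptions made for illustration so the snippet runs without a Reddit connection; only the `id`, `title`, and `stickied` attributes are taken from the thread.

```python
from types import SimpleNamespace


def titles_without_stickies(submissions, already_replied):
    """Return titles of submissions that are neither stickied nor already handled."""
    titles = []
    for submission in submissions:
        # Stickied posts have stickied == True, so skip them.
        if submission.stickied:
            continue
        # Skip posts that were replied to earlier.
        if submission.id in already_replied:
            continue
        titles.append(submission.title)
    return titles


# Hypothetical stand-ins that mimic the three attributes used above.
fake_hot = [
    SimpleNamespace(id="a1", title="Please vote: weekly thread", stickied=True),
    SimpleNamespace(id="b2", title="Homemade ramen", stickied=False),
    SimpleNamespace(id="c3", title="Sourdough loaf", stickied=False),
]

print(titles_without_stickies(fake_hot, already_replied={"c3"}))
```

With PRAW, the same function should accept `subreddit.hot(limit=5)` as `submissions` and `posts_replied_to` as `already_replied`.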

In general, I recommend checking the available attributes on the objects you are working with to see if there is something usable. See: Determine Available Attributes of an Object
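One way to do that inspection is with the built-in `vars`, sketched here on a stand-in object. The helper name `public_attributes` and the sample object are assumptions for illustration. PRAW objects can be lazily loaded, so on a real submission you may need to access one attribute (for example `submission.title`) before the full set shows up.

```python
from pprint import pprint
from types import SimpleNamespace


def public_attributes(obj):
    """Return the sorted instance attribute names that do not start with an underscore."""
    return sorted(name for name in vars(obj) if not name.startswith("_"))


# Hypothetical stand-in; a real PRAW Submission exposes many more attributes.
post = SimpleNamespace(id="b2", title="Homemade ramen", stickied=False, _fetched=True)

pprint(public_attributes(post))
```

Scanning such a list for a name like `stickied` is usually quicker than guessing at title patterns.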



Answer 3:

As an addendum to @Al Avery's answer, you can do a complete search for the IDs of all stickies on a given subreddit by doing something like

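import itertools

import prawcore


def get_all_stickies(sub):
    """Collect the IDs of every stickied submission in the subreddit `sub`."""
    stickies = set()
    for i in itertools.count(1):
        try:
            # Store the ID string, since the caller compares against submission.id.
            sid = sub.sticky(i).id
        except prawcore.NotFound:
            # The docs suggest an invalid index raises NotFound.
            break
        if sid in stickies:
            # In practice an out-of-range index seems to repeat an earlier sticky.
            break
        stickies.add(sid)
    return stickies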

This function takes into account that the documentation leads one to expect an error if an invalid index is supplied to sticky, while the actual behavior seems to be that a duplicate ID is returned. Using a set instead of a list makes lookup faster if you have a large number of stickies. You would use the function as

subreddit = reddit.subreddit(sub)
stickies = get_all_stickies(subreddit)
for submission in subreddit.hot(limit=5):
    if submission.id not in posts_replied_to and submission.id not in stickies:
        print(submission.title)
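Note that `limit=5` counts the stickies too, so the loop above can print fewer than five titles. A minimal sketch of one way around that, assuming you want exactly five non-stickied posts: filter first, then cut the result with `itertools.islice`. The helper name `first_n_non_sticky` and the stand-in data are hypothetical.

```python
from itertools import islice
from types import SimpleNamespace


def first_n_non_sticky(submissions, sticky_ids, n):
    """Return up to n submissions whose id is not in sticky_ids, keeping their order."""
    wanted = (s for s in submissions if s.id not in sticky_ids)
    return list(islice(wanted, n))


# Hypothetical stand-ins for a hot listing whose first two posts are stickied.
fake_hot = [SimpleNamespace(id=f"id{i}", title=f"Post {i}") for i in range(8)]
sticky_ids = {"id0", "id1"}

for submission in first_n_non_sticky(fake_hot, sticky_ids, 5):
    print(submission.title)
```

With PRAW you would presumably request a few extra posts, for example `subreddit.hot(limit=5 + len(stickies))`, so that five remain after filtering.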