Django collectstatic from Heroku pushes to S3 every time

Posted 2019-03-10 03:45

I'm using django-storages for static files on S3 (with the S3BotoStorage backend). When I run collectstatic from my local machine, it behaves as expected: only modified files are pushed to S3. This relies on python-dateutil 1.5 to compare modification times.

However, running the same command on Heroku pushes every file regardless, although the setup is the same. I then looked at the modification times of the files on Heroku itself, and it seems that os.stat(static_filename).st_mtime is the same as the time of the last push.

Is this expected behaviour? Does Heroku copy files around even when nothing has changed in git?
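One way to check the observation above is to list the modification times directly, wherever the files live. The following is only a minimal sketch: the helper name `list_mtimes` and the `static` directory in the usage comment are assumptions for illustration, not part of the original setup.

```python
import os
from datetime import datetime, timezone


def list_mtimes(root):
    """Return (path, modification time in UTC) for every file under root."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # st_mtime is the same value the question inspects via os.stat().
            mtime = os.stat(path).st_mtime
            results.append((path, datetime.fromtimestamp(mtime, tz=timezone.utc)))
    return results


# Example usage (the directory name is hypothetical):
# for path, modified in list_mtimes("static"):
#     print(modified.isoformat(), path)
```

If every file reports roughly the same timestamp, and that timestamp matches the last push, the files were rewritten during deployment rather than carried over with their original dates.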

6 answers
姐就是有狂的资本
#2 · 2019-03-10 04:23

Try setting DISABLE_COLLECTSTATIC=1 as an environment setting for your app - that should disable it from running on every push.

See this article for details - https://devcenter.heroku.com/articles/django-assets :

> Sometimes, you may not want Heroku to run collectstatic on your behalf.
> You can disable collectstatic by enabling user-env-compile as well:

$ heroku labs:enable user-env-compile
$ heroku config:set DISABLE_COLLECTSTATIC=1

I've found that simply setting the config var is enough, with no need to also enable user-env-compile. It may be that this feature has passed from labs into production.

NB the deployment is managed by the Heroku python buildpack, which you can see here - https://github.com/heroku/heroku-buildpack-python/

EDIT 1

I've just done a bunch of tests on this, and can confirm that DISABLE_COLLECTSTATIC does indeed disable collectstatic, irrespective of the user-env-compile setting. I think that's now in the main trunk (but that's speculation). The value doesn't seem to matter: if DISABLE_COLLECTSTATIC exists as a config var, it is used.
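The behaviour observed in these tests amounts to a presence check rather than a value check. A minimal sketch of that logic follows. It is not the buildpack's actual code, which is a shell script, and the function name `collectstatic_disabled` is hypothetical.

```python
import os


def collectstatic_disabled(environ=None):
    """Return True if DISABLE_COLLECTSTATIC is set at all, whatever its value."""
    env = os.environ if environ is None else environ
    # Only the existence of the variable is tested, never its content.
    return "DISABLE_COLLECTSTATIC" in env
```

Under this model, setting the variable to 0 would still skip collectstatic, so the config var has to be removed entirely to turn collection back on.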

干净又极端
#3 · 2019-03-10 04:24

I agree this is annoying. There are a couple of things you can do. I override the collectstatic command and wire it up in my production settings. Below is the command I use:

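```
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    # Stand-in for the collectstatic command from django.contrib.staticfiles:
    # it uploads nothing and only reports that collection was skipped.
    help = "disables collectstatic cmd in contrib"

    def handle(self, *args, **kwargs):
        # self.stdout.write works on both Python 2 and 3, unlike a print statement.
        self.stdout.write('collectstatic disabled')
```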

I keep this in mysite/disablecollectstatic/management/commands (the file has to be named collectstatic.py for it to stand in for the built-in command). Then in production settings:

INSTALLED_APPS += ('mysite.disablecollectstatic',)

Alternatively, you could use the fact that Heroku does a dry run first before actually invoking the command. If the dry run fails, Heroku won't run the command, which means you could contrive an error (for example, by deleting the static root in your settings). This approach makes me nervous, though:

https://devcenter.heroku.com/articles/django-assets#detection

戒情不戒烟
#4 · 2019-03-10 04:25

I've just had that exact same issue and contacted Heroku's support to find out what is going on. My question to them was:

> I've run into a funky issue doing some deployments. It appears that on each push the date modified on all files is updated to the time a new deploy/git push happens. Is this intended behaviour?
>
> When considering that Django's collectstatic command only checks the modified date on files when evaluating if the file should be copied across to the final storage backend for static assets, it means that on each new push, all files are first removed from the remote storage (in this case S3) and then re-uploaded. This is both a very slow and wasteful process in terms of bandwidth consumed and requests made.

The answer I received today from "Caio", one of Heroku's support staff, was:

> Hi, that's how it currently works, yes. I'm routing your feedback to our runtime team to see if we can package files with their original dates.
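To make the effect of the reset timestamps concrete, here is a minimal sketch of a timestamp-only check of the kind described in the support request. It is a simplified model, not Django's actual implementation, and the function name `needs_upload` and all dates are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def needs_upload(source_modified, target_modified):
    """Decide by timestamp alone whether a file should be copied to storage.

    target_modified is None when the file does not exist remotely yet.
    """
    if target_modified is None:
        return True
    return source_modified > target_modified


# Hypothetical timeline for one file whose content never changes.
uploaded_at = datetime(2013, 1, 1, 12, 0, tzinfo=timezone.utc)
original_mtime = uploaded_at - timedelta(days=30)  # last real edit, before the upload
deploy_mtime = uploaded_at + timedelta(days=1)     # mtime rewritten by a later deploy

# Locally the original mtime survives, so the file is skipped.
skipped_locally = not needs_upload(original_mtime, uploaded_at)
# After a deploy the mtime is newer than the upload, so the file is pushed again.
pushed_after_deploy = needs_upload(deploy_mtime, uploaded_at)
```

Because the deploy timestamp is always newer than the previous upload, a check like this marks every file as changed after each push, which matches what the question reports.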

做自己的国王
#5 · 2019-03-10 04:33

Why not run collectstatic from your local machine?

python manage.py collectstatic --noinput --settings=settings.[prod]
来,给爷笑一个
#6 · 2019-03-10 04:36

I strongly recommend using the collectfast package for any Django static deployment to S3, whether from your local machine or from your Heroku server. It ignores modified dates and instead uses MD5 hashes, which the S3 API provides very quickly, plus (optional) caching to make your static deployments zoom. It took my static deployments from ~10-15 minutes to < 2 minutes, and it only deploys the files that have actually changed.

唯我独甜
#7 · 2019-03-10 04:38

As confirmed by Alen, Heroku changes the modified date of the files when it deploys. However, Amazon S3 also exposes an attribute called ETag, which for ordinary (non-multipart) uploads is an MD5 hash of the file content. It's possible to use this, instead of the modified date, to check whether a file has changed, as implemented in this Django snippet.

I took that code, packaged it, fixed some errors I found, and put it on GitHub as django-s3-collectstatic. It includes a new management command, fasts3collectstatic, that only uploads new files. Check the GitHub page for installation instructions.
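The ETag comparison described in this answer can be sketched roughly as follows. This is not the code from django-s3-collectstatic. The function names are hypothetical, fetching the ETag from S3 is left out, and the sketch assumes the object was uploaded in a single part without KMS encryption, since otherwise the ETag is not a plain MD5 of the content.

```python
import hashlib


def local_md5(path, chunk_size=8192):
    """Hex MD5 digest of a local file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, remote_etag):
    """Compare a local file against an ETag string as returned by S3."""
    # S3 wraps the ETag value in double quotes, so strip them first.
    return local_md5(path) == remote_etag.strip('"')
```

A file for which `is_unchanged` returns True can be skipped no matter what its modification time says, which sidesteps the timestamps reset by each deploy.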
