Need explanation for Docpad persistence

I am pretty confused with the architecture behind how data is persisted in Docpad. From blogs and forums, I got to know in-memory (and/or out directory) is used for generated contents. But one of the selling points of Docpad is "completely file based". From the sound of it, hosting it on Heroku or any ephemeral file system doesn't seem logical. Can anyone give some explanation/clarification?

回答1:

DocPad is pitched as a next generation web architecture. This mindmap showcases why we call it that perfectly:

DocPad Architecture Vision http://d.pr/i/jmmZ+

The workflow being like so:

Importers bring data in, from any source, be it the local file system, or tumblr, or mongo database.
These get injected into the DocPad in-memory database
At generation time, DocPad will then render what needs to be rendered, and output static content into the out directory
Dynamic documents (documents that re-render on each request) and dynamic abilities (server extensions) are now able to make use of the in memory database and perform advanced cool stuff like file uploads, contact forms, search pages, whatever

In that sense, DocPad is a next generation web architecture that has static site generation abilities, as well as dynamic site generation abilities. What separates DocPad from traditional web architectures, is that traditional web architectures consider the content and templating separate beings, where DocPad considers them the same and just separated by their extension. Traditional web architectures also are dynamic by defaults, with static site generation accomplished via caching, rather than the other way round of being static by default.

Because of this load everything in the in-memory database situation, we are suffering some from growing pains with performance during generation and post-generation. Discussion here. However there is nothing there that can't be fixed with enough time and resources. Regardless of this, DocPad will still be faster than your traditional web architecture due to the static nature (faster requests) as well as the asynchronous nature (faster generations).

In terms of how you would handle file uploads:

If you are doing a static website with DocPad, you would have a backend API server somewhere else that you would do the upload too and load the data into DocPad as a single page application style.
If you are doing a dynamic website with DocPad, you would host DocPad on a server like Heroku, and extend the server to handle the file upload to a destination like Amazon S3, Dropbox, or into MongoDB or the like. You can then choose to expose the file via templateData as a link, or inject the file into the DocPad in-memory database as a file. Which one you chose is whether or not you just want to reference the upload or treat it as a first class citizen in the DocPad universe (it gets it's own URL and page).

For dynamic sites, I would say I really go with the static site + single page application approach. You get benefits like responsive design, offline support, really fast UX which without doing it that way, you struggle a bit accomplishing it with the dynamic site approach regardless of which web architecture you build it on.

回答2:

Well, I can't top off Benjamin's excellent explaination, but if you want a TLDR explaination:

docpad is used to (biggest-use-case) generate STATIC websites, a-la github pages or old websites of 1990s. You can write your pages in whatever you like (Jade, eco, coffeescript, etc) and it will compile the pages and output HTML files. Think of it as a "Compile-once-server-forever" thing.

On the other hand, if you want Dynamic content on your site, you'd like to use Nodejs for pulling in the dynamic data from other sites, or generating it on the fly.

As for your concern about Heroku's ephemeral file system, (I don't know exactly how what works) you can use Amazon's S3 for storage. Check out this