I've found some similar questions (here, here, and here) asking about storing documents into version control. I have a more specific requirement and general question. The specific requirement is that I want to use Git. The more general question is, how should documents (for design, test, general practices, tips, etc, of a project) be stored in Git? More broadly, what documents should be stored?
I can think of a few ways:
- Word / Open Office documents. Newer versions of Word use the docx format, which is a zip archive of XML files, but Word can also save an unzipped XML format, which could in principle be stored in Git with efficient diffs. The diff is still useless though, since the XML is squished onto a single line. This is no better than storing a binary file in Git.
- Wiki. What distributed wikis exist out there? It'd be like some kind of LaTeX thing where documents are written and compiled / viewed as a wiki.
- LaTeX - but from using it for papers I find it pretty unsuitable for project documents. Is there a documentation equivalent? (How are man pages written?)
- Plain text formats, but these fall short because they can't include diagrams, which brings up another point.
How should visuals be stored? What should they be created with in the first place? I'm developing in a Linux environment, but some other participants in the project are on Windows. What cross-platform solution is there that resembles Visio? And of course, it should not create binary files to be stored in Git. How then would this tie in with the documentation? (E.g. similar to how LaTeX can reference external diagrams when compiled.)
Most document formats don't play terribly well with source control. Almost everything you list is either effectively a binary format or convoluted markup that won't diff well. As long as you just want versions of documents and don't care about the diff, use whatever format you like. I prefer Microsoft Word documents because you can use the built-in change tracking and comment system to track deltas between documents.
As for what documents to store, I would recommend storing anything you'll have a use for later. What documents could be used by someone to continue the project should you leave? What documents would be helpful to bring a new person up to speed? This means specifications, but not documents like burndown charts.
To answer the wiki part of your question, check out DokuWiki. It stores everything in text files so they would be very easy to add into a source control system.
My company stores Word documents in SVN, and accesses them via TortoiseSVN.
Tortoise uses Word's built-in change tracking function to show you a "diff" of two revisions.
This works really well, but requires Windows and Word.
Edit:
You could probably get this working with Git too. If you install TortoiseSVN and look in %PROGRAMFILES%\TortoiseSVN\Diff-Scripts\ you'll see what Tortoise is doing. If you're using Git, I assume you're 1337 enough to hack it to work for you :)
For OOo files, Word documents and other binary files, you should have a look at the Pro Git book: http://git-scm.com/book/ch7-2.html
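As far as I understand it, the idea in that chapter is to tell Git to convert a binary document to text before diffing it, using a `textconv` filter. Here is a minimal sketch of such a converter for .docx files, using only the Python standard library. The script name `docx2text.py` and the driver name `word` are my own placeholders, and it only pulls out paragraph text (no tables formatting, comments or images), so treat it as a starting point rather than a finished tool.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_to_text(source):
    """Return the paragraph text of a .docx file, one paragraph per line.

    `source` can be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for paragraph in root.iter(W + "p"):
        # A paragraph's text is split across one or more <w:t> runs.
        lines.append("".join(t.text or "" for t in paragraph.iter(W + "t")))
    return "\n".join(lines)

# Hypothetical wiring, assuming the script is saved as docx2text.py:
#   .gitattributes:   *.docx diff=word
#   git config diff.word.textconv "python docx2text.py"
if __name__ == "__main__" and len(sys.argv) > 1:
    print(docx_to_text(sys.argv[1]))
```

With something like this in place, `git diff` shows changes to the extracted text instead of reporting that the binary files differ. The stored file is still the original .docx.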
Git can store binary files just as well as text files. Instead of explicitly storing diffs, Git stores each revision of a file in full in the repository. The repository objects are then compressed to save space. Diffs are reconstructed on the fly whenever you ask for them.
So considering only disk space, there is little difference between storing an XML Office document uncompressed in Git, and storing a zipped version of that same document. The only difference would be the relative performance of Zip vs whatever compression Git chooses to use.
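To make the "whole file, then compressed" point concrete, here is a rough sketch of how a single file revision becomes a Git blob object: a small header plus the full content, hashed with SHA-1 and compressed with zlib. The function name `git_blob` is mine, and this ignores packfiles, which Git may later use to store objects more compactly.

```python
import hashlib
import zlib

def git_blob(content: bytes):
    """Return (object_id, compressed_bytes) for a file stored as a Git blob."""
    # Git prepends a header: the object type, the content length, and a NUL byte.
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    object_id = hashlib.sha1(store).hexdigest()  # name of the object
    compressed = zlib.compress(store)            # what ends up on disk as a loose object
    return object_id, compressed

object_id, compressed = git_blob(b"Design notes, revision 1\n")
print(object_id, len(compressed))
```

Note that nothing here depends on the content being text. A zipped document and an XML document go through exactly the same steps.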
When choosing a document format, you should make sure that team members (or are you working alone?) are comfortable working with the format itself.
Storage is not so much the problem as being able to see diffs between versions and to merge them. In my experience, nothing beats text formats that can be edited freely in any text editor. This excludes HTML and just about any XML-based format, with DocBook being a barely usable exception.
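A small illustration of why line-oriented text diffs so much better than markup squished onto one line. This is just a sketch using Python's `difflib` rather than Git itself, and `changed_lines` is a helper name I made up, but the line-by-line behaviour is the same idea.

```python
import difflib

def changed_lines(old: str, new: str):
    """Return only the added/removed lines of a unified diff (no context)."""
    diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     lineterm="", n=0))
    # Skip the two file header lines and the "@@" hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]

plain_old = "Intro\nDesign goals\nTesting"
plain_new = "Intro\nDesign goals v2\nTesting"

xml_old = "<doc><p>Intro</p><p>Design goals</p><p>Testing</p></doc>"
xml_new = "<doc><p>Intro</p><p>Design goals v2</p><p>Testing</p></doc>"

print(changed_lines(plain_old, plain_new))  # just the one edited line
print(changed_lines(xml_old, xml_new))      # the entire document, twice
```

In the plain-text case the diff points straight at the edited line. In the single-line case the whole document shows up as removed and re-added, which is also what makes merges painful.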
A good wiki that can use any of the popular version control systems and be set up in a distributed fashion is IkiWiki. With IkiWiki, markup parsing is done in plug-ins, so you can choose the input format on a per-document basis. The default, Markdown, gets pretty close to plain text.
If you're unhappy with using LaTeX, don't use it. I think it's unsuitable for taking quick notes. Man pages are written in nroff, but many people use other formats such as POD.
Some projects that strive to be alternatives to Visio are Kivio (KDE) and Dia (Gtk/Gnome). I haven't used Visio itself, so I can't comment on their feature sets. It probably depends on what sorts of visuals / diagrams you want to create. UML? Flow charts?
For Word documents, try using RTF (rich text format), which is basically text. Another possibility would be HTML. They're text, so you should be able to do diffs on them.
Most wikis are distributed in the sense that they're designed for collaboration. I think you're really asking whether there are hosted solutions or whether you have to manage the wiki yourself. Take a look at http://www.atlassian.com/.