This question already has an answer here:
- How secure would github hosting be for private repositories? [closed] (4 answers)
This is just a question out of curiosity. How safe is it generally considered to host sensitive data on repository hosting sites like GitHub, Bitbucket, etc.? Is it safe enough to get rid of all code on local machines and store everything there? What about safety in the sense of keeping company secrets? These sites advertise that big companies like Google and Yahoo use their services, but do these companies actually store their trade secrets and important company code on sites like this?
GitHub has a security page (http://help.github.com/security) with some interesting information, which suggests they market the service as foolproof in the sense I described. But in practice, do big companies like Google really find that their proprietary secrets and massive amounts of code are safe from prying eyes and disastrous occurrences on sites like these?
As always, it depends :-)
There can be two different meanings of "safety":
1. Can I trust the hoster to keep my stuff (intellectual property, company secrets...) private?
2. What happens to my code if the hoster suddenly goes out of service?
For 1., there is no 100% guarantee.
Of course, the big hosters like GitHub and Bitbucket won't share your code intentionally with third parties, but there is always the possibility that an attacker manages to get at the content of your private repositories.
(This could happen as well if you host your code internally, but it is less likely: unless your company is as well known as, say, Google, the chance of someone targeting your company is much smaller than the chance of someone targeting a well-known public hoster.)
Plus, you have to consider the laws of the country where the hoster resides.
A few weeks ago I read somewhere that if your hoster is in the USA, they can be forced by law to give your data to the US government under certain circumstances, and they are not even allowed to tell you about that (I don't remember the name of the law, but maybe someone else knows).
I guess that all this leads most "big" companies not to host their code on a public service (my company is mid-sized, and we host our code privately as well).
By the way, as you mentioned Google:
I'm sure that Google in particular does not use Bitbucket or GitHub. They have a complete project-hosting infrastructure of their own, so I guess they use it internally, too. Why should they use an external service? It's in the cloud, yes... but it's their cloud.
Concerning 2.: it's unlikely that GitHub or Bitbucket will go bankrupt tomorrow, but you never know.
IMO it's your responsibility to take backups of your code yourself.
The nature of DVCS makes sure that you have some local copies of your code anyway, but it might be difficult to search lots of developer machines for the newest versions of all of your projects.
I do this by regularly pulling all my repositories to my local machine (I wrote a tool that can do this for Bitbucket, which I use for my private projects).
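To give an idea of what such a backup tool could look like, here is a minimal sketch in Python. It is not the tool mentioned above: it assumes the repositories are Git repositories, that `git` is on the PATH, and that you already have a list of clone URLs. The function names `mirror_command` and `backup_all` are made up for this illustration.

```python
import re
import subprocess
from pathlib import Path


def mirror_command(url: str, backup_root: Path) -> list[str]:
    """Return the git command that creates or refreshes a bare mirror of url.

    Hypothetical helper for illustration only.
    """
    # Derive a directory name from the last path component of the URL.
    name = re.split(r"[/:]", url.rstrip("/"))[-1]
    if not name.endswith(".git"):
        name += ".git"
    target = backup_root / name
    if target.exists():
        # Mirror already exists: fetch new commits and prune deleted refs.
        return ["git", "-C", str(target), "remote", "update", "--prune"]
    # First run: create a bare mirror containing all branches and tags.
    return ["git", "clone", "--mirror", url, str(target)]


def backup_all(urls: list[str], backup_root: Path) -> None:
    """Create or refresh a local mirror for every URL in urls."""
    backup_root.mkdir(parents=True, exist_ok=True)
    for url in urls:
        # check=True stops the run if git reports an error.
        subprocess.run(mirror_command(url, backup_root), check=True)
```

You would call something like `backup_all(my_urls, Path.home() / "repo-backups")` from a scheduled job, where `my_urls` is your own list of repository URLs. Mirror clones are used here so that every branch and tag is copied, not just the one currently checked out.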
For the record:
First and foremost, a repository is a backup; security comes second.
To date, we have yet to see a security breach involving GitHub or Bitbucket. So, empirically speaking, they are safe.
However, we are exposing our information to a private company, so there is a risk: for example, a GitHub employee could decide to copy our stuff.
But we should remember that a repository is mainly a backup. Having a private server is fine if you own the resources, but we should also consider the location of the server. If it's in the same location as our machines, there is a chance of losing all the information at once, for example to a flood, a fire, or a lightning strike frying all our machines. So a remote repository is really useful.
If you want to use a remote repository, don't be too obvious. Let's say that you are Coca-Cola Corp and you want to manage an important project: don't create an account with the name of the business and don't call the project IMPORTANT_SECRET_VITAL_FOR_COCACOLA. Just call it PROJECT1, and if a hacker gets in, he will probably not care about it.
One key question is who has administrative access. That person, or those people, can always read your data and potentially leak it to third parties, knowingly or unknowingly, or just read it for their own entertainment or education. This is not only a problem for hosted services; it is also a problem if you store your data within your own company. But at least there you know the person. In small companies, the administrative password might be in the hands of the business owner.
The main point is that public code hosting companies are such a huge target. There is a lot to gain from hacking such a large code repository. It is a very interesting target for government agencies, so interesting that they might just place an insider in the hosting company who takes a USB stick with all the data home with him. This might be as easy as applying for an admin job there, and even getting paid with full benefits. I don't think we will ever see any news about this, simply because no traces are to be expected unless someone wants to brag about it. As far as I know, hosting companies don't require anything like the security clearances that government agencies do. And the fact that all of this would happen in stealth mode puts very little pressure on a hosting company to actually do anything about it.