Protecting REST API behind SPA against data thiefs

2019-07-16 12:51发布

I am writing a REST Api gateway for an Angular SPA and I am confronted with the problem of securing the data exposed by the API for the SPA against "data thiefs". I am aware that I can't do much against HTML scraping, but at least I don't want to offer such data thiefs the user experience and full power of our JSON sent to the SPA.

The difference between most "tutorials" and threads about this topic is that I am exposing this data to a public website (which means no user authentication required) which offers valuable statistics about a video game.

My initial idea on how to protect the Rest API for SPA:

Using JWTs everywhere. When a visitor opens the website the very first time the SPA requests a JWT from my REST Api and saves it in the HTTPS cookies. For all requests the SPA has to use the JWT to get a response.

Problems with that approach

  • The data thief could simply request the oauth token from our endpoint as well. I have no chance to verify that the token has actually been requested from my SPA or from the data thief?
  • Even if I solved that the attacker could read the saved JWT from the HTTPS cookies and use it in his own application. Sure I could add time expiration for the JWT

My question:

I am under the impression that this is a common problem and therefore I am wondering if there are any good solutions to protect against others than the SPA having direct access to my REST Api responses?

1条回答
forever°为你锁心
2楼-- · 2019-07-16 13:37

From the API's point of view, your SPA is in no way different than any other client. You obviously can't include a secret in the SPA as it is sent to anybody and cannot be protected. Also the requests it makes to the API can be easily sniffed and copied by another client.

So in short, as diacussed many times here, you can't authenticate the client application. Anybody can create a different client if they want.

One thing you can actually do is checking the referer/origin of requests. If a client is running in a browser, thr requests it can make are somewhat limited, and one such limitation is the referer and origin headers, which are always controlled by the browser, and not javascript. So you can actually make sure that if (and only if!) the client is running in an unmodified browser, it is downloaded from your domain. This is the default in browsers btw, so if you are not sending CORS headers, you already did this (browsers do, actually). However, this does not keep an attacker from building and running a non-browser client and fake any referer or origin he likes, or just disregard the same origin policy.

Another thing you could do is changing the API regularly just enough to stop rogue clients from working (and changing your client at the same time ofc). Obviously this is not secure at all, but can be annoying enough for an attacker. If downloading all your data once is a concern, this again doesn't help at all.

Some real things you should consider though are:

  • Does anybody actually want to download your data? How much is it worth? Most of the times nobody wants to create a different client, and nobody is that much interested in the data.

  • If it is that interesting, you should implement user authentication at the very least, and cover the remaining risk either via points below and/or in your contracts legally.

  • You could implement throttling to not allow bulk downloading. For example if the typical user accesses 1 record every 5 seconds, and 10 altogether, you can build rules based on the client IP for example to reasonably limit user access. Note though that rate limiting must be based on a parameter the client can't modify arbitrarily, and without authentication, that's pretty much the client IP only, and you will face issues with users behind a NAT (ie. corporate networks for example).

  • Similarly, you can implement monitoring to discover if somebody is downloading more data than it would be normal or necessary. However, without user authentication, your only option will be to ban the client IP. So again it comes down to knowing who the user is, ie. authentication.

查看更多
登录 后发表回答