How to separate a person's identity from his p

2019-03-20 20:55发布

I'm writing an app which main purpose is to keep list of users purchases.

I would like to ensure that even I as a developer (or anyone with full access to the database) could not figure out how much money a particular person has spent or what he has bought.

I initially came up with the following scheme:

    --------------+------------+-----------
    user_hash     | item       | price
    --------------+------------+-----------
    a45cd654fe810 | Strip club |     400.00
    a45cd654fe810 | Ferrari    | 1510800.00
    54da2241211c2 | Beer       |       5.00
    54da2241211c2 | iPhone     |     399.00
  • User logs in with username and password.
  • From the password calculate user_hash (possibly with salting etc.).
  • Use the hash to access users data with normal SQL-queries.

Given enough users, it should be almost impossible to tell how much money a particular user has spent by just knowing his name.

Is this a sensible thing to do, or am I completely foolish?

7条回答
迷人小祖宗
2楼-- · 2019-03-20 21:41
  1. Create a users table with:
    1. user_id: an identity column (auto-generated id)
    2. username
    3. password: make sure it's hashed!
  2. Create a product table like in your example:
    1. user_hash
    2. item
    3. price

The user_hash will be based off of user_id which never changes. Username and password are free to change as needed. When the user logs in, you compare username/password to get the user_id. You can send the user_hash back to the client for the duration of the session, or an encrypted/indirect version of the hash (could be a session ID, where the server stores the user_hash in the session).

Now you need a way to hash the user_id into user_hash and keep it protected.

  1. If you do it client-side as @no suggested, the client needs to have user_id. Big security hole (especially if it's a web app), hash can be easily be tampered with and algorithm is freely available to the public.
  2. You could have it as a function in the database. Bad idea, since the database has all the pieces to link the records.
  3. For web sites or client/server apps you could have it on your server-side code. Much better, but then one developer has access to the hashing algorithm and data.
  4. Have another developer write the hashing algorithm (which you don't have access to) and stick in on another server (which you also don't have access to) as a TCP/web service. Your server-side code would then pass the user ID and get a hash back. You wouldn't have the algorithm, but you can send all the user IDs through to get all their hashes back. Not a lot of benefits to #3, though the service could have logging and such to try to minimize the risk.
  5. If it's simply a client-database app, you only have choices #1 and 2. I would strongly suggest adding another [business] layer that is server-side, separate from the database server.

Edit: This overlaps some of the previous points. Have 3 servers:

  • Authentication server: Employee A has access. Maintains user table. Has web service (with encrypted communications) that takes user/password combination. Hashes password, looks up user_id in table, generates user_hash. This way you can't simply send all user_ids and get back the hashes. You have to have the password which isn't stored anywhere and is only available during authentication process.
  • Main database server: Employee B has access. Only stores user_hash. No userid, no passwords. You can link the data using the user_hash, but the actual user info is somewhere else.
  • Website server: Employee B has access. Gets login info, passes to authentication server, gets hash back, then disposes login info. Keeps hash in session for writing/querying to the database.

So Employee A has user_id, username, password and algorithm. Employee B has user_hash and data. Unless employee B modifies the website to store the raw user/password, he has no way of linking to the real users.

Using SQL profiling, Employee A would get user_id, username and password hash (since user_hash is generated later in code). Employee B would get user_hash and data.

查看更多
登录 后发表回答