Question:

First some background to my question.

Individual entities can have read Permissions.
If a user fails a read permission check, they can't see that instance.

The problem relates to introducing Lucene: a search simply returns a list of matching entity instances, and my code would then need to filter those entities one by one. This approach is extremely inefficient, because a user may only be able to see a small minority of the matches, and checking many entities to return a few is less than ideal.

What approaches would developers use to solve this problem, keeping in mind that indexing and searching are performed using Lucene?

EDIT

Definitions

A User may belong to many Groups.
A Role may have many Groups - these can change.
A Permission has a Role - (indirection).
X can have a read Permission.
It is possible for the definition of a Role to change at any time.

Indexing

Adding the set of Groups (expanding a Permission) at index time may result in the indexed definition becoming out of sync when the list of member groups for a Role changes.
I am hoping to avoid having to reindex X whenever the definition of a Permission/Role changes.

Security Check

To pass a Permission check, a User must belong to a Group that is within the set of Groups belonging to the Role for the given Permission.
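
The check described above can be sketched in plain Java as follows. This is only an illustration of the definitions, not part of Lucene: the class name `PermissionCheck`, the two maps and the string identifiers are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the definitions above; not a Lucene API.
final class PermissionCheck {
    // Role name -> Groups currently in that Role (may change at any time).
    private final Map<String, Set<String>> roleGroups;
    // Permission name -> the Role it points at (the indirection).
    private final Map<String, String> permissionRole;

    PermissionCheck(Map<String, Set<String>> roleGroups,
                    Map<String, String> permissionRole) {
        this.roleGroups = roleGroups;
        this.permissionRole = permissionRole;
    }

    // True if the user belongs to at least one Group of the Permission's Role.
    boolean canRead(Set<String> userGroups, String permission) {
        String role = permissionRole.get(permission);
        if (role == null) {
            return false; // unknown Permission: deny
        }
        Set<String> groups = roleGroups.getOrDefault(role, Set.of());
        return !Collections.disjoint(userGroups, groups);
    }
}
```

Because the Role is resolved at check time, a change to the Role's Groups takes effect immediately, but the check still has to run once per entity.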

Answer 1:

It depends on the number of different security groups that are relevant in your context and how the security applies to your indexed data.

We had a similar issue, which we solved the following way: when indexing, we added the allowed groups to each document, and when searching, we added a boolean query containing the groups the user was a member of. That performed well in our scenario.
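
A minimal sketch of the query side of this idea, assuming each document was indexed with a field named `allowedGroup` (a hypothetical name) that holds one untokenized value per permitted group. It only builds a string in Lucene's classic query-parser syntax; building the equivalent `BooleanQuery` programmatically would avoid parsing the user's input a second time.

```java
import java.util.List;
import java.util.StringJoiner;

final class GroupQuery {
    // Requires both the user's query and at least one of the user's groups.
    // "allowedGroup" is a hypothetical field name, not something Lucene defines.
    static String restrict(String userQuery, List<String> userGroups) {
        if (userGroups.isEmpty()) {
            throw new IllegalArgumentException("user has no groups; nothing is readable");
        }
        StringJoiner groups = new StringJoiner(" OR ", "allowedGroup:(", ")");
        for (String group : userGroups) {
            // Quote each group and escape characters that are special inside quotes.
            String escaped = group.replace("\\", "\\\\").replace("\"", "\\\"");
            groups.add("\"" + escaped + "\"");
        }
        return "+(" + userQuery + ") +" + groups;
    }
}
```

For a user in the groups `staff` and `admins`, `GroupQuery.restrict("lucene security", List.of("staff", "admins"))` yields `+(lucene security) +allowedGroup:("staff" OR "admins")`. The groups stored on each document still have to be reindexed when a Role's membership changes.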

Answer 2:

It depends on your security model. If permissions are simple - say you have three classes of documents - it is probably best to build a separate Lucene index per class, and merge the results when a user can see more than one class.

The Solr security Wiki suggests something similar to HakonB's suggestion: adding the user's credentials to the query and searching by them. See also this discussion in the Lucene user group.

Another strategy would be to wrap the Lucene search with a separate security class that does additional filtering outside of Lucene. It may be faster if you can do this using a database for the permissions.

Edit: I see you have a rather complex permission system. Your basic design choice is whether to implement it inside Lucene or outside Lucene. My advice is to use Lucene as a search engine (its primary strength) and use another system/application for security. If you choose to use Lucene for security anyway, I suggest you learn Lucene Filters well, and use a bitset filter to filter a query's results. That approach does have the problem you described: the permissions behind the filter have to be kept up to date.
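
To illustrate the bitset idea without tying it to a particular Lucene version, here is a sketch using only `java.util.BitSet`. It assumes documents are numbered 0..n-1 and that each document carries the name of one read Permission; `BitSetSecurityFilter`, `allowedFor` and `visibleHits` are hypothetical names, not Lucene APIs.

```java
import java.util.BitSet;
import java.util.Set;

final class BitSetSecurityFilter {
    // Builds the "allowed" set for one user: bit i is set if the user has been
    // granted the Permission attached to document i.
    static BitSet allowedFor(String[] docPermission, Set<String> grantedPermissions) {
        BitSet allowed = new BitSet(docPermission.length);
        for (int doc = 0; doc < docPermission.length; doc++) {
            String permission = docPermission[doc];
            if (permission != null && grantedPermissions.contains(permission)) {
                allowed.set(doc);
            }
        }
        return allowed;
    }

    // Intersects the query's hits with the allowed set without mutating either.
    static BitSet visibleHits(BitSet hits, BitSet allowed) {
        BitSet result = (BitSet) hits.clone();
        result.and(allowed);
        return result;
    }
}
```

The intersection itself is cheap; the cost sits in rebuilding the allowed set whenever a Role or Permission changes, which is the synchronization problem from the question.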

Answer 3:

As Yuval mentioned, it might be worth keeping the permission mechanism independent of the Lucene index.

One way to do it is to implement your own Collector, that will filter out the results that the user should not have access to.
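
The shape of such a collector can be sketched as below. `HitCollector` is a hypothetical stand-in for the per-hit callback; Lucene's real `Collector` has additional methods and differs between versions, so treat this as an outline of the filtering logic only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical stand-in for a search engine's per-hit callback.
interface HitCollector {
    void collect(int docId);
}

// Keeps only the hits that pass the access check, up to a fixed limit.
final class SecureCollector implements HitCollector {
    private final IntPredicate canRead; // access check supplied by the caller
    private final int limit;
    private final List<Integer> accepted = new ArrayList<>();

    SecureCollector(IntPredicate canRead, int limit) {
        this.canRead = canRead;
        this.limit = limit;
    }

    @Override
    public void collect(int docId) {
        // Skip the access check entirely once enough results were gathered.
        if (accepted.size() < limit && canRead.test(docId)) {
            accepted.add(docId);
        }
    }

    List<Integer> accepted() {
        return List.copyOf(accepted);
    }
}
```

This still checks hits one at a time, so it suits cases where the access check is cheap or where only the first page of results is needed.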

Answer 4:

What I would suggest is having two kinds of documents:

1) Real documents with a field called "DocumentID"

2) A security document with the fields "Role", "Groups", "Users", "PermissionId" and "DocumentIds"

The pseudo-code could then be:

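   // Pseudo-code (Lucene 3.x style names). First search: find the security
   // document for the current user and read the IDs that user may see.
   Field[] docIds = searcher.search("Users", "currentUser").getFields("DocumentIds");
   TermsFilter filter = new TermsFilter();

   // Allow only real documents whose "DocumentID" is one of those IDs.
   for (Field field : docIds) {
       filter.addTerm(new Term("DocumentID", field.stringValue()));
   }
   // Second search: run the actual query, restricted by the filter.
   searcher.search(query, filter, numberOfDocuments);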

Since Lucene is very fast at searching, running two searches is cheap. This way you also get a better tf-idf per user.