Search in default_collection minus a specific coll

Posted 2019-08-11 15:18

In our GSA index of 500K documents, half of the documents come from an internal bug tracking system. Some power users have complained that results from the bug tracking system push down useful results from many other sources. We discussed using result biasing to lower the importance of bug tracking documents, but I am not keen on this approach, as I believe we should let the GSA do its magic and decide on the relevancy of the results. Instead, I want to give users a UI option (a checkbox for each collection) where they can pick which collections to search.

My non-default collections do not cover everything that is in default_collection. So when a user checks every checkbox, they may think they are searching everything in the index when they are not. Because of this, I want the checkboxes to behave as exclude rather than include (i.e. check to exclude this collection).

Finally, my question: is there a way to search in the default collection but filter out results that belong to a specific collection (the bug tracking collection)? When you want to search multiple collections you use &site=col1|col2|col3. What I am after is something like &site=default_collection-col1 (that's a minus in between).

Is there a way to do this?

Any alternative approaches to this problem?

3 answers
We Are One
#2 · 2019-08-11 15:23

Update your front end to exclude the URL patterns defined for the bug tracking collection. Check this page on your box: http://yourGSAEnterpriseCcontroller:8000/EnterpriseController/serve_remove.html

虎瘦雄心在
#3 · 2019-08-11 15:33

By far the best way to do this is in your collection config. Create a new collection that has the same include patterns as your default collection, and add the patterns from your bug tracking collection as exclude patterns.

There's no way to do what you're asking purely with query parameters, unless you list every individual collection except the one you want, joined with '|', and then you're likely to run into URL length issues.
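For illustration, the '|' workaround could be sketched as below. This is a minimal sketch, not a recommended design: the collection names, the `default_frontend` client value, and both function names are hypothetical, and it only covers documents that belong to at least one listed collection, which the question says is not the whole index.

```python
from urllib.parse import urlencode


def build_site_param(all_collections, excluded):
    """Join every collection except the excluded ones with '|'."""
    excluded = set(excluded)
    kept = [name for name in all_collections if name not in excluded]
    if not kept:
        raise ValueError("every collection was excluded")
    return "|".join(kept)


def build_search_query(query, all_collections, excluded, frontend="default_frontend"):
    """Return a URL-encoded query string; '|' is percent-encoded as %7C."""
    params = {
        "q": query,
        "site": build_site_param(all_collections, excluded),
        "client": frontend,  # hypothetical front end name
    }
    return urlencode(params)
```

With a long list of collections the encoded `site` value grows quickly, which is where the URL length concern comes from.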

甜甜的少女心
#4 · 2019-08-11 15:46

Personally, I would rethink the design of your collections and build more modular collections that you can include. That way, as you mentioned, you can OR collections together in the site parameter.

http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/request_format.html#1076953

A less ideal but more specific solution to your problem is to exclude by URL in your search query. Be aware that the exclusion can show up in the search box on the results page, which looks ugly, but this can be fixed with a simple XSLT change.

The request format reference describes how to exclude results for a specific site: http://www.google.com/support/enterprise/static/gsa/docs/admin/70/gsa_doc_set/xml_reference/request_format.html#1076964. I would use this sparingly and opt for a better design of the collections.
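As a rough sketch of the exclude-by-URL idea, the query string could be built like this. It assumes the linked section refers to the search protocol's `as_sitesearch` parameter combined with `as_dt=e` (exclude); the host `bugs.example.com` and the function name are hypothetical, so check the parameter names against the reference for your GSA version.

```python
from urllib.parse import urlencode


def build_exclude_query(query, excluded_site, collection="default_collection"):
    """Search one collection while excluding results from a given site."""
    params = {
        "q": query,
        "site": collection,
        "as_sitesearch": excluded_site,  # assumed: host or directory to filter on
        "as_dt": "e",                    # assumed: 'e' excludes, 'i' includes
    }
    return urlencode(params)


print(build_exclude_query("timeout", "bugs.example.com"))
```

This only helps when the bug tracking documents share a host or directory prefix; otherwise the collection-level exclude patterns are the cleaner option.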
