Questions about SOLR documents and some more

Posted 2020-03-06 03:44

Question:

Website: Classifieds website (users may put ads, search ads etc)

I plan to use SOLR for searching and have it return only ID numbers, then use those IDs to query MySQL, and finally display the results for those IDs.

Currently I have around 30 tables in MySQL, one for each category.

1- Do you think I should do it differently than above?

2- Should I use only one SOLR document, or multiple documents? Also, is document the same as a SOLR index?

3- Would it be better to use only SOLR and skip MySQL, given that I have a lot of columns in each table? Personally I am much better at using MySQL than SOLR.

4- Say the user wants to search for cars in a specific region: how is this type of query performed in SOLR? For example, is q=cars&region=washington possible?

You may think there is a lot of info about SOLR out there, but there isn't, especially not about using PHP with SOLR and a SOLR PHP client... Maybe I will write something when I have learned all this... Or maybe one of you could write something up!

Thanks again for all help...

Answer 1:

First, the definitions: a Solr/Lucene document is roughly the equivalent of a database row. An index is roughly the same as a database table.

I recommend trying to store all the classified-related information in Solr. Querying Solr and then the database is inefficient and very likely unnecessary.

Querying in a specific region would be something like q=cars+region:washington assuming you have a region field in Solr.
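A minimal sketch of how such a request URL could be assembled in PHP. The host, port, core name (classifieds), helper name, and the region field are all assumptions for illustration. The region restriction is passed here as a filter query (fq), a standard Solr parameter, instead of inside q, so that it narrows the results regardless of the default query operator.

```php
<?php
// Hypothetical helper: builds a Solr select URL for a keyword search
// restricted to one region. Base URL and field name are assumptions.
function buildRegionSearchUrl(string $baseUrl, string $keywords, string $region): string
{
    $params = [
        'q'  => $keywords,                                       // free-text part of the search
        'fq' => 'region:"' . addcslashes($region, '"\\') . '"',  // filter on the region field
        'fl' => 'id',                                            // return only the id field
        'wt' => 'json',                                          // ask for a JSON response
    ];
    // http_build_query URL-encodes each value; '&' is passed explicitly as the separator.
    return rtrim($baseUrl, '/') . '/select?' . http_build_query($params, '', '&');
}

echo buildRegionSearchUrl('http://localhost:8983/solr/classifieds', 'cars', 'washington'), "\n";
```

The fl=id parameter matches the original plan of returning IDs only; listing more stored fields there would let Solr return everything needed for display.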

The Solr wiki has tons of good information and a pretty good basic tutorial. Of course this can always be improved, so if you find anything that isn't clear please let the Solr team know about it.

I can't comment on the PHP client since I don't use PHP.



Answer 2:

Solr can return its results as XML, which is easy to parse using SimpleXML. You could also use the SolPHP client library: http://wiki.apache.org/solr/SolPHP.
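As a rough illustration of the SimpleXML route, the sketch below pulls the id of each document out of a Solr XML response. The helper name and the sample XML are made up for this example (the sample is hand-written in the shape of Solr's XML response format, not real output), and it assumes each document has a field named id.

```php
<?php
// Hypothetical helper: extracts the id field of every <doc> in a Solr XML response.
// Assumes each document carries a field whose name attribute is "id".
function extractIdsFromSolrXml(string $xml): array
{
    $response = simplexml_load_string($xml);
    if ($response === false) {
        return [];                      // not parseable as XML
    }
    $ids = [];
    // Solr wraps matching documents in <result name="response"><doc>...</doc></result>.
    $fields = $response->xpath('/response/result[@name="response"]/doc/*[@name="id"]');
    foreach ($fields ?: [] as $field) {
        $ids[] = (string) $field;       // cast the element to its text content
    }
    return $ids;
}

// Hand-written sample in the shape of a Solr XML response (not real output).
$sample = <<<XML
<response>
  <result name="response" numFound="2" start="0">
    <doc><str name="id">101</str><str name="region">washington</str></doc>
    <doc><str name="id">205</str><str name="region">washington</str></doc>
  </result>
</response>
XML;

print_r(extractIdsFromSolrXml($sample));
```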

Solr is really quite efficient. I suggest putting as much data into your Solr index as necessary to retrieve everything in one hit from Solr. This could mean much less database traffic for you.

If you've installed the example Solr application (comes with Jetty), then you can develop Solr queries using the admin interface. The URI of the result is pretty much what you'd be constructing in PHP.

The most difficult part when beginning with Solr is getting the solrconfig.xml and schema.xml files correct. I suggest starting with a very basic config and restarting your web app each time you add a field. Starting off with the whole schema.xml can be confusing.



Answer 3:

2- Should I use only one SOLR document, or multiple documents? Also, is document the same as a SOLR index?

3- Would it be better to use only SOLR and skip MySQL, given that I have a lot of columns in each table? Personally I am much better at using MySQL than SOLR.

A document is a single entry in a Solr index, not the index itself. Keep in mind that you can build only one Solr index per Solr core. A core acts as an independent Solr server within the same Solr installation.

http://wiki.apache.org/solr/CoreAdmin

You can build one index that merges the contents of several tables, and some other indexes to perform second-level searches...

Could you give more details about your architecture and data?



Answer 4:

As others have suggested, you can store and index your MySQL data in Solr and run your queries against the Solr index, which makes MySQL unnecessary. You don't need to store and index only the IDs, query Solr to get IDs, and then run a MySQL query to fetch the additional data for each ID. You can store the other data for each ID in Solr itself.

As for a Solr PHP client, you don't need one; it is recommended to use Solr's REST-like web API directly. You can use a PHP function such as file_get_contents("http://IP:port/solr/core/select?q=query&start=0&rows=100&wt=json"), where core is the name of your core, or use curl from PHP if you need to. Both approaches are about equally efficient. Because of wt=json, the response comes back as JSON. Then call the PHP function json_decode($returned_data) to get that data as an object.
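A minimal sketch of that approach, with the decoding step kept separate from the HTTP call so it can be tried without a running Solr server. The helper names and the sample body are assumptions made for illustration; the sample is hand-written in the shape of a Solr JSON response, not real output. This version decodes to associative arrays (json_decode with true as the second argument) instead of objects.

```php
<?php
// Hypothetical helper: decodes a Solr JSON response body and returns its documents.
function parseSolrJson(string $body): array
{
    $data = json_decode($body, true);   // true => associative arrays, not objects
    if (!is_array($data) || !isset($data['response']['docs'])) {
        return [];                      // malformed or unexpected body
    }
    return $data['response']['docs'];
}

// Hypothetical helper: performs the HTTP request; the URL is supplied by the caller.
function fetchSolrDocs(string $url): array
{
    $body = @file_get_contents($url);   // returns false on failure
    return $body === false ? [] : parseSolrJson($body);
}

// Hand-written sample in the shape of a Solr JSON response (not real output).
$sample = '{"response":{"numFound":1,"start":0,"docs":[{"id":"101","title":"Used car"}]}}';
foreach (parseSolrJson($sample) as $doc) {
    echo $doc['id'], ': ', $doc['title'], "\n";
}
```

In real use, fetchSolrDocs would be called with the select URL for your own host, port, and core.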

If you need to ask anything just reply.