I am new user to SOLR. I am working on an E-commerce web application which have SQL database. I want to implement SOLR for my "category page" in application where we will show products of that category with specific information like available stock , price and few more details. Also we want to restrict product display on basis of stock availability, if there is no stock then we wont display those products.
I am trying to implement SOLR with Delta import queries to make my category pages faster. And my concern is about performance of page while getting data from SOLR and accuracy of real time data like Stock and Price.
actually my database queries to get product data are little bit complicated and have several joins, so i have to make several related entities in SOLR database. So due to this data upload to SOLR is slow. that makes difficult to upload data (even with delta import query) frequently, so my application lacking real time data like stock of products.
basically i want to know best practice approach to implement SOLR. I am confused with
1. Should i export my all data to SOLR and then get all details from SOLR?
(I am worried about performance and real time data)
2. should i get only index data (say id of products) from SOLR and then get othere details from my SQL database? (not sure about this approach perfomance).
So please help me and suggest me that how can i implement SOLR for my app in best way.
all help is appretiated!!
One good case practice I encountered while developing e-commerce solution and SOLR as the search provider is to retrieve from SOLR only the IDs and get the data from SQL server.
I created a single schema that was updating every time some new products were added in the database.(In my case products were added by a user in an admin console, in your case you can use the @Mauricio Scheffer comment to have latest updates)
One field in the schema was the ID - representing the ID of the product in the database.
After querying SOLR I was receiving N documents suiting the query and with the ID field I was getting all the information from my database and display it to the user.
So the good part is that the user will always get the data from your database ( real time data), and will have his search results displayed very fast because of SOLR.
You can add to the schema different fields that you can use to filter your results such as
- category
- date
- price
- size
- etc...
And also differnet fields that you need to query upon:
- headline
- description
- name
- etc...
And also add the product ID.
So after making the query to SOLR you have a list of product IDs, now the only thing you need to do is to implement a function that will get the products from database based on the ID and display it on the search results page.
This approach performance is pretty good because selecting from database elements based on primary key is the fastest way to retrieve data from sql.
I worked on a website with 1.000.000 products and searching time was always under 0.2 seconds
and page loading time in the browser was under 0.6 seconds on single user queries. ( the server where SOLR and SQL was running was 8Gb ram and quad core 1.8 gb as I remember)
Hope this type of implementation is useful for you.
You can set up a database trigger or data access layer event to send data to Solr whenever there's a change, and configure autoCommit to control freshness.
See also:
- http://wiki.apache.org/solr/NearRealtimeSearch
- http://java.dzone.com/articles/tuning-solr-near-real-time