The question is how the string matching is done to find matching entries by the firefox 3 url bar. Substring matching on every entry might be slow. What algorithm can be used to match at any location in a fast way?
相关问题
- how to split a list into a given number of sub-lis
- Generate string from integer with arbitrary base i
- Finding k smallest elements in a min heap - worst-
- binary search tree path list
- High cost encryption but less cost decryption
相关文章
- JSP String formatting Truncate
- What are the problems associated to Best First Sea
- Selecting only the first few characters in a strin
- Firefox remembering radio buttons incorrectly
- Coin change DP solution to keep track of coins
- Python: print in two columns
- Algorithm for partially filling a polygonal mesh
- Robust polygon normal calculation
The normal way to do fast substring matching is to create a data structure which contain all the suffixes of all the strings you want search. Depending on the organization, this can be called the "suffix tree" or "suffix array". For example, if you have 1000 strings and every one is 50 characters long, you have 1.000 x 50 nontrivial suffixes, i.e. your suffix array would have 50.000 entries.
Then to do the matching, you do binary search (if array) or tree search (if tree) to find all those suffixes in the data structure whose beginning matches the string written in the search box. Because it is the beginning of the suffix you are matching, you can use standard search procedures (binary search, tree descent) to get to the result fast. Every suffix is linked to those strings in which it appears.
Example: you have two strings "CAT" and "DOT". Your suffix array looks like this (note lexiographic = alphabetic ordering):
Note that there are six suffixes but two of the are the same (last "T" in "CAT" and "DOT") and the are both represented by the same entry (#5).
Now when the user types in search, e.g. "OT" (which should match "DOT"), you can do simple lexiographic ordering lookup in log-time as you are now searching for a beginning match in the suffix array.
This is the basic principle of fast text searching when the search pattern does not contain wildcards.
I think this is left to the underlying storage: the SQLite Database where Firefox stores the Pages you visited offers a fast function for substring comparison.
I think Firefox issues a SQL Query to the Database. This is quite fast because the database is cached in your memory.
The awesomebar suggests URLS by an algorithm called The Places frecency algorithm.
According the Mozilla developer website:
The algorithm the location bar in Firefox 3.0 is bit complicated. It will get data from two (three for Firefox 3.5 and later) different queries:
For the first query, it checks the moz_inputhistory table to see if the current search string is stored in that table. These results are sorted by "rank" which is a number that determines how recently it is used. This number degrades once a day. This search is what makes the location bar adaptive to what you select over time (implemented in bug 95739).
In Firefox 3.5 and later, it then checks the moz_keyword table for bookmarks with a keyword that match the search text exactly.
The last query, it goes through every entry in moz_places, which will include all history visits and bookmarks. These results are ordered by frecency.
For all three of these cases, the following algorithm is used for matching against the tags, title, and the url (called "searchable text" below). This is a bit difficult to explain in words, so it might be easier to look at the source code.
Hopefully that explains it in an understandable way!