1. I have wanted to write this article for a long time, but I kept getting entangled in the specific technical details of how search engines work. The more I read, the less I knew where to begin, so I kept putting it off. That is why this article had such a slow and painful birth.
2. Everyone knows that search is a complex thing. This article tries to approach it mainly from a non-technical point of view and to design a site search system for B2C sites without going into too much technical detail. The specific implementation cost is not considered here (whether it is plain SQL plus a cache, custom development on top of a full-text search engine such as Lucene or Sphinx, or even buying code from Baidu or Google and customizing it, listen to your programmers; it is not your call).
3. This article repeatedly says "site search" rather than "site search engine". There is a big difference between the two (I am not sure whether the final design will be a real site search engine).
4. This article draws on a lot of material, listed below for reference and further study:
"Web Information Architecture – designing large websites", a classic book (not recommended for novices)
Several articles at http://s.blog.xiqiao.info/2009/06/02/343
Several articles from beauty of Ciccio
Some papers on full-text retrieval
With that out of the way, on to the main text.
1. Before you start planning site search for a B2C site, you need to think about the following two questions:
What problem does site search solve, and why does it matter?
Consider two common search scenarios:
Xiao Li, a user, is already familiar with website A and wants to buy a computer. He knows that website A sells computers, so he types a specific keyword directly, such as "Thinkpad X series", to run a precise query.
Xiao Bai, a user, has heard of B2C website A and visits it for the first time, where he sees a dazzling array of goods. He has only browsed similar websites before, or has a fairly broad understanding of the current product categories, and he wants to quickly locate some products he already has in mind. So he enters broader keywords for a fuzzy search, such as "wool coat" or "cotton T-shirt".
(1) Site search meets the needs of exactly these two types of users.
(2) Site search reveals the potential needs of users, through analysis of how often they search for each keyword. I have always had an idea about this. Suppose you find a large number of searches for keywords in some product category A, and A happens to be something the website does not carry. To reduce the risk, the website can use pre-ordering to list A products that match the target keywords
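To make the keyword-frequency idea a little more concrete, here is a minimal sketch in Python. It assumes the site keeps a search log of (keyword, number of results returned) pairs; the function name `unmet_demand`, the `min_searches` threshold, and the sample log are all hypothetical and only for illustration. It counts how often each keyword is searched and keeps the ones that are searched often but never return any product, which is one rough signal of demand the site does not yet serve.

```python
from collections import Counter


def unmet_demand(search_log, min_searches=3):
    """Return (keyword, count) pairs that were searched at least
    min_searches times and never returned a single result.

    search_log is an iterable of (keyword, result_count) pairs.
    The log format is an assumption, not a real site's schema.
    """
    counts = Counter()
    has_results = set()
    for keyword, result_count in search_log:
        # Normalize so "Yoga Mat" and "yoga mat " count as one keyword.
        key = keyword.strip().lower()
        counts[key] += 1
        if result_count > 0:
            has_results.add(key)
    # most_common() yields keywords from most to least searched.
    return [
        (kw, n)
        for kw, n in counts.most_common()
        if n >= min_searches and kw not in has_results
    ]


# Invented sample data: the site sells clothing but no yoga mats.
log = [
    ("wool coat", 12),
    ("Yoga Mat", 0),
    ("yoga mat", 0),
    ("cotton T-shirt", 30),
    ("yoga mat ", 0),
    ("wool coat", 12),
]
print(unmet_demand(log))
```

In a real system the threshold would depend on traffic, and a keyword with zero results might also just be a typo, so the output is better treated as a list of candidates for a person to review than as a purchasing decision.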