What Are Search Algorithms, Retrieval, and Ranking?

All of the parts of the search engine are important, but the search algorithm is the cog that makes everything work. It might be more accurate to say that the search algorithm is the foundation on which everything else is built. How a search engine works is based on the search algorithm, or the way that data is discovered by the user. 

In very general terms, a search algorithm is a problem-solving procedure that takes a problem, evaluates a number of possible answers, and then returns the solution to that problem. A search algorithm for a search engine takes the problem (the word or phrase being searched for), sifts through a database that contains cataloged keywords and the URLs those words are related to, and then returns pages that contain the word or phrase that was searched for, either in the body of the page or in a URL that points to the page.
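
The lookup described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine is built: the pages, the example.com URLs, and the function names are all invented, and the "database" is a simple in-memory mapping from each keyword to the URLs related to it.

```python
import re
from collections import defaultdict

# Hypothetical pages (URL -> body text), invented for illustration.
PAGES = {
    "http://example.com/puppies": "Puppies are young dogs",
    "http://example.com/kittens": "Kittens are young cats",
    "http://example.com/pets": "Dogs and cats are popular pets",
}

def tokenize(text):
    """Lowercase the text and split it into words on non-alphanumeric characters."""
    return [word for word in re.split(r"[^a-z0-9]+", text.lower()) if word]

def build_index(pages):
    """Catalog every keyword found in a page body or URL against that URL."""
    index = defaultdict(set)
    for url, body in pages.items():
        for word in tokenize(body) + tokenize(url):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query, sorted by URL."""
    words = tokenize(query)
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

index = build_index(PAGES)
print(search(index, "young dogs"))
```

Notice that the search never reads the pages themselves at query time. It only consults the catalog of keywords, which is why the quality of that catalog matters so much.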

This neat little trick is accomplished differently according to the algorithm that’s being used. There are several classifications of search algorithms, and each search engine uses algorithms that are slightly different. That’s why a search for one word or phrase will yield different results from different search engines. Some of the most common types of search algorithms include the following:

  • List search: A list search algorithm searches through specified data looking for a single key. The data is searched in a very linear, list-style method. The result of a list search is usually a single element, which means that searching through billions of web sites could be very time-consuming, but would yield a smaller search result.
  • Tree search: Envision a tree in your mind. Now, examine that tree either from the roots out or from the leaves in. This is how a tree search algorithm works. The algorithm searches a data set from the broadest to the most narrow, or from the most narrow to the broadest. Data sets are like trees; a single piece of data can branch to many other pieces of data, and this is very much how the Web is set up. Tree searches, then, are more useful when conducting searches on the Web, although they are not the only searches that can be successful.
  • SQL search: One of the difficulties with a tree search is that it’s conducted in a hierarchical manner, meaning it’s conducted from one point to another, according to the ranking of the data being searched. A SQL (pronounced See-Quel) search allows data to be searched in a non-hierarchical manner, which means that data can be searched from any subset of data. 
  • Informed search: An informed search algorithm looks for a specific answer to a specific problem in a tree-like data set. The informed search, despite its name, is not always the best choice for web searches because of the general nature of the answers being sought.
  • Adversarial search: An adversarial search algorithm looks for all possible solutions to a problem, much like finding all the possible solutions in a game. This algorithm is difficult to use with web searches, because the number of possible solutions to a word or phrase search is nearly infinite on the Web. 
  • Constraint satisfaction search: When you think of searching the Web for a word or phrase, the constraint satisfaction search algorithm is most likely to satisfy your desire to find something. In this type of search algorithm, the solution is discovered by meeting a set of constraints, and the data set can be searched in a variety of different ways that do not have to be linear. Constraint satisfaction searches can be very useful for searching the Web.
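
To make the first two types in this list more concrete, here is a minimal Python sketch of a list search and a tree search. The site structure and the function names are assumptions made up for the example, and the tree search shown is a breadth-first search, which is just one of several ways to walk a tree from the broadest level to the most narrow.

```python
from collections import deque

def list_search(items, key):
    """Linear scan: return the position of the first match, or -1 if absent."""
    for position, item in enumerate(items):
        if item == key:
            return position
    return -1

# Hypothetical site structure: each page maps to the pages it branches to.
TREE = {
    "home": ["pets", "about"],
    "pets": ["puppies", "kittens"],
    "about": [],
    "puppies": [],
    "kittens": [],
}

def tree_search(tree, root, key):
    """Breadth-first search from the root; return the path to key, or None."""
    queue = deque([[root]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == key:
            return path
        for child in tree.get(node, []):
            queue.append(path + [child])
    return None

print(list_search(["home", "about", "pets"], "pets"))
print(tree_search(TREE, "home", "kittens"))
```

The list search has to look at every element until it finds the key, while the tree search follows branches and returns the route it took to reach the result.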

These are only a few of the various types of search algorithms that are used when creating search engines. And very often, more than one type of search algorithm is used, or as happens in most cases, some proprietary search algorithm is created. The key to maximizing your search engine results is to understand a little about how each search engine you’re targeting works. Only when you understand this can you know how to maximize your exposure to meet the search requirements for that search engine.

Retrieval and ranking

For a web search engine, the retrieval of data is a combination activity of the crawler (or spider or robot), the database, and the search algorithm. Those three elements work in concert to retrieve the word or phrase that a user enters into the search engine’s user interface. And as noted earlier, how that works can be a proprietary combination of technologies, theories, and coding whizbangery. 

The really tricky part comes in the results ranking. Ranking is also what you’ll spend the most time and effort trying to affect. Your ranking in a search engine determines how often people see your page, which affects everything from revenue to your advertising budget. Unfortunately, how a search engine ranks your page or pages is a tough science to pin down.

The most that you can hope for, in most cases, is to make an educated guess as to how a search engine ranks its results, and then try to tailor your page to meet those results. But keep in mind that, although retrieval and ranking are listed as separate subjects here, they’re actually part of the search algorithm. The separation is to help you better understand how search engines work.

Ranking plays such a large part in search engine optimization that you’ll see it frequently in this book. You’ll look at ranking from every possible facet before you reach the last page. But for now, let’s look at just what affects ranking. Keep in mind, however, that different search engines use different ranking criteria, so the importance each of these elements plays will vary.

  • Location: Location doesn’t refer here to the location (as in the URL) of a web page. Instead, it refers to the location of keywords and phrases on a web page. So, for example, if a user searches for “puppies,” some search engines will rank the results according to where on the page the word “puppies” appears. Obviously, the higher the word appears on the page, the higher the rank might be. So a web site that contains the word “puppies” in the title tag will likely appear higher than a web site that is about puppies but does not contain the word in the title tag. What this means is that a web site that’s not designed with SEO in mind will likely not rank where you would expect it to rank. The site www.puppies.com is a good example of this. In a Google search, it appears ranked fifth rather than first, potentially because it does not contain the keyword in the title tag.
  • Frequency: The frequency with which the search term appears on the page may also affect how a page is ranked in search results. So, for example, on a page about puppies, one that uses the word five times might be ranked higher than one that uses the word only two or three times. When word frequency became a factor, some web site designers began using hidden words hundreds of times on pages, trying to artificially boost their page rankings. Most search engines now recognize this as keyword spamming and ignore or even refuse to list pages that use this technique.
  • Links: One of the more recent ranking factors is the type and number of links on a web page. Links that come into the site, links that lead out of the site, and links within the site are all taken into consideration. It would follow, then, that the more links you have on your page or leading to your page the higher your rank would be, right? Again, it doesn’t necessarily work that way. More accurately, the number of relevant links coming into your page, versus the number of relevant links within the page, versus the number of relevant links leading off the page will have a bearing on the rank that your page gets in the search results. 
  • Click-throughs: One last element that might determine how your site ranks against others in a search is the number of click-throughs your site has versus click-throughs for other pages that are shown in page rankings. Because search engines cannot monitor site traffic for every site on the Web, some monitor the number of clicks each search result receives. The rankings may then be repositioned in a future search, based on this interaction with the users.
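
One way to picture how these four factors might combine is a simple weighted score. The Python sketch below is purely hypothetical: the weights, the page fields (title, body, inbound_links, clicks), and the formula are invented for illustration, and real search engines keep their ranking formulas private.

```python
# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"title": 3.0, "frequency": 1.0, "links": 0.5, "clicks": 0.25}

def score(page, term, weights=WEIGHTS):
    """Combine location, frequency, links, and click-throughs into one number."""
    term = term.lower()
    # Location: here, simply whether the term appears in the page title.
    in_title = 1 if term in page["title"].lower().split() else 0
    # Frequency: how many times the term appears in the body text.
    frequency = page["body"].lower().split().count(term)
    return (
        weights["title"] * in_title
        + weights["frequency"] * frequency
        + weights["links"] * page["inbound_links"]
        + weights["clicks"] * page["clicks"]
    )

def rank(pages, term):
    """Order pages from the highest score to the lowest."""
    return sorted(pages, key=lambda page: score(page, term), reverse=True)

# Invented sample pages.
pages = [
    {"url": "b", "title": "Home", "body": "puppies",
     "inbound_links": 2, "clicks": 8},
    {"url": "a", "title": "Puppies for sale", "body": "puppies puppies",
     "inbound_links": 2, "clicks": 8},
]
print([page["url"] for page in rank(pages, "puppies")])
```

In this toy version, the page with the term in its title wins even though both pages have the same links and clicks. Change the weights and the order can change too, which is one way to see why different search engines return different rankings.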

Page ranking is a very precise science, and it differs from search engine to search engine. To create the best possible SEO for your site, it’s necessary to understand how these page rankings are made for the search engines you plan to target. Those factors can then be taken into consideration and used to your advantage when it’s time to create, change, or update the web site that you want to optimize.

Characteristics of Search 

Understanding how a search engine works helps you to understand how your pages are ranked in the search engine, but how your pages are found is another story entirely. That’s where the human element comes in. Search means different things to different people. For example, one of my colleagues searches the Internet using the same words and phrases he would use to tell someone about a topic, or even the exact question that he’s trying to get answered. This is called natural language search. Another colleague, however, was trained in Boolean search techniques. She uses a very different syntax when she’s creating a search term. The two of them get different search results, even when both are using the same search engine.
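
The difference between the two styles can be shown with a small Python sketch. It is a simplification built on assumptions: the stop-word list is invented, and real search engines interpret both kinds of query in far more sophisticated ways.

```python
# Hypothetical list of filler words to drop from a natural language question.
STOP_WORDS = {"a", "an", "the", "how", "do", "i", "to", "what", "is", "for"}

def natural_language_terms(question):
    """Reduce a question to its keywords by dropping stop words and punctuation."""
    words = [word.strip("?.,!").lower() for word in question.split()]
    return [word for word in words if word and word not in STOP_WORDS]

def matches_boolean(text, required=(), excluded=()):
    """True if the text has every required word and none of the excluded words."""
    words = set(text.lower().split())
    has_required = all(term in words for term in required)
    has_excluded = any(term in words for term in excluded)
    return has_required and not has_excluded

# Natural language: the user types the question as he would ask it.
print(natural_language_terms("How do I train a puppy?"))

# Boolean: the user states exactly what must and must not appear,
# roughly "puppy AND train NOT kitten".
print(matches_boolean("train your puppy at home",
                      required=["puppy", "train"], excluded=["kitten"]))
```

The natural language searcher leaves it to the search engine to decide which words matter, while the Boolean searcher makes those decisions up front.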

The characteristics of search refer to how users search the Internet. This can be everything from the heuristics they use when creating a search term to the selection the user makes (and the way those selections are made) once the search results are returned. One interesting fact is that more than half of American adults search the Internet every time they go online. And in fact, more people search the Internet than use the yellow pages when they’re looking for phone numbers or the locations of local businesses. 

This wealth of search engine users is fertile ground for SEO targeting. And the better you understand how and why users use search engines, and exactly how search engines work, the easier it will be to achieve the SEO you’re pursuing.

