Wednesday, June 20, 2007

Automated vs Manual. How much faster?

Everyone knows that if the same task can be performed both by a person and by a computer, the computer will complete it much faster. The question is how much faster. The answer depends on many things, so let's be specific and estimate the difference for a task that is familiar to us: web search. If you don't want to read all the mathematical details, you can go straight to the conclusions, view the performance graphs, or check our online productivity calculator.
[Productivity estimate graphs: total time and results-per-minute estimates for search engines with 10 results per page]

If you want to know the details of our calculations, keep on reading. The task is to compile a list of search results. Here is the manual approach:

  1. Open a search engine's web page, enter a query, and click “Search”. The timing begins here.
  2. The web page with search results is being loaded. On slow connections and with slow search engines this may take some time. Let's call it “load time” (tload).
  3. Then the searcher stores the results somewhere: Word, Excel, or whatever tool is used to keep the gathered search results. This may take anywhere from a dozen seconds to several minutes (e.g., if you want to store the URLs, titles, and descriptions of the results in separate columns of a spreadsheet or database). Let's call it "store time" (tstore).
  4. Now the searcher clicks “Next” and repeats steps 2 and 3 for the next page of search results.
  5. Lather, rinse, and repeat until the desired number of search results is gathered.

Here is the simplest and most straightforward automated approach (a rough code sketch follows the list):

  1. Open a program, enter a query, and click “Search”. The timing begins here.
  2. As in the manual approach, the web page with search results will be compiled by the search engine and then downloaded by our program. This will most likely result in the same “load time” (tload).
  3. Then the program will store the search results in its list. This will be the "automated store time" (tautomated_store).
  4. Now the program loads the next page of search results.
  5. Lather, rinse, repeat until the desired number of search results is gathered.
  6. If needed, the complete list of search results can be exported to an application of your choice (e.g., Word, Excel, etc.).
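
As a rough illustration, here is what such a loop might look like in Python. The search URL format and the result-matching pattern below are hypothetical placeholders, not the markup of any particular search engine; a real implementation would need to be adapted to the engine it targets.

  # Rough sketch of the automated loop. The URL format and the result
  # pattern are hypothetical placeholders, not any real engine's markup.
  import re
  import urllib.parse
  import urllib.request

  def gather_results(query, n_pages, per_page=10):
      results = []
      for page in range(n_pages):
          # Hypothetical query-string format (steps 1 and 4: load a results page)
          url = ("http://searchengine.example/search?q="
                 + urllib.parse.quote(query)
                 + "&start=" + str(page * per_page))
          with urllib.request.urlopen(url) as response:  # "load time"
              html = response.read().decode("utf-8", "replace")
          # Hypothetical pattern for result links ("automated store time")
          results.extend(re.findall(r'href="([^"]+)" class="result"', html))
      return results
  # The returned list can then be exported to Word, Excel, etc. (step 6).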

Let's assume that the search engine provides N results per page. The number of pages we need to process will then be:

Npages = Ntotal_results / Nresults_per_page

So far the calculations are pretty easy. The time needed to compile the list of search results will be:

Tmanual = Npages * (tload + tstore)

and

Tautomated = Npages * (tload + tautomated_store)

correspondingly. However, while the formula for the automated approach is rather accurate, things are not that simple with humans. With such repetitive operations, we humans quickly get tired and bored. It is easy for us to get sidetracked in the middle of a task. We can be interrupted by a phone call or a colleague. We might want to check some search result right away or go get some coffee. All these things make the actual total time for the manual approach greater than the result you would obtain from the formula above.

Let's add this notorious "human factor" to the formula. Clearly, the more pages the searcher needs to process, the more tired she gets and the more chances she has to get sidetracked or disturbed. So let's introduce a fatigue coefficient (k) that indicates how much slower the store time becomes for each successive page. A fatigue coefficient of 1.05 means that storing the results of each successive page takes 5% more time: for example, 20 seconds for the first page, 21 seconds for the second, 22.05 seconds for the third, and so on. Now the formula is more accurate, but also more complex:

Tmanual = n*tload + tstore + tstore*k + tstore*k^2 + ... + tstore*k^(n-1), where n is Npages

(Summing the geometric series, this equals n*tload + tstore*(k^n - 1)/(k - 1).)

Most likely you will need a spreadsheet to calculate the total time for a significant number of pages. Or you can use our online productivity calculator. It estimates the total times and RPM (results per minute) for both approaches and compares them in terms of saved time and money (check the advanced options). The fatigue coefficient used in the calculator is 1.03 (3%), and 8-hour work days are assumed when estimating long-lasting tasks.
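
To make the numbers concrete, here is a minimal Python sketch of the estimate above. The parameter values (load time, store times, fatigue coefficient) are illustrative assumptions, not measurements, and this is not the code behind our online calculator.

  # Minimal sketch of the time estimate; all parameter values are assumptions.

  def manual_time(n_pages, t_load, t_store, k):
      # n*t_load plus a store time that grows by a factor of k on every page
      total = n_pages * t_load
      page_store = t_store
      for _ in range(n_pages):
          total += page_store
          page_store *= k
      return total

  def automated_time(n_pages, t_load, t_auto_store):
      # every page costs the program the same amount of time
      return n_pages * (t_load + t_auto_store)

  if __name__ == "__main__":
      n_results, per_page = 1000, 10
      n_pages = n_results // per_page
      t_m = manual_time(n_pages, t_load=5.0, t_store=20.0, k=1.03)
      t_a = automated_time(n_pages, t_load=5.0, t_auto_store=1.0)
      print("Manual:    %.1f hours (%.1f results per minute)"
            % (t_m / 3600.0, n_results / (t_m / 60.0)))
      print("Automated: %.1f minutes (%.1f results per minute)"
            % (t_a / 60.0, n_results / (t_a / 60.0)))

Conclusions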

  1. The productivity of an automated search solution doesn't depend on the number of search results you need to gather. The cost of processing each page is the same.
  2. The productivity of human searchers decreases with every additional page of search results.

Which approach to choose?

  • If you only need one page of search results, you shouldn't bother with any specialized solution. Just go to your favorite search engine.
  • If you need to collect search results from about a dozen web pages, the advantages of an automated solution will be minimal. You should consider automation only if you often perform such tasks.
  • If you need to process really large volumes of search results, an automated solution is the only reasonable choice. It would take a whole work day for a human searcher to collect results from 300 web pages, while a computer program can complete such a task in just half an hour.

Any questions and comments are welcome.


Saturday, June 09, 2007

Engine update: Ask.com, Search.com, Voila

The following search sources in FirstStop WebSearch have been updated today.
  • Ask.com
  • Search.com
  • Voila

Don't forget to check for updates from time to time (menu "File/Update Engines").
