Estimating Omissions from Searches

Statistical Physics and Complexity Group meeting

  • Event time: 11:30am
  • Event date: 19th August 2011
  • Speaker: Anthony J. Webster (University of Edinburgh)
  • Location: Room 2511,

Event details

We consider the generic problem of estimating how many items have been missed when two or more people search for them in some finite set. The problem appears in a wide variety of contexts, including estimating the number of typographical errors in a paper and assessing the effectiveness of database searches and algorithms. The Lincoln-Petersen estimator provides the simplest estimate. It was developed independently by Petersen in 1896, to estimate the number of fish immigrating from the German Sea to the Limfjord, and by Lincoln in 1930, to estimate waterfowl abundance. Following a more rigorous treatment by Chapman in 1951, the technique has grown in popularity and its literature has expanded rapidly, especially in the context of ecological census techniques.
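As a rough illustration of the simplest estimate, the sketch below applies the standard two-searcher formula to the typo-hunting example. The names `n1`, `n2` and `m` (items found by the first searcher, by the second, and by both) and the sample numbers are assumptions made for this example; they do not come from the talk. The second function is the commonly quoted form of Chapman's bias-corrected variant.

```python
def lincoln_petersen(n1, n2, m):
    """Point estimate of the total number of items, N ~ n1 * n2 / m.

    n1, n2: items found by searcher 1 and searcher 2 (hypothetical names)
    m:      items found by both searchers
    """
    if m == 0:
        raise ValueError("no overlap between the searches; estimate undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's variant, which stays finite even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def estimated_missed(total_estimate, n1, n2, m):
    """Estimated number of items that neither searcher found."""
    distinct_found = n1 + n2 - m
    return total_estimate - distinct_found


if __name__ == "__main__":
    # Illustrative numbers only: two proofreaders find 20 and 15 typos,
    # 10 of which are common to both.
    n1, n2, m = 20, 15, 10
    total = lincoln_petersen(n1, n2, m)
    print(total)                               # 30.0
    print(estimated_missed(total, n1, n2, m))  # 5.0
    print(round(chapman(n1, n2, m), 2))        # 29.55
```

The intuition is that the fraction of the first searcher's items that the second searcher also found (m / n1) estimates the fraction of all items that the second searcher found (n2 / N), assuming the two searches are independent.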

This talk will informally discuss how the estimate is arrived at, and how Bayes' theorem can be used to derive it rigorously and to quantify its accuracy. The resulting probability distribution can be approximated in a number of ways, the most accurate of which will be discussed briefly. Finally, it will be shown how the method can be extended to an arbitrary number of searchers, providing a general solution and a simple method of implementation that could easily be incorporated into software.
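To give a flavour of the Bayesian approach for two searchers, here is a minimal sketch, not the speaker's derivation. It assumes a hypergeometric likelihood for the overlap and a flat prior on the total number of items N, truncated at a hypothetical upper bound `n_max`; the prior, the bound and all names are assumptions chosen for illustration.

```python
from math import comb


def posterior_total(n1, n2, m, n_max):
    """Posterior over the total number of items N for two searchers.

    Assumptions (not from the talk): a flat prior on N between the number
    of distinct items found and n_max, and a hypergeometric likelihood
        P(m | N) = C(n1, m) * C(N - n1, n2 - m) / C(N, n2).
    Returns a dict mapping each candidate N to its posterior probability.
    """
    distinct_found = n1 + n2 - m
    if n_max < distinct_found:
        raise ValueError("n_max must be at least the number of items found")
    weights = {
        n: comb(n1, m) * comb(n - n1, n2 - m) / comb(n, n2)
        for n in range(distinct_found, n_max + 1)
    }
    norm = sum(weights.values())
    return {n: w / norm for n, w in weights.items()}


def summarise(posterior):
    """Return the posterior mode and mean of N."""
    mode = max(posterior, key=posterior.get)
    mean = sum(n * p for n, p in posterior.items())
    return mode, mean


if __name__ == "__main__":
    # Illustrative numbers only: 20 and 15 items found, 8 in common.
    post = posterior_total(20, 15, 8, n_max=200)
    mode, mean = summarise(post)
    print("posterior mode of N:", mode)
    print("posterior mean of N:", round(mean, 1))
    print("items missed at the mode:", mode - (20 + 15 - 8))
```

Because the full distribution is available, the spread of the posterior, not just a point estimate, indicates how far the estimate can be trusted. With a flat prior the tail decays slowly when the overlap m is small, so the result can depend noticeably on the choice of `n_max`.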