IWA Publishing

Journal of Water and Health Vol 5 No 4 pp 503–509 © US Government 2007 doi:10.2166/wh.2007.044

A statistical appraisal of disproportional versus proportional microbial source tracking libraries

Brian J. Robinson, Kerry J. Ritter and R. D. Ellender

National Oceanic and Atmospheric Administration, 219 Fort Johnson Road, Charleston, SC 29412-9110, USA. Tel: +1 843 762-8572; Fax: +1 843 762-8700; E-mail: brianjrobinson1979@yahoo.com
Southern California Coastal Water Research Project, 7171 Fenwick Lane, Westminster, CA 92683, USA
The University of Southern Mississippi, Department of Biological Sciences, 118 College Drive #5018, Hattiesburg, MS 39406-0001, USA


ABSTRACT

Library-based microbial source tracking (MST) can assist in reducing or eliminating fecal pollution in waters by predicting sources of fecal-associated bacteria. Library-based MST relies on an assembly of genetic or phenotypic “fingerprints” from pollution-indicative bacteria cultivated from known sources; fingerprints of unknown origin are compared against this library to identify their source. The success of the library-based approach depends on how well each candidate source is represented in the library and on which statistical algorithm or matching criterion is used to match unknowns. Because known-source libraries are often built on the basis of convenience or cost, some sources may be better represented in the library than others. Depending on the statistical algorithm or matching criterion, predictions may become severely biased toward classifying unknowns into the library's dominant source category. We examined prediction bias for four of the most commonly used statistical matching algorithms in library-based MST when applied to disproportionately represented known-source libraries: maximum similarity (MS), average similarity (AS), discriminant analysis (DA), and k-nearest neighbor (k-NN). MS was particularly sensitive to disproportionate source representation. AS and DA were more robust. k-NN provided a compromise between correct prediction and sensitivity to disproportional libraries, with increased matching success and stability, and should be considered when matching unknowns to disproportionally represented libraries.
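As a rough illustration of how three of the matching rules named above can disagree on a disproportional library, the following toy sketch may help. It is not the authors' implementation: the one-dimensional "fingerprints", the similarity measure, the library contents, and all function names are assumptions made for this example, and DA is omitted for brevity.

```python
from collections import Counter, defaultdict

# Hypothetical 1-D "fingerprints" (real MST fingerprints are multivariate).
# "cow" is over-represented (6 isolates vs. 3) and happens to include one
# atypical isolate (2.0) that lies close to the "human" cluster.
LIBRARY = [
    (1.0, "human"), (1.2, "human"), (0.8, "human"),
    (3.0, "cow"), (3.1, "cow"), (2.9, "cow"),
    (3.2, "cow"), (2.8, "cow"), (2.0, "cow"),
]


def similarity(a, b):
    """Assumed similarity measure: 1 for identical values, decaying with distance."""
    return 1.0 / (1.0 + abs(a - b))


def max_similarity(unknown, library):
    """MS: assign the source of the single most similar library isolate."""
    _, source = max(library, key=lambda item: similarity(unknown, item[0]))
    return source


def average_similarity(unknown, library):
    """AS: assign the source whose isolates are most similar on average."""
    by_source = defaultdict(list)
    for fingerprint, source in library:
        by_source[source].append(similarity(unknown, fingerprint))
    return max(by_source, key=lambda s: sum(by_source[s]) / len(by_source[s]))


def k_nearest(unknown, library, k=3):
    """k-NN: majority vote among the k most similar library isolates."""
    ranked = sorted(library, key=lambda item: similarity(unknown, item[0]),
                    reverse=True)
    votes = Counter(source for _, source in ranked[:k])
    return votes.most_common(1)[0][0]
```

For an unknown at 1.7, MS returns "cow" because the single closest isolate is the atypical cow isolate at 2.0, whereas AS and k-NN with k = 3 both return "human". A larger source category has more chances to contribute such a stray close match, which is one intuitive reason a single-best-match rule can lean toward the dominant source; this toy case does not reproduce the study's analysis.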

Keywords: disproportional; library; microbial source tracking; proportional; statistics

