Uwe F. Mayer and Armand Sarkissian
Abstract:
Data mining techniques are routinely used by fundraisers to select,
from a large pool of candidates, those prospects who are most likely to
make a financial contribution. These techniques often rely on
statistical models based on trial performance data. This trial
performance data is typically obtained by soliciting a smaller sample
of the possible prospect pool. Collecting this trial data involves a
cost; therefore the fundraiser is interested in keeping the trial size
small while still collecting enough data to build a reliable
statistical model that will be used to evaluate the remainder of the
prospects.
We describe an experimental design approach to optimally choose the
trial prospects from an existing large pool of prospects. Prospects
are clustered to render the problem practically tractable. We modify
the standard D-optimality algorithm to prevent repeated selection of
the same prospect cluster, since each prospect can be solicited
at most once.
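
To make the no-repeat constraint concrete, the following is a minimal
sketch of a greedy D-optimal selection over cluster representatives. It
is an illustration under stated assumptions, not the modified algorithm
from the article: the function name greedy_d_optimal, the ridge
parameter, and the greedy (rather than exchange-based) search are all
hypothetical choices made here. Each row of X is assumed to be the
feature vector of one prospect cluster.

```python
import numpy as np

def greedy_d_optimal(X, n_select, ridge=1e-3):
    """Greedily pick n_select distinct rows of X to enlarge det(X_s^T X_s).

    Hypothetical helper for illustration; `ridge` is an assumed small
    regularizer that keeps the information matrix invertible at the start.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if n_select > n:
        raise ValueError("cannot select more rows than candidates")

    # Inverse of the regularized information matrix M = ridge * I.
    M_inv = np.eye(p) / ridge
    available = np.ones(n, dtype=bool)
    chosen = []

    for _ in range(n_select):
        # det(M + x x^T) = det(M) * (1 + x^T M^{-1} x), so the row with
        # the largest x^T M^{-1} x gives the largest determinant gain.
        gain = np.einsum("ij,jk,ik->i", X, M_inv, X)
        # Rows already chosen are masked out: no cluster is picked twice.
        gain[~available] = -np.inf
        best = int(np.argmax(gain))
        chosen.append(best)
        available[best] = False

        # Sherman-Morrison rank-one update of M^{-1} after adding x x^T.
        x = X[best]
        Mx = M_inv @ x
        M_inv -= np.outer(Mx, Mx) / (1.0 + x @ Mx)

    return chosen
```

Calling greedy_d_optimal(X, k) returns k distinct row indices. The mask
on already-chosen rows is what distinguishes this from a standard
D-optimal design, which may replicate a design point.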
We assess the benefits of this approach on the KDD-98 data set by
comparing the performance of the model based on the optimal trial data
set with that of a model based on a randomly selected trial data set
of equal size.
Key words: Experimental design, solicitation campaign, data collection.
You can download a copy of this article (about 5 pages plus references).