INTRODUCTIONSUPPORT, FIRST INTRODUCED IN [6]. CHUI ET. AL. [5, 6] T...

1. INTRODUCTION

support, first introduced in [6]. Chui et. al. [5, 6] take

the uncertainty of items into account by computing the ex-

Association rule analysis is one of the most important

pected support of itemsets. Itemsets are considered frequent

fields in data mining. It is commonly applied to market-

if the expected support exceeds minSup. Effectively, this

basket databases for analysis of consumer purchasing be-

approach returns an estimate of whether an object is fre-

haviour. Such databases consist of a set of transactions,

quent or not with no indication of how good this estimate

is. Since uncertain transaction databases yield uncertainty

w.r.t. the support of an itemset, the probability distribution

of the support and, thus, information about the confidence

Permission to make digital or hard copies of all or part of this work for

personal or classroom use is granted without fee provided that copies are

of the support of an itemset is very important. This in-

not made or distributed for profit or commercial advantage and that copies

formation, while present in the database, is lost using the

bear this notice and the full citation on the first page. To copy otherwise, to

expected support approach.

republish, to post on servers or to redistribute to lists, requires prior specific

permission and/or a fee.

Example 1. Consider a department store. To maximize

KDD’09, June 28–July 1, 2009, Paris, France.

sales, customers can be analysed to find sets of items that are

Copyright 2009 ACM 978-1-60558-495-9/09/06 ...$5.00.

ID Transaction

World TransactionDB Prob.

Customer Item Prob.

t

1

(A, 0.8) ; (B, 0.2) ; (D, 0.5) ; (F, 1.0)

1

{Game} ; {}

0.144

A Game 1.0

t

2

(B, 0.1) ; (C, 0.7) ; (D, 1.0) ; (E, 1.0) ; (G, 0.1)

2

{Game, Music} ; {}

0.036

A Music 0.2

t

3

(A, 0.5) ; (D, 0.2) ; (F, 0.5) ; (G, 1.0)

3

{Game} ; {Video}

0.096

B Video 0.4

t

4

(D, 0.8) ; (E, 0.2) ; (G, 0.9)

4

{Game, Music} ; {Video}

0.024

B Music 0.7

t

5

(C, 1.0) ; (D, 0.5) ; (F, 0.8) ; (G, 1.0)

5

{Game} ; {Music}

0.336

6

{Game, Music} ; {Music}

0.084

t

6

(A, 1.0) ; (B, 0.2) ; (C, 0.1)

7

{Game} ; {Video, Music}

0.224

t

A

(Game, 1.0) ; (Music, 0.2)

0.056

8

{Game, Music} ;

Figure 2: Example of a larger uncertain transaction

t

B

(Video, 0.4) ; (Music, 0.7)

{Video, Music}

database.

(b) Corresponding

(a) Uncertain Trans-

possible worlds

action Database

worlds derived from Figure 1(a). For example, in world 6

both customers bought music, customer B decided against

Figure 1: Example application of an uncertain trans-

a new video and customer A bought a new game.

action database.

We assume that uncertain transactions are mutually inde-

pendent. Thus, the decision by customer A has no influence

all purchased by a large group of customers. This informa-

on customer B. This assumption is reasonable in real world

tion can be used for advertising directed at this group. For

applications. Additionally, independence between items is

example, by providing special offers that include all of these

often assumed in the literature [5, 6]. This can be justi-

items along with new products, the store can encourage new

fied by the assumption that the items are observed indepen-

purchases. Figure 1(a) shows such customer information.

dently. In this case, the probability of a world w is given

Here, customer A purchases games every time he visits the

by:

store and music (CDs) 20% of the time. Customer B buys

P(x ∈ t) ∗ Y

P (w) = Y

( Y

(1 − P (x ∈ t)))

music in 70% of her visits and videos (DVDs) in 40% of

x∈t

t∈I

them. The supermarket uses a database that represents each

x / ∈t

For example, the probability of world 5 in Figure 1(b) is

customer as a single uncertain transaction, also shown in

P (Game ∈ t A ) ∗ (1 − P (M usic ∈ t A )) ∗ P (M usic ∈ t B ) ∗

Figure 1(a).

(1 − P (V ideo ∈ t B )) = 1.0 ∗ 0.8 ∗ 0.7 ∗ 0.6 = 0.336.