Please use this identifier to cite or link to this item: http://hdl.handle.net/2381/28080
Title: Mining Sequential Patterns from Probabilistic Databases
Authors: Muzammal, Muhammad
Raman, Rajeev
First Published: 2011
Presented at: 15th Pacific-Asia Conference, PAKDD 2011, Shenzhen, China
Start Date: 24-May-2011
End Date: 27-May-2011
Publisher: Springer Berlin Heidelberg
Citation: Muzammal, M. & Raman, R. ‘Mining Sequential Patterns from Probabilistic Databases’ in Huang, J.Z., Cao, L. & Srivastava, J. (Eds.) Advances in Knowledge Discovery and Data Mining, LNCS 6635 (© 2011, Springer Berlin Heidelberg), pp. 210-221
Abstract: We consider sequential pattern mining in situations where there is uncertainty about which source an event is associated with. We model this in the probabilistic database framework and consider the problem of enumerating all sequences whose expected support is sufficiently large. Unlike frequent itemset mining in probabilistic databases [C. Aggarwal et al. KDD’09; Chui et al., PAKDD’07; Chui and Kao, PAKDD’08], we use dynamic programming (DP) to compute the probability that a source supports a sequence, and show that this suffices to compute the expected support of a sequential pattern. Next, we embed this DP algorithm into candidate generate-and-test approaches, and explore the pattern lattice both in a breadth-first (similar to GSP) and a depth-first (similar to SPAM) manner. We propose optimizations for efficiently computing the frequent 1-sequences, for re-using previously-computed results through incremental support computation, and for elmiminating candidate sequences without computing their support via probabilistic pruning. Preliminary experiments show that our optimizations are effective in improving the CPU cost.
Series/Report no.: Lecture Notes in Computer Science;6635
DOI Link: 10.1007/978-3-642-20847-8_18
ISSN: 0302-9743
ISBN: 978-3-642-20846-1
Links: http://link.springer.com/chapter/10.1007/978-3-642-20847-8_18
http://hdl.handle.net/2381/28080
Type: Conference Paper
Rights: Copyright © 2011, Springer Berlin Heidelberg. Deposited with reference to the publisher’s archiving policy. The final publication is available at link.springer.com.
Description: Full text of this item is not currently available on the LRA. The final published version may be available through the links above.
Appears in Collections:Conference Papers & Presentations, Dept. of Computer Science

Files in This Item:
There are no files associated with this item.


Items in LRA are protected by copyright, with all rights reserved, unless otherwise indicated.