Title: Mining sequential patterns from probabilistic databases
Authors: Muzammal, Muhammad; Raman, Rajeev
First Published: 24-Jul-2014
Publisher: Springer Verlag
Citation: Knowledge and Information Systems, 2014.
Abstract: This paper considers the problem of sequential pattern mining (SPM) in probabilistic databases. Specifically, we consider SPM in situations where there is uncertainty in associating an event with a source, model this kind of uncertainty in the probabilistic database framework, and consider the problem of enumerating all sequences whose expected support is sufficiently large. We give an algorithm based on dynamic programming to compute the expected support of a sequential pattern. Next, we propose three algorithms for mining sequential patterns from probabilistic databases. The first two algorithms are based on the candidate generation framework – one each based on a breadth-first (similar to GSP) and a depth-first (similar to SPAM) exploration of the search space. The third one is based on the pattern growth framework (similar to PrefixSpan). We propose optimizations that mitigate the effects of the expensive dynamic programming computation step. We give an empirical evaluation of the probabilistic SPM algorithms and the optimizations, and demonstrate the scalability of the algorithms in terms of CPU time and memory usage. We also demonstrate the effectiveness of the probabilistic SPM framework in extracting meaningful sequences in the presence of noise.
DOI Link: 10.1007/s10115-014-0766-7
ISSN: 0219-1377
Version: Post-print
Status: Peer-reviewed
Type: Journal Article
Rights: Copyright © 2014, Springer Verlag. Deposited with reference to the publisher’s archiving policy available on the SHERPA/RoMEO website.
Description: The file associated with this record is embargoed until 12 months after the date of publication. The final published version may be available through the links above.
Appears in Collections: Published Articles, Dept. of Computer Science

Files in This Item:
File               Description                    Size       Format
pakdd-journal.pdf  Post-review (final submitted)  466.25 kB  Adobe PDF

Items in LRA are protected by copyright, with all rights reserved, unless otherwise indicated.