Title: How to Make Best Use of Cross-Company Data for Web Effort Estimation?
Authors: Minku, Leandro Lei; Sarro, Federica; Mendes, Emilia; Ferrucci, Filomena
First Published: Oct-2015
Presented at: Beijing
Start Date: 22-Oct-2015
End Date: 23-Oct-2015
Publisher: ACM, IEEE
Citation: Proceedings of the 9th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), pp. 172-181
Abstract: [Context]: The numerous challenges that can hinder software companies from gathering their own data have motivated research, over the past 15 years, on the use of cross-company (CC) datasets for software effort prediction. Part of this research has focused on Web effort prediction, given the large worldwide increase in the development of Web applications. Some of these studies indicate that it may be possible to achieve better performance using CC models if some strategy is adopted to make the CC data more similar to the within-company (WC) data. [Goal]: This study investigates the use of a recently proposed approach called Dycom to assess to what extent Web effort predictions obtained using CC datasets are effective in relation to the predictions obtained using WC data when explicitly mapping the CC models to the WC context. [Method]: Data on 125 Web projects from eight different companies that are part of the Tukutuku database were used to build prediction models. We benchmarked these models against baseline models (mean and median effort) and a WC base learner that does not benefit from the mapping. We also compared Dycom against a competitive CC approach from the literature (NN-filtering). We report a company-by-company analysis. [Results]: Dycom usually managed to achieve similar or better performance than a WC model while using only half of the WC training data. These results are also an improvement over previous studies that investigated the use of different strategies to adapt CC models to the WC data for Web effort estimation. [Conclusions]: We conclude that the use of Dycom for Web effort prediction is quite promising and that it in general supports previous results obtained when applying Dycom to conventional software datasets.
DOI Link: 10.1109/ESEM.2015.7321199
ISBN: 978-1-4673-7899-4
Version: Post-print
Status: Peer-reviewed
Type: Conference Paper
Rights: Copyright © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Description: Archived in accordance with the publisher's posting policy, available at
Appears in Collections: Conference Papers & Presentations, Dept. of Computer Science

Files in This Item:
File | Description | Size | Format
MinkuESEM2015.pdf | Post-review (final submitted) | 251.17 kB | Adobe PDF
best-paper-award-ESEM2015.pdf | | 179.1 kB | Adobe PDF

Items in LRA are protected by copyright, with all rights reserved, unless otherwise indicated.