Combining Artificial Intelligence, Ontology, and Frequency-based Approaches to Recommend Activities in Scientific Workflows

Adilson Lopes Khouri, Luciano Antonio Digiampietri

Abstract


Scientific workflow management systems provide a large number of activities, and scientists must know many of them to benefit from the reuse these systems make possible. To reduce this burden, the literature presents techniques that recommend activities during scientific workflow construction. In this paper we specify and develop a hybrid activity recommendation system that considers activity frequency, the inputs and outputs of activities, and ontological annotations. We also model activity recommendation as a classification problem and test it using 5 classifiers; 5 regressors; a composite approach that uses a Support Vector Machine (SVM) classifier to combine the results of the other classifiers and regressors into a recommendation; and Rotation Forest, an ensemble of classifiers. The proposed technique was compared with related techniques and with the classifiers and regressors using 10-fold cross-validation, achieving a Mean Reciprocal Rank (MRR) at least 70% greater than those obtained by classical techniques.
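For readers unfamiliar with the evaluation metric, the following is a minimal sketch of the standard Mean Reciprocal Rank definition: for each recommendation query, take the reciprocal of the position at which the correct activity appears in the ranked list (0 if it is absent), then average over all queries. The function name, the activity names, and the toy data are hypothetical and serve only as illustration; the abstract does not describe the authors' exact evaluation code.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    relevant: Sequence[Hashable],
) -> float:
    """Standard MRR: average of 1/rank of the correct item per query.

    rankings[i] is the ranked list of recommended activities for query i,
    relevant[i] is the activity that was actually used next (assumed to be
    a single correct answer per query). A query whose correct activity is
    missing from the ranking contributes 0.
    """
    if len(rankings) != len(relevant):
        raise ValueError("rankings and relevant must have the same length")
    if not rankings:
        return 0.0

    total = 0.0
    for ranked, target in zip(rankings, relevant):
        for position, item in enumerate(ranked, start=1):
            if item == target:
                total += 1.0 / position
                break  # only the first occurrence counts
    return total / len(rankings)


# Hypothetical activity names, for illustration only.
example_rankings = [
    ["align", "filter", "plot"],   # correct activity ranked 1st -> 1
    ["plot", "align", "filter"],   # correct activity ranked 2nd -> 1/2
    ["filter", "plot", "align"],   # correct activity absent     -> 0
]
example_relevant = ["align", "align", "merge"]

example_mrr = mean_reciprocal_rank(example_rankings, example_relevant)
print(example_mrr)  # (1 + 1/2 + 0) / 3 = 0.5
```

Under this definition, an MRR that is 70% greater than a baseline means the correct activity tends to appear markedly closer to the top of the recommendation list.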


Keywords


recommendation system, scientific workflows, artificial intelligence, ontology


DOI: http://dx.doi.org/10.22456/2175-2745.75048

Copyright (c) 2018 Adilson Lopes Khouri, Luciano Antonio Digiampietri

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.