SEA4DQ Workshop 2022
Schedule, accepted papers and keynotes
About the workshop
Cyber-physical systems (CPS) have been developed in many industrial sectors and application domains in which the quality of the data acquired and used for decision support is a common factor. Data quality can deteriorate because of sensor faults and failures caused by operating in harsh and uncertain environments.
The aim of the SEA4DQ workshop is to investigate how software engineering and artificial intelligence (AI) can help manage and tame data quality issues in CPS.
Emerging trends in software engineering need to take data quality management seriously, as CPS are increasingly data-centric in their approach to acquiring and processing data along the edge-fog-cloud continuum.
This workshop will provide researchers and practitioners with a forum for exchanging ideas, experiences, understanding of the problems, visions for the future, and promising solutions to data quality problems in CPS.
For more information about the schedule, visit: SEA4DQ 2022 – ESEC/FSE 2022 (esec-fse.org)
SEA4DQ 2022 has accepted the following contributions:
- Data Quality as a Microservice – an ontology and rule based approach for quality assurance of sensor data in manufacturing machines | Full Paper
Jørgen Stang, Dirk Walther, Per Myrseth
- Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study | Full Paper
Muhammad Azmi Umer, Aditya Mathur and Muhammad Taha Jilani
- Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline | WIP Paper
Valentina Golendukhina, Harald Foidl, Michael Felderer and Rudolf Ramler
- Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production | Short Paper
Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Rune Henriksen, Arianeh Aamodt, Dumitru Roman
- Data Quality Issues in Solar Panels Installations: A Case Study | Short Paper
Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, Alexander G. Ulyashin
Keynotes
Prof. Dr. Andreas Metzger
Head of Adaptive Systems and Big Data Applications,
University of Duisburg-Essen, Germany
Title: “Data Quality Issues in Online Reinforcement Learning for Self-adaptive Systems”
A self-adaptive system can modify its structure and behavior at runtime based on its perception of the environment, itself, and its requirements. By adapting itself at runtime, the system can maintain its requirements in the presence of dynamic environment changes. Examples are elastic cloud systems, intelligent IoT systems, and proactive process management systems. One key element of a self-adaptive system is its self-adaptation logic, which encodes when and how the system should adapt itself. When developing the adaptation logic, developers face the challenge of design-time uncertainty. This means they have to anticipate potential environment states and the precise effect of adaptation in a given environment state, while the knowledge available at design time may not be sufficient to do so. A recent industrial survey identified design-time uncertainty as one of the most frequently observed difficulties in designing self-adaptation logic in practice. This talk will explore the opportunities, but also the challenges, that modern machine learning algorithms offer in building the self-adaptation logic in the presence of design-time uncertainty. It will focus on online reinforcement learning as an emerging approach, in which the system learns during operation from interactions with its environment, thereby effectively leveraging data only available at runtime. In particular, the talk will focus on three different issues related to data quality and will introduce initial solutions for these issues: (1) data non-stationarity, (2) data sparsity, and (3) data intransparency. The talk will close with a critical discussion of limitations and an outlook on future research opportunities.
Prof. Foutse Khomh
Head of SoftWare Analytics and Technologies (SWAT) Lab,
University of Montréal, Canada
Title: “Data Quality and Model Under-Specification Issues”
Nowadays, we are witnessing an increasing demand in both industry and academia for exploiting Deep Learning (DL) to solve complex real-world problems. However, the performance of these high-capacity learners is currently bounded by the quality and volume of their underlying training data. The use of incomplete, erroneous, or inappropriate training data and the implementation of poor data management practices in a training pipeline often result in unreliable, biased, or under-specified models. In this talk, I will report on some recent research that we have conducted to identify best practices of data management for DL. I will also report on recent techniques and tools that we have developed to help detect the root cause of model under-specification issues early on during a DL training process.
SEA4DQ Team / Contacts
Learn more: https://sea4dq.github.io/
Phu Nguyen, SINTEF, Norway (Main Contact)
Sagar Sen, SINTEF, Norway (Main Contact)
Maria Chiara Magnanini, Politecnico di Milano, Italy
Mikel Armendia, Tekniker, Spain
Beatriz Cassoli, TU Darmstadt, Germany
Nicolas Jourdan, TU Darmstadt, Germany