QuestionMark: The Probabilistic Benchmark is a Python program for benchmarking probabilistic database management systems.
This project was written by Nikki Zandbergen as part of her M.Sc. Computer Science thesis at the University of Twente. It was supervised by Maurice van Keulen, Tom van Dijk and Jan Flokstra.
To run this benchmark, first generate a dataset with QuestionMark: The Dataset Generator.
Additive manufacturing has recently seen substantial growth, yet consistently producing high-quality parts remains a challenge. Recoating streaking is a common anomaly that impairs print quality. Several data-driven models for automatically detecting this anomaly have been proposed, each with varying effectiveness. However, a comprehensive comparison among them is lacking, and the models are often tailored to specific datasets. This research addresses this gap by implementing and comparing these anomaly detection models for recoating streaking in a reproducible way. We offer a clearer, more objective evaluation of their performance, strengths, and weaknesses. Furthermore, we propose an improvement to the Line Profiles detection model to broaden its applicability, and we introduce a novel preprocessing step to enhance the models' performance. These improvements established the Line Profiles model as the most efficient detection approach on our benchmark dataset.
QuestionMark: The Dataset Generator is a Python program to create a dataset for probabilistic product matching. This dataset is required to run the benchmark test with QuestionMark: The Probabilistic Benchmark.
This project was written by Nikki Zandbergen as part of her M.Sc. Computer Science thesis at the University of Twente. It was supervised by Maurice van Keulen, Tom van Dijk and Jan Flokstra.
The dataset created by this program is an adaptation of the WDC Product Data Corpus for Large-Scale Product Matching. The clustering provided by the original dataset is removed and replaced with a new probabilistic clustering.