Thesis Väinö Saarinen


The amount of data in the world is constantly increasing, making data management more complex and demanding. Effective data utilization is crucial for in-depth analysis, logical reasoning, and decision-making. Data labelling is an essential part of this process, but it has traditionally been labour-intensive and resource-consuming. To use scarce resources more efficiently, companies are turning to data labelling tools to automate the process, enhance data management, and extract more value from their data.

This thesis examines the benefits and risks associated with implementing a data labelling tool, specifically Microsoft Purview. The study employs a benefit measurement model and includes a pilot project conducted in a case company. Additionally, interviews with company professionals were conducted to provide further validation and professional insight into the benefits of data labelling.

The findings reveal several notable benefits of data labelling and data labelling tools. Firstly, labelling tools improve the quality and understanding of the data at hand, enhancing its utility. Secondly, automated labelling tools significantly accelerate the labelling process, reducing resource consumption compared to manual methods. Thirdly, data labelling offers broad advantages in data management, data governance, data loss prevention, data security and compliance management, and data lifecycle management. Risks related to implementing a data labelling tool include labelling accuracy, user adoption and engagement, and effective resource allocation.
