Subject: Information System Science
Title: Selecting an Optimal Stream Processing Tool in an E-commerce Environment
Abstract:
The rapid growth of data volume and velocity in e-commerce has heightened the demand for real-time
analytics and adaptive business strategies. Selecting an optimal stream processing tool is critical, yet
challenging, due to the wide array of available platforms and the complexity of requirements in modern
e-commerce environments. This thesis addresses this challenge by applying a structured decision-making framework,
based on the Analytic Hierarchy Process (AHP), to guide e-commerce organizations in evaluating
and selecting stream processing tools aligned with their operational and strategic needs.
The research employs a multi-method case study within a European e-commerce company, combining
qualitative data from stakeholder interviews, documentation analysis, and observations, with quantitative
pairwise comparisons to establish and weight key selection criteria. Six stream processing platforms
(Apache Flink, Apache Spark Structured Streaming, Apache Kafka Streams, Apache Storm, Apache Samza,
and Google Cloud Dataflow) are systematically evaluated against criteria such as fault tolerance, performance,
state and event handling, integration, operability, and cost within a dynamic pricing case study. The
findings demonstrate how a criteria-driven methodology can support organizations in making informed and
context-aware technology choices.
Keywords: Stream processing, E-commerce, Real-time analytics, Analytic Hierarchy Process
(AHP), Tool selection, Dynamic pricing