You’re working with a performance dataset from a rapidly growing digital platform that serves millions of users across different regions and device types. The dataset captures two core numerical metrics for every user session: processing time and resource consumption. These two variables often move together, but not always, and the moments when they diverge usually indicate deeper issues such as capacity overload, inefficient requests, or poorly optimized devices.
As you explore the dataset, you notice that summary statistics alone can’t give you the clarity you need. The averages look normal, the percentiles look acceptable, yet some users are still reporting unexpected slowdowns. When you dig deeper, it becomes clear that the problematic behaviour only emerges when both numerical variables are analysed together. Patterns don’t show up in isolation; they show up in the relationship between the two.
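
One way to make that relationship concrete is to fit a simple trend between the two metrics and look at how far each session falls from it. The sketch below is only an illustration under stated assumptions: the function name `flag_joint_outliers`, the straight-line trend, the threshold of three standard deviations, and the synthetic numbers are all hypothetical choices, not part of the dataset described above.

```python
from statistics import mean, pstdev


def flag_joint_outliers(times, resources, z_threshold=3.0):
    """Return indices of sessions whose resource use departs from the
    linear trend implied by processing time (hypothetical helper)."""
    mx, my = mean(times), mean(resources)
    # Ordinary least-squares slope and intercept, computed by hand.
    sxx = sum((x - mx) ** 2 for x in times)
    sxy = sum((x - mx) * (y - my) for x, y in zip(times, resources))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual: how far each session sits from the fitted line.
    residuals = [y - (intercept + slope * x) for x, y in zip(times, resources)]
    spread = pstdev(residuals)
    if spread == 0:
        return []
    return [i for i, r in enumerate(residuals) if abs(r) > z_threshold * spread]


# Synthetic sessions (made up for illustration): resource use is roughly
# twice the processing time, with a small deterministic wobble.
times = [100 + i for i in range(50)]
resources = [2 * t + (i % 3 - 1) for i, t in enumerate(times)]

# One session has an ordinary processing time and an ordinary resource
# value when each is viewed alone, but the pair does not fit the trend.
resources[40] = 210

flagged = flag_joint_outliers(times, resources)
print(flagged)
```

The altered session stays inside the normal range of each metric on its own, so a per-variable summary would not single it out; only its distance from the joint trend does. A real dataset would likely need a more robust fit and a threshold chosen from the data rather than assumed.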
