Virtual Desktop Infrastructure (VDI) capacity planning guidance. Using historical hypervisor performance-monitoring data stored in HDFS, we analysed the performance of VDI services deployed on large clusters at a major international financial firm. The engagement covered the full data analysis process (data cleaning, exploratory data analysis, model building, and confirmatory data analysis) and produced probabilistic models of resource utilization and of quality-of-service (QoS) failures. We also showed that, in this specific environment, virtual machines and users can be clustered, in effect categorized, by their resource usage behaviour.
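
To make the idea of a probabilistic utilization model concrete, the sketch below estimates the empirical probability that utilization exceeds a threshold, one simple way a QoS failure probability could be read off historical samples. It is an illustration only, not the models built in the engagement: the function name `exceedance_probability`, the 0.90 threshold, and the sample values are all hypothetical.

```python
from bisect import bisect_right


def exceedance_probability(samples, threshold):
    """Empirical P(utilization > threshold) from historical samples.

    Assumption: a QoS failure is approximated as utilization above a
    fixed threshold; the source does not define failure this way.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # bisect_right returns how many samples are <= threshold.
    at_or_below = bisect_right(ordered, threshold)
    return (len(ordered) - at_or_below) / len(ordered)


# Hypothetical CPU utilization samples (fraction of host capacity),
# made up for illustration; not data from the engagement.
cpu_utilization = [0.42, 0.55, 0.61, 0.73, 0.78, 0.84, 0.91, 0.95]
p_fail = exceedance_probability(cpu_utilization, threshold=0.90)
print(p_fail)
```

An empirical estimate like this makes no distributional assumption; a fitted parametric model would be needed to extrapolate beyond the observed range.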