TY - JOUR
TI - Performance Improvement (Pi) score: an algorithm to score Pi objectively during E-BLUS hands-on training sessions. A European Association of Urology, Section of Uro-Technology (ESUT) project
AU - Veneziano, D.
AU - Canova, A.
AU - Arnolds, M.
AU - Beatty, J.D.
AU - Biyani, C.S.
AU - Dehò, F.
AU - Fiori, C.
AU - Hellawell, G.O.
AU - Langenhuijsen, J.F.
AU - Pini, G.
AU - Rodriguez Faba, O.
AU - Siena, G.
AU - Skolarikos, A.
AU - Tokas, T.
AU - Van Cleynenbreugel, B.S.E.P.
AU - Wagner, C.
AU - Tripepi, G.
AU - Somani, B.
AU - Lima, B.
JO - BJU International
PY - 2019
VL - 123
IS - 4
SP - 726
EP - 732
PB - Wiley-Blackwell Publishing Ltd
DO - 10.1111/bju.14621
KW - adult
KW - algorithm
KW - Article
KW - clinical effectiveness
KW - comparative study
KW - data collection method
KW - Europe
KW - female
KW - human
KW - interrater reliability
KW - male
KW - medical research
KW - patient assessment
KW - performance improvement score
KW - priority journal
KW - surgical training
KW - urogenital tract disease assessment
KW - clinical competence
KW - depth perception
KW - education
KW - hemispheric dominance
KW - laparoscopy
KW - medical education
KW - reproducibility
KW - task performance
KW - urology
KW - videorecording
KW - Algorithms
KW - Clinical Competence
KW - Depth Perception
KW - Educational Measurement
KW - Functional Laterality
KW - Humans
KW - Internship and Residency
KW - Laparoscopy
KW - Reproducibility of Results
KW - Task Performance and Analysis
KW - Urology
KW - Video Recording
AB - Objective: To evaluate the variability of subjective tutor assessment of performance improvement (Pi) and to compare it with a novel measurement algorithm: the Pi score. Materials and Methods: The Pi-score algorithm takes the completion time and number of errors from two repetitions (the first and the fifth) of the same training task and compares them with the task goals to produce an objective score. We collected data during eight courses on the four European Association of Urology training in Basic Laparoscopic Urological Skills (E-BLUS) tasks. The same tutor instructed on all courses. The collected data were independently analysed by 14 hands-on training experts for Pi assessment. Their subjective Pi assessments were compared for inter-rater reliability. The average per-participant subjective scores from all 14 proctors were then compared with the objective Pi-score algorithm results. Cohen's κ statistic was used for the comparison analysis. Results: A total of 50 participants were enrolled. Concordance between the 14 proctors' scores was as follows: Task 1, κ = 0.42 (moderate); Task 2, κ = 0.27 (fair); Task 3, κ = 0.32 (fair); and Task 4, κ = 0.55 (moderate). Concordance between the Pi-score results and the proctors' average score per participant was as follows: Task 1, κ = 0.85 (almost perfect); Task 2, κ = 0.46 (moderate); Task 3, κ = 0.92 (almost perfect); and Task 4, κ = 0.65 (substantial). Conclusion: The present study shows that evaluation of Pi is highly variable, even when performed by a cohort of experts. Our algorithm successfully provided an objective score that matched the average Pi assessment of a cohort of experts, based on a small number of training attempts. © 2018 The Authors. BJU International © 2018 BJU International. Published by John Wiley & Sons Ltd.
ER -