Many commercial software programs can run all but the most esoteric statistical tests with a few simple clicks. There are even YouTube videos that show which buttons to click and when. Unfortunately, there is little emphasis on the reasoning needed to choose among the array of procedures that might be appropriate for a study’s specific research objectives, and almost no mention of the importance of quality data collection and management. This month’s Statistics in small doses draws on actual examples from a recent review of study protocols and papers, and serves as a recap of previous “doses.” Some attempts have been made to protect the innocent:

Hot spot

Why is this a problem?

Example(s)

Typographical errors

Causes delays in analysis because research personnel must go back to source files to correct

Some typos are obvious (“*” for “8”). However, even typos that seem obvious (“55” when values can only range from 0 to 10) must be double-checked against the source. As another example, the statistician or data analyst cannot assume that “93” was meant to be a “9” or a “3,” especially since the right-hand numeric keypad has the keys 9, 6, and 3 in a single column.

Collecting ‘unnecessary’ data

Wastes patient and research personnel time and effort

Postoperative pain at rest might be collected at 36h, 48h, 60h, 72h, 84h, 96h, and 7d when only the 48h, 72h, 96h, and 7d data are needed to address the research question(s). As another example, the Modified Brief Pain Inventory can be used at pre-specified time points in a study to ask patients (q5): “Please rate your pain by marking the number {0-10} that best describes your pain at its WORST in the last 24 hours.” The investigator may also decide to ask for current postoperative pain scores (at rest and with movement) at the same pre-specified time points. Are all of these pain scores really necessary to the study?

Conversely, not collecting enough data

Lacks information needed to substantiate research findings

It is well known that ultrasound-guided regional anesthesia requires considerable operator expertise, so it may be prudent to collect additional data supporting block success (e.g., acceptable sensory-motor scores within a specified time). This helps to assure readers that the outcome data resulted from successful blocks and not from a mix of successful and less successful blocks. The same applies to the “pop” or landmark-feel method of needle placement. Moreover, patients have their vagaries: vomiting can be observed, but nausea is known only by ‘patient report.’ Here, it might be prudent to collect the two variables separately.
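The range check described under typographical errors can be sketched in a few lines of Python. This is only a minimal sketch, assuming pain scores are keyed in as text on a 0 to 10 scale; the function name `flag_suspect_entries` and the sample entries are hypothetical. It flags suspect values for checking against the source files and deliberately never guesses a correction.

```python
# Assumed valid range for a 0-10 pain rating (an illustration, not a standard).
PAIN_MIN, PAIN_MAX = 0, 10


def flag_suspect_entries(raw_values):
    """Return (row_index, raw_value, reason) for entries needing source review.

    Hypothetical helper: it reports problems only and never "fixes" a value.
    """
    flagged = []
    for i, raw in enumerate(raw_values):
        text = str(raw).strip()
        if text == "":
            flagged.append((i, raw, "missing"))
        elif not text.isdecimal():
            # Catches entries such as "*" typed in place of "8".
            flagged.append((i, raw, "not a whole number"))
        elif not PAIN_MIN <= int(text) <= PAIN_MAX:
            # Catches entries such as "55" or "93" on a 0-10 scale.
            flagged.append((i, raw, "out of range"))
    return flagged


# Made-up entries mirroring the examples in the text.
entries = ["3", "*", "55", "10", "93", ""]
for row, value, reason in flag_suspect_entries(entries):
    print(f"row {row}: {value!r} -> {reason}; verify against source file")
```

Note that “93” is reported as out of range rather than converted to “9” or “3”; deciding which was intended still requires the source document.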