Statistical misconduct is often detectable by considering the thought process of the researcher committing it. Those attempting to falsify data (e.g., by inventing participants) may wish not to draw attention to the data they created. Therefore, falsified participants tend to be close to the mean on most variables.
Because fabricated data tend to display less variation than expected, they can be detected by analyzing how far their values lie from the average of each variable, compared with the rest of the dataset. Created participants should have relatively small deviations from the average of each variable compared with other participants.
A created data point can be seen graphically as an "inlier," whose summed deviations from the mean are smaller than expected.
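The idea of "summed deviations from the mean" can be sketched in a few lines of Python. This is only an illustration, not the calculator's actual implementation: the function name `summed_deviations`, the choice of absolute z-scores as the distance measure, and the sample numbers are all assumptions made for the example.

```python
from statistics import mean, stdev

def summed_deviations(rows):
    """For each participant (row), sum the absolute z-scores across all variables.

    Assumption: deviations are standardized per variable and summed; the
    calculator itself may use a different distance measure.
    """
    columns = list(zip(*rows))           # one tuple per variable
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]    # assumes no variable is constant (sd > 0)
    return [
        sum(abs(x - m) / s for x, m, s in zip(row, means, sds))
        for row in rows
    ]

# Hypothetical data: 5 participants (rows), 2 variables (columns)
data = [
    [1.0, 10.0],
    [5.0, 50.0],
    [3.0, 30.0],   # sits exactly on both variable means
    [2.0, 40.0],
    [4.0, 20.0],
]
scores = summed_deviations(data)
```

In this made-up dataset the third participant lies exactly on the mean of both variables, so its summed deviation is zero and it would stand out as the strongest candidate inlier.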
This calculator allows you to examine whether any participant's responses in a dataset are too close to the means of the variables to have originated in a random sample.
- Enter your entire dataset from a study. (Sample data is provided)
- Data must be in comma-separated form.
- Each row represents a unique participant.
- Each column represents a unique variable.
- Choose what criteria you would like to use for evaluating whether or not a point is unusual.
- Choose how you would like the data to be scaled.
- When all of the information is entered, press 'Run' to begin the analysis.
- Results will appear in the table and graph at the bottom.
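The steps above can be approximated offline with a short Python sketch. It assumes the input layout described in the list (comma-separated, one participant per row, one variable per column). The helper names `parse_dataset` and `flag_inliers`, the z-score scaling, and the fixed cut-off of 0.5 are hypothetical choices for illustration and are not taken from the calculator.

```python
import csv
import io
from statistics import mean, stdev

def parse_dataset(text):
    """Parse comma-separated text: one participant per row, one variable per column."""
    reader = csv.reader(io.StringIO(text.strip()))
    return [[float(cell) for cell in row] for row in reader if row]

def flag_inliers(rows, threshold):
    """Return indices of rows whose summed absolute z-score falls below `threshold`.

    Assumption: z-score scaling and a simple cut-off stand in for the
    calculator's scaling and criterion options.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]    # assumes no variable is constant (sd > 0)
    scores = [
        sum(abs(x - m) / s for x, m, s in zip(row, means, sds))
        for row in rows
    ]
    return [i for i, score in enumerate(scores) if score < threshold]

# Hypothetical sample data in the expected comma-separated layout
sample = """
1,10
5,50
3,30
2,40
4,20
"""
rows = parse_dataset(sample)
flagged = flag_inliers(rows, threshold=0.5)
```

Here only the third participant (index 2) falls under the cut-off. A flagged row is a prompt for closer inspection, not evidence of fabrication in itself.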
Important Notes
When using this calculator, remember that any tool can be used to benefit others or to harm them. We should therefore take precautions to avoid negative consequences. While it is important to be able to detect research misconduct, it is just as important to protect innocent researchers from accusations of wrongdoing.
Make sure to also take the following measures during any investigation of potential fraud:
- Replicate analyses across multiple papers before suspecting foul play by a given author
- Compare suspected studies to similar ones
- Extend analyses to raw data
- Contact authors privately, transparently, and give them ample time to consider your concerns
- Offer to discuss matters with a trusted statistically savvy advisor
- Give the authors more time
- If, after all this, suspicions remain, convey them only to entities tasked with investigating such matters, and do so as discreetly as possible.
Threshold for Unusual Values
Log transform the distance values on the scatterplot?