Observation Accuracy Experiment (0.1)

January 31, 2024 (Wed)

Validators: julia_shner, apseregin, chia, R. Richter, hcoste, S. Wolkenberg, Н. Гамова, D. Bochkov, N. Schwab, mmmmbugs, S. Loarie, L. Harper, S. Rall, vitorcdg, enricotosto96, J. Orfão, O. Clarkin, plantperson7654, A. Rockefeller, steve_mcwilliam, (867 additional validators)

Abstract: We generated a sample of observations on January 16, 2024 (Tue) and coordinated a team of validators to assess their accuracy by January 31, 2024 (Wed). All results below describe the sample observations as they were when the sample was generated (for example, whether or not they were Research Grade). The accuracy of the sample observations was assessed on January 31, 2024 (Wed). See the methods tab for more details.

Step 1: Sample Generation and Assignment

From the iNaturalist database of observations (A), we generated a random sample (B). Subsamples were assigned to candidate validators based on the criteria described below. We aimed to assign each sample observation to multiple candidate validators ("Target validator redundancy").

Sample generated: January 16, 2024 (Tue)
Sample size: 1000
Target validator redundancy: 5
Number of candidate validators: 1232
Mean subsample size per candidate validator: 6
Mean number of validators per sample observation: 4
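
As a rough illustration of how a target validator redundancy could drive assignment, the sketch below picks up to 5 qualified candidates for each sample observation. It is a minimal sketch under stated assumptions, not the procedure actually used: the function name `assign_validators`, the input shapes, and the random selection among qualified candidates are all hypothetical, since the text only says that assignment was based on the candidate criteria.

```python
import random


def assign_validators(observations, qualified, redundancy=5, seed=0):
    """Assign up to `redundancy` qualified candidates to each observation.

    observations: dict mapping observation id -> taxon name.
    qualified: dict mapping identifier -> set of taxa they may validate.
    Random choice among qualified candidates is an assumption of this sketch.
    """
    rng = random.Random(seed)
    assignments = {}
    for obs_id, taxon in observations.items():
        # Candidates are identifiers qualified for this observation's taxon.
        candidates = sorted(i for i, taxa in qualified.items() if taxon in taxa)
        # Some observations may have fewer candidates than the target.
        k = min(redundancy, len(candidates))
        assignments[obs_id] = rng.sample(candidates, k)
    return assignments


# Hypothetical toy data: eight candidates for Aves, one for Fungi.
toy_observations = {"obs1": "Aves", "obs2": "Fungi"}
toy_qualified = {f"v{i}": {"Aves"} for i in range(8)}
toy_qualified["v8"] = {"Fungi"}
toy_assignments = assign_validators(toy_observations, toy_qualified)
```

When fewer qualified candidates exist than the target, an observation simply gets fewer validators, which is one way a mean below the target redundancy could arise.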

Validator candidate criteria

If an identifier had made at least 3 improving identifications on a taxon, we considered them qualified to validate that taxon. Improving identifications are the first suggestion of a taxon that the community subsequently agrees with.

For example, Identifier 1 adds an ID of A to an observation. If Identifier 2 later adds a leading ID of B, Identifier 1's ID on A becomes an improving ID. If Identifier 3 later adds a supporting ID to B, Identifier 2's ID on B becomes an improving ID.
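
The qualification rule above can be sketched as a simple count. This is a hedged illustration only: the function name `qualified_taxa` and the input format (one `(identifier, taxon)` pair per improving identification) are assumptions, and the sketch treats taxa as flat labels rather than modeling how identifications on descendant taxa might count toward an ancestor.

```python
from collections import Counter

MIN_IMPROVING_IDS = 3  # threshold stated in the criteria above


def qualified_taxa(improving_ids, min_count=MIN_IMPROVING_IDS):
    """Map each identifier to the set of taxa they qualify to validate.

    improving_ids: iterable of (identifier, taxon) pairs, one pair per
    improving identification (hypothetical input format).
    """
    counts = Counter(improving_ids)
    qualified = {}
    for (identifier, taxon), n in counts.items():
        if n >= min_count:
            qualified.setdefault(identifier, set()).add(taxon)
    return qualified


# Hypothetical identifiers and taxa, for illustration only.
example_ids = (
    [("ana", "Aves")] * 3 + [("ana", "Fungi")] * 2 + [("ben", "Fungi")] * 4
)
example_qualified = qualified_taxa(example_ids)
```

Here "ana" qualifies for Aves but not Fungi (only 2 improving identifications), while "ben" qualifies for Fungi.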

Step 2: Validating samples

We contacted candidate validators and asked them to add identifications to their subsamples. The graph to the right groups sample observations by the number of validators.

Mean number of validators per sample observation: 4
Number of participating validators: 887 (72%)
Percent of sample validated: 96%
[Figure: sample observations grouped by number of validators (0, 1, 2, 3-4, >4); vertical axis: frequency, 0 to 500]

Step 3: Scoring sample observations from validations

We scored observations in the sample as Correct, Uncertain, or Incorrect by comparing the validator identifications to each sample observation’s taxon. Identifications matching or finer than the sample observation’s taxon were scored as Correct. Non-disagreeing coarser identifications were scored as Uncertain, as were scenarios where multiple validators disagreed. Disagreeing coarser identifications or identifications on different branches were scored as Incorrect.

Correct
Uncertain
Incorrect
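
The scoring rules in this step can be sketched as follows. This is a minimal sketch under stated assumptions, not the scoring code actually used: taxa are represented as ancestry paths (tuples from root to taxon), each validator identification carries a hypothetical `disagrees` flag, and "multiple validators disagreed" is interpreted here as validators producing different individual scores.

```python
def score_identification(obs_path, id_path, disagrees=False):
    """Score one validator identification against the observation taxon.

    Paths are tuples of taxon names from the root down (assumed format).
    """
    obs_path, id_path = tuple(obs_path), tuple(id_path)
    # Matching or finer: the observation taxon is a prefix of the ID path.
    if id_path[:len(obs_path)] == obs_path:
        return "Correct"
    # Coarser: the ID taxon is a strict ancestor of the observation taxon.
    if obs_path[:len(id_path)] == id_path:
        return "Incorrect" if disagrees else "Uncertain"
    # Otherwise the identification sits on a different branch.
    return "Incorrect"


def score_observation(obs_path, identifications):
    """Combine validator scores; mixed scores are treated as Uncertain.

    identifications: list of (id_path, disagrees) pairs.
    Returns None when no validator identified the observation.
    """
    scores = {score_identification(obs_path, p, d) for p, d in identifications}
    if not scores:
        return None
    if len(scores) == 1:
        return scores.pop()
    return "Uncertain"


# Hypothetical ancestry paths, for illustration only.
crow = ("Animalia", "Aves", "Corvus")
raven = crow + ("Corvus corax",)
birds = ("Animalia", "Aves")
magpie = ("Animalia", "Aves", "Pica")
```

Using prefix comparison on ancestry paths keeps the three cases (same or finer, coarser, different branch) explicit without needing a full taxonomy lookup.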

Step 4: Accuracy and Precision calculation

Accuracy

We calculated Accuracy as the percent Correct for the entire sample as well as for subsets such as the Research Grade subset. We also calculated the percent Uncertain and the percent Incorrect. In the example figure to the right, the percent Correct would be 83%, the percent Uncertain 6%, and the percent Incorrect 11%.
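
The percentage calculation can be sketched as below. The function name `accuracy_breakdown` is hypothetical, and the sketch assumes that unvalidated observations (represented as `None`) are excluded from the denominator, which the text does not state. The toy list of 100 scores is made up so that it reproduces the 83 / 6 / 11 split of the example; it is not the figure's underlying data.

```python
from collections import Counter

LABELS = ("Correct", "Uncertain", "Incorrect")


def accuracy_breakdown(scores):
    """Return the percent of scored observations under each label.

    scores: iterable of labels; None entries (unvalidated observations)
    are skipped, which is an assumption of this sketch.
    """
    counts = Counter(s for s in scores if s is not None)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        raise ValueError("no scored observations")
    return {label: 100 * counts[label] / total for label in LABELS}


# Made-up scores matching the example percentages above.
toy_scores = ["Correct"] * 83 + ["Uncertain"] * 6 + ["Incorrect"] * 11
toy_breakdown = accuracy_breakdown(toy_scores)
```

The same function could be applied to a subset, such as only the Research Grade observations, by filtering the scores before calling it.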

Precision

If every sample observation were identified only to a coarse taxon such as kingdom, the accuracy of the iNaturalist dataset might be very high, but the precision might be too low for the dataset to be useful. To estimate precision, we counted the number of leaf descendants in the iNaturalist taxonomy for the taxon associated with each sample observation. We calculated precision as 1 divided by the number of leaf descendants. If the taxon is a leaf taxon, the precision is 100%. We ignored ranks below species in the precision calculation.

In the example figure to the right, the average precision of three sample observations would be 51%.
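
The precision calculation can be sketched on a toy taxonomy. Everything named here is hypothetical: the tree, the taxon names, and the helper functions are illustrative and do not reproduce the example figure or the iNaturalist taxonomy. The sketch assumes species are the leaves of the tree, in line with ranks below species being ignored.

```python
def leaf_count(taxon, children):
    """Count leaf taxa at or below `taxon`; a leaf taxon counts as 1."""
    kids = children.get(taxon, [])
    if not kids:
        return 1
    return sum(leaf_count(kid, children) for kid in kids)


def precision(taxon, children):
    """Precision as 1 divided by the number of leaf descendants."""
    return 1 / leaf_count(taxon, children)


def mean_precision(taxa, children):
    """Average precision over the taxa of a set of sample observations."""
    return sum(precision(t, children) for t in taxa) / len(taxa)


# Hypothetical taxonomy: one family, two genera, two species per genus.
toy_children = {
    "Family F": ["Genus A", "Genus B"],
    "Genus A": ["Species A1", "Species A2"],
    "Genus B": ["Species B1", "Species B2"],
}
toy_taxa = ["Species A1", "Genus A", "Family F"]
toy_mean = mean_precision(toy_taxa, toy_children)
```

In this toy tree the species-level observation has precision 1 (100%), the genus-level one 1/2, and the family-level one 1/4, so the mean precision is about 58%.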