Task Details

Submitted solutions will be unpacked and reproduced using ReproUnzip on an evaluation server (Azure Standard F32s_v2) with the following characteristics:

Processor: 32 CPU x 2.7 GHz
Main Memory: 64 GB
Storage: 32 GB
Operating System: Ubuntu 20.04.5 LTS

In particular, before running ReproUnzip, the dummy-data.bin dataset (i.e., the original input you worked on) will be replaced with the secret evaluation dataset evaluate-data.bin, which contains 1M vectors. (After March 10, we will use a larger evaluation dataset of 10M vectors.)

Here is the detailed sequence of operations used for the evaluation process:

  • reprounzip docker setup <bundle> <solution>, to unpack the uploaded bundle
  • reprounzip docker upload <solution> evaluate-data.bin:dummy-data.bin to replace the input dummy dataset (dummy-data.bin) with the evaluation data (evaluate-data.bin)
  • reprounzip docker run <solution>
  • reprounzip docker download <solution> output.bin (i.e., you must produce a file named "output.bin" that stores your output knng)
  • evaluation of "output.bin"
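To rehearse the sequence above locally, the four reprounzip invocations can be assembled in a small script. This is only a sketch: the bundle name solution.rpz and the directory name solution_dir are hypothetical placeholders, and a local stand-in file named evaluate-data.bin is assumed to exist in the working directory. The commands themselves are exactly those listed above.

```python
import subprocess


def evaluation_commands(bundle, solution):
    """Return the four reprounzip invocations from the list above as argv lists."""
    return [
        ["reprounzip", "docker", "setup", bundle, solution],
        ["reprounzip", "docker", "upload", solution,
         "evaluate-data.bin:dummy-data.bin"],
        ["reprounzip", "docker", "run", solution],
        ["reprounzip", "docker", "download", solution, "output.bin"],
    ]


def rehearse(bundle, solution, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in evaluation_commands(bundle, solution):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the rehearsal at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names; replace with your own bundle and target directory.
    rehearse("solution.rpz", "solution_dir", dry_run=True)
```

With dry_run=True the script only prints the commands, which is a cheap way to confirm the order before running them for real.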

Important Notes: Your solution will be evaluated on evaluate-data.bin, but only if your submission meets the following requirements:

  • The program must be reproduced correctly (i.e., the process must end with the creation of the "output.bin" file without errors).
  • The program must finish within 30 minutes; otherwise it incurs a "timeout" error (i.e., the total time limit for generating the knng is 30 minutes).
  • The output.bin file must contain N x 100 neighbors (100 neighbors for each of the N input vectors), and every neighbor must be stored in uint32_t format.
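The size requirement can be checked before submitting. The sketch below assumes a layout that the text above does not spell out: a raw, header-less file with the 100 neighbor ids of each vector stored consecutively (row-major), little-endian byte order, and ids that are 0-based indices smaller than N. Under these assumptions a 1M-vector run yields 1,000,000 x 100 x 4 = 400,000,000 bytes, and a 10M-vector run yields 4,000,000,000 bytes. The function names are illustrative only.

```python
import os
import struct

K = 100        # neighbors per vector, from the requirement above
ID_BYTES = 4   # size of one uint32_t


def expected_size(n, k=K):
    """Size in bytes of a header-less file holding n * k uint32 ids."""
    return n * k * ID_BYTES


def write_knng(path, rows, k=K):
    """Write each row of k neighbor ids as little-endian uint32 (assumed layout)."""
    with open(path, "wb") as f:
        for row in rows:
            if len(row) != k:
                raise ValueError("each vector needs exactly %d neighbors" % k)
            f.write(struct.pack("<%dI" % k, *row))


def check_knng(path, n, k=K):
    """Return True if the file has the expected size and every id is below n."""
    if os.path.getsize(path) != expected_size(n, k):
        return False
    with open(path, "rb") as f:
        for _ in range(n):
            row = struct.unpack("<%dI" % k, f.read(k * ID_BYTES))
            if any(v >= n for v in row):
                return False
    return True
```

Calling check_knng("output.bin", n) with the number of vectors in your local input is a quick sanity check; it does not say anything about recall.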

Evaluation Metrics: We will compute the average recall over at least 10,000 sampled ground-truth vectors. The recall of one vector is computed as follows: $$\text{Recall} = \frac{\text{number of true top-100 nearest neighbors among the 100 reported neighbors}}{100}$$
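As an illustration of the metric, the recall of one vector and the average over a sample can be computed as in the sketch below. It is not the organizers' evaluation code. It assumes that neighbors are compared as sets of ids, so a duplicated id in the reported list counts only once, and the function names are hypothetical.

```python
def recall(reported, truth, k=100):
    """Fraction of the true top-k neighbors that appear among the reported k."""
    return len(set(reported[:k]) & set(truth[:k])) / k


def average_recall(reported_rows, truth_rows, k=100):
    """Mean per-vector recall over the sampled vectors."""
    scores = [recall(r, t, k) for r, t in zip(reported_rows, truth_rows)]
    return sum(scores) / len(scores)
```

For example, a vector whose reported list contains 93 of its true 100 nearest neighbors has a recall of 0.93.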

Unfortunately, ReproUnzip sometimes prints useful information about errors to stdout, which is not included in the submission log. If you are stuck on a technical error that prevents your submission from being reproduced, the log contains no useful information, and even running the provided ReproUnzip commands yourself does not reveal the cause, you can send us an email so that we can check the content of stdout for any useful information about the error.