Call for Artifacts

Deadline: Nov 25, 2024 AOE. The submission website is open.

Authors of accepted PPoPP 2025 papers/posters are invited to formally submit their supporting materials to the Artifact Evaluation (AE) process. The Artifact Evaluation Committee attempts to reproduce experiments (in broad strokes) and assess if submitted artifacts support the claims made in the paper/poster. The submission is voluntary and does not influence the final decision regarding paper/poster acceptance.

We invite every author of an accepted PPoPP paper/poster to consider submitting an artifact. It is good for the community as a whole. At PPoPP, we follow ACM’s artifact reviewing and badging policy. ACM describes a research artifact as follows:

“By “artifact” we mean a digital object that was either created by the authors to be used as part of the study or generated by the experiment itself. For example, artifacts can be software systems, scripts used to run experiments, input datasets, raw data collected in the experiment, or scripts used to analyze results.”

The submission of an artifact is not the same as making it public. AEC members will be instructed that they may not publicize any part of your artifact during or after completing evaluation, nor retain any part of it after evaluation. Thus, you are free to include models, data files, proprietary binaries, and similar items in your artifact.

Submission Site

The submission site is located at https://ppopp25ae.hotcrp.com/.

Evaluation Process

At PPoPP, the artifact evaluation committee awards two badges to each successfully evaluated paper: one of the two red badges, either ‘Artifacts Evaluated – Functional’ (lighter red) or ‘Artifacts Evaluated – Reusable’ (darker red), and the ‘Results Reproduced’ (darker blue) badge. In this artifact evaluation process, we do not award the lighter blue ‘Results Replicated’ badge. The green ‘Artifacts Available’ badge does not require a formal audit and is awarded directly by the publisher if the authors provide a link to the deposited artifact. Please refer to the badge descriptions below for detailed information.

Artifact evaluation is single-blind. Please take precautions (e.g., turning off analytics and logging) to avoid accidentally learning the identities of reviewers. Each submitted artifact is evaluated by at least two members of the artifact evaluation committee.

During the process, authors and evaluators may communicate with each other anonymously to overcome technical difficulties. Ideally, all submitted artifacts will pass the artifact evaluation.

Note that variation in empirical and numerical results is tolerated. In fact, it is often unavoidable in computer systems research; see “how to report and compare empirical results?” in the AE FAQ on ctuning.org.

The evaluators are asked to evaluate the artifact based on the following criteria, which are defined by ACM. ACM recommends awarding three different types of badges to communicate how the artifact has been evaluated. A single paper can receive up to three badges, one of each type. A brief description of each badge follows; please refer to the ACM website for more information.

  • The green ‘Artifacts Available’ badge indicates that an artifact is publicly accessible in an archival repository. For this badge to be awarded, the paper does not have to be independently evaluated. ACM requires a qualified archival repository, for example Zenodo, figshare, or Dryad. Personal webpages, GitHub repositories, and the like are not sufficient because their contents can be changed after the submission deadline.
  • The red ‘Artifacts Evaluated’ badges indicate that a research artifact has successfully completed an independent audit. A reviewer has verified that the artifact is documented, complete, consistent, exercisable, and includes appropriate evidence of verification and validation. Two levels are distinguished:
      • The lighter red ‘Artifacts Evaluated – Functional’ badge indicates a basic level of functionality.
      • The darker red ‘Artifacts Evaluated – Reusable’ badge indicates a higher-quality artifact that significantly exceeds minimal functionality so that reuse and repurposing are facilitated.
    Artifacts need not be made publicly available to be considered for one of these badges. However, they do need to be made available to reviewers.
  • The blue ‘Results Validated’ badges indicate that the main results of the paper have been successfully obtained by an independent reviewer. Two levels are distinguished:
      • The darker blue ‘Results Reproduced’ badge indicates that the main results of the paper have been successfully obtained using the provided artifact.
      • The lighter blue ‘Results Replicated’ badge indicates that the main results of the paper have been independently obtained without using the author-provided research artifact.
    Artifacts need not be made publicly available to be considered for one of these badges. However, they do need to be made available to reviewers.

In summary, at PPoPP the artifact evaluation committee awards each successfully evaluated paper one of the two red Artifacts Evaluated badges as well as the darker blue Results Reproduced badge. We do not award the lighter blue Results Replicated badge in this artifact evaluation process. The green Artifacts Available badge does not require a formal audit and is therefore awarded directly by the publisher, provided the authors supply a link to the deposited artifact.

Packaging and Instructions

Your submission should consist of three pieces:

  1. The submission version of your paper/poster.
  2. A README file (PDF or plaintext format) that explains your artifact (details below).
  3. The artifact itself, packaged as a single archive file. Archives smaller than 600MB can be uploaded directly to the HotCRP submission site; for archives larger than 600MB, please provide a URL pointing to the artifact. The URL must protect the anonymity of the reviewers. Please use a widely available compressed archive format such as ZIP (.zip), tar and gzip (.tgz), or tar and bzip2 (.tbz2), and ensure the file name has the suffix indicating its format. Those seeking the “Available” badge must additionally follow the instructions recommended by ACM on uploading the archive to a publicly available, immutable location.
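
As an illustration of the packaging step, the following is a minimal Python sketch that packs an artifact directory into a .tgz archive and checks it against the direct-upload limit. The function names, the directory layout, and the decimal interpretation of "600MB" are assumptions made for this example, not requirements of the AE process.

```python
import tarfile
from pathlib import Path

# Assumption: "600MB" is read as 600 * 10^6 bytes; the call does not say
# whether decimal or binary megabytes are meant.
UPLOAD_LIMIT_BYTES = 600 * 1000 * 1000


def package_artifact(src_dir, archive_path):
    """Pack src_dir into a gzip-compressed tar archive; return its size in bytes.

    The archive must be written outside src_dir, otherwise it would be
    added to itself while it is being created.
    """
    src = Path(src_dir)
    archive = Path(archive_path)
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps a single top-level directory inside the archive.
        tar.add(src, arcname=src.name)
    return archive.stat().st_size


def fits_direct_upload(size_bytes, limit=UPLOAD_LIMIT_BYTES):
    """True if an archive of this size is below the direct-upload limit."""
    return size_bytes < limit
```

For example, `package_artifact("my_artifact", "my_artifact.tgz")` followed by `fits_direct_upload(...)` on the returned size indicates whether to upload the archive directly or to provide a URL instead.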

The README file should consist of two parts:

  1. a Getting Started Guide, and
  2. Step-by-Step Instructions for how you propose to evaluate your artifact (with appropriate connections to the relevant sections of your paper).

The Getting Started Guide should contain setup instructions (including, for example, a pointer to the VM player software, its version, passwords if needed, etc.) and basic testing of your artifact that you expect a reviewer to be able to complete in 30 minutes. Reviewers will follow all the steps in the guide during an initial kick-the-tires phase. The Getting Started Guide should be as simple as possible, and yet it should stress the key elements of your artifact. Anyone who has followed the Getting Started Guide should have no technical difficulties with the rest of your artifact. In this step, you may want to include a single high-level “runme.sh” script that automatically compiles your artifact, runs it (printing some interesting events to the console), collects data (e.g., performance data), and produces files such as graphs or charts similar to the ones used in your paper.
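
A single top-level driver of the kind described above could look roughly like the following Python sketch. It runs a list of steps in order, prints each step to the console, stops at the first failure, and records wall-clock timings in a CSV file. The step names, the commands, and the CSV layout are hypothetical; plotting is omitted and would be one more step in the list.

```python
import csv
import subprocess
import sys
import time


def run_step(name, cmd):
    """Run one command, echo it to the console, return (name, seconds, exit code)."""
    print(f"[runme] {name}: {' '.join(cmd)}", flush=True)
    start = time.perf_counter()
    result = subprocess.run(cmd)
    elapsed = time.perf_counter() - start
    return name, elapsed, result.returncode


def run_all(steps, csv_path):
    """Run steps in order, stop at the first failure, and write timings to csv_path."""
    rows = []
    for name, cmd in steps:
        row = run_step(name, cmd)
        rows.append(row)
        if row[2] != 0:
            print(f"[runme] step '{name}' failed with exit code {row[2]}",
                  file=sys.stderr)
            break
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["step", "seconds", "exit_code"])
        writer.writerows(rows)
    return rows
```

A call such as `run_all([("build", ["make"]), ("run", ["./bench", "--small"])], "timings.csv")` would cover the build and a short test run; both commands here are placeholders for whatever your artifact actually uses.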

The Step-by-Step Instructions explain how to reproduce any experiments or other activities that support the conclusions in your paper. Write this for readers who have a deep interest in your work and are studying it to improve it or compare against it. If your artifact runs for more than a few minutes, point this out and explain how to run it on smaller inputs.

Where appropriate, include descriptions of and links to files (included in the archive) that represent expected outputs (e.g., the speedup comparison chart expected to be generated by your tool on the given inputs). If some warnings are safe to ignore, explain which ones.
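
If the expected outputs are numeric, a small comparison helper can save reviewers from checking values by eye. The sketch below assumes a hypothetical two-column CSV format (benchmark, value) and an illustrative relative tolerance of 10%; neither is prescribed by the AE process, and the tolerance should reflect the variation you actually observe.

```python
import csv
import math


def load_results(path):
    """Read a CSV with header 'benchmark,value' into a {name: float} dict."""
    with open(path, newline="") as f:
        return {row["benchmark"]: float(row["value"]) for row in csv.DictReader(f)}


def compare(expected, measured, rel_tol=0.10):
    """Return (name, expected, measured) for entries that are missing or out of tolerance.

    measured is None in the returned tuple when the benchmark has no measurement.
    """
    mismatches = []
    for name, exp in expected.items():
        got = measured.get(name)
        if got is None or not math.isclose(got, exp, rel_tol=rel_tol):
            mismatches.append((name, exp, got))
    return mismatches
```

An empty list from `compare(load_results("expected.csv"), load_results("measured.csv"))` means every expected benchmark was measured and fell within the chosen tolerance; the file names are placeholders.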

The artifact’s documentation should include the following:

  • A list of claims from the paper supported by the artifact, and how/why.
  • A list of claims from the paper not supported by the artifact, and how/why. Examples: performance claims cannot be reproduced in a VM, or the authors are not allowed to redistribute specific benchmarks. Artifact reviewers can then center their evaluation around the claims that the artifact does support.

If you are seeking the “Reusable” badge, your documentation should state which aspects of the artifact you suggest the reviewer exercise in a different setting. For example, you may want to point out which script to modify so that the reviewer can run your tool on a benchmark not used in the paper, or where to edit a script to change the number of CPU cores used for evaluation.
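
One way to make such settings easy to change is to expose them as command-line options instead of asking reviewers to edit scripts. The following Python sketch is only an illustration: the option names are hypothetical, and setting OMP_NUM_THREADS assumes the artifact is parallelized with OpenMP, which may not apply to your work.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the (hypothetical) options of an experiment driver."""
    parser = argparse.ArgumentParser(description="Hypothetical experiment driver")
    parser.add_argument("--cores", type=int, default=os.cpu_count() or 1,
                        help="number of CPU cores to use (default: all available)")
    parser.add_argument("--benchmark", default="example",
                        help="name of the benchmark to run")
    args = parser.parse_args(argv)
    if args.cores < 1:
        parser.error("--cores must be at least 1")
    return args


def build_env(cores):
    """Return a copy of the environment with OMP_NUM_THREADS set to cores.

    Assumption: the artifact uses OpenMP, which reads this variable.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(cores)
    return env
```

The dictionary returned by `build_env` can be passed as the `env` argument of `subprocess.run` when launching the benchmark binary.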

After preparing your artifact, download and test it on at least one fresh machine where you did not prepare the artifact; this will help you fix missing dependencies, if any.
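
A quick pre-flight check can surface missing dependencies before any experiment starts. The sketch below looks up executables on the PATH; the tool list is a hypothetical example and should be replaced with what your artifact actually needs.

```python
import shutil

# Hypothetical list of command-line tools the artifact depends on.
REQUIRED_TOOLS = ["python3", "make", "gcc"]


def missing_tools(tools):
    """Return the tools from the given list that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` at the start of the Getting Started Guide script and printing the result gives reviewers a clear message instead of a failure halfway through a run.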

We strongly encourage you to use a container (e.g., Docker, https://www.docker.com/), which provides an easily reproducible environment. It also gives the AEC confidence that errors or other problems cannot harm their machines.

Submission Guidelines

1. Think carefully about which badges you want

In your hotCRP submission, be upfront about which badge(s) you are seeking.

  1. If making your code public is all you want to do, seek only the ‘Available’ (green) badge. The reviewers will not exercise the artifact for its functionality or validate the claims.
  2. If you do not plan to make the artifact publicly available, do not seek the ‘Available’ (green) badge. However, you may still pursue one or both of the other badges.
  3. If you only plan to reproduce the claims without making your artifact documented, consistent, complete, and exercisable, seek the “Results Reproduced” (darker blue) badge rather than the “Functional/Reusable” (red) badges.

2. Minimize the artifact setup overhead

A well-packaged artifact is easily usable by the reviewers, saving them time and frustration, and more clearly conveying the value of your work during evaluation. A great way to package an artifact is as a Docker image or in a virtual machine that runs “out of the box” with very little system-specific configuration. A virtual machine provides an easily reproducible environment that is less susceptible to bit rot. It also gives the AEC confidence that errors or other problems cannot harm their machines.
Giving AE reviewers remote access to your machines with preinstalled (proprietary) software is also possible.

3. Think carefully about whether your artifact will work on a reviewer’s machine

The reviewers will only have access to the hardware and software that their university or research team provides for their own research needs. More tips for preparing a submission are available on the ctuning website.
If you have an unusual experimental setup that requires specific hardware (e.g., custom hardware or oscilloscopes for measurements) or proprietary software, please contact the artifact evaluation chairs before submitting.

Continuous Discussion with Reviewers

Throughout the review period, reviews will be submitted to HotCRP and will be (approximately) continuously visible to authors. AEC reviewers will be able to continuously interact (anonymously) with authors for clarifications, system-specific patches, and other logistics to help ensure that the artifact can be evaluated. The goal of continuous interaction is to prevent rejecting artifacts for “wrong library version” types of problems.

Artifact Evaluation Committee

Other than the AE chairs, the AEC members are senior graduate students, postdocs, or recent PhD graduates, identified with the help of the PPoPP25 PC and recent artifact evaluation committees. Please check SIGPLAN’s Empirical Evaluation Guidelines for some methodologies to consider during evaluation.

Contact

For questions, please contact the AE co-chairs, Keren Zhou (kzhou6@gmu.edu) or Jiajia Li (jiajia.li@ncsu.edu).