Introduction

This document includes scripts and text analysis to support the reproducibility review at the AGILE conference 2025, which is organised by the team of the Chair for Geoinformatics, TU Dresden, Germany.

Find out more online about reproducible publications at AGILE and the review process, and visit the Reproducible AGILE website: https://reproducible-agile.github.io/. The code of this document is published on GitHub in the repository reproducible-agile/reviews-2025, where you can inspect the R code in the file agile-reproducibility-reviews.Rmd and find instructions for reproducing the workflow. The report parameter private_info can be set to yes to show information that cannot be shared publicly, such as author names, titles, or excerpts of submissions that were not accepted, and to upload review files to private shares, which requires authentication. The end of this document contains a hidden code snippet that renders only the information accepted for publication as a GitHub page.

Submitted papers

Submission metadata

Retrieve all information about submissions from the EasyChair submissions system. The full submission information is not included in the public rendering of this report. Make sure that the shown columns in the submission table include the columns required in the code below.

type n
Full paper 23

Full texts

The paper PDFs are downloaded from EasyChair directly using the links provided in the submission overview table.

The text is extracted from the PDFs and processed into a tidy data structure without stop words. The stop words include specific words that might appear in the page header, abbreviations, and terms particular to scientific articles, such as “figure”.
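The tidy structure described above can be illustrated with a minimal sketch. The report itself does this in R; the Python below is an independent illustration, and the stop-word lists, the id fp_001, and the function name tidy_tokens are hypothetical.

```python
import re

# Hypothetical stop-word lists; the report's actual lists live in its R code.
GENERAL_STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to", "for"}
CUSTOM_STOP_WORDS = {"figure", "fig", "et", "al", "agile"}

def tidy_tokens(doc_id, text):
    """Return one (doc_id, word) row per non-stop word, in reading order."""
    # Lowercase the text and keep alphabetic runs only (digits are dropped).
    words = re.findall(r"[a-z]+", text.lower())
    stop_words = GENERAL_STOP_WORDS | CUSTOM_STOP_WORDS
    return [(doc_id, word) for word in words if word not in stop_words]

# "fp_001" is a made-up id for illustration.
rows = tidy_tokens("fp_001", "Figure 2 shows the accessibility of urban transport networks.")
print(rows)
```

Each row holds exactly one word per document id, which is the "one token per row" shape that later counting steps rely on.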

If there are issues with loading a PDF, you may try to convert the PDF to PostScript (.ps) and back, e.g., using a combination of “Print to file > PostScript” in a PDF viewer and then ps2pdf on the command line.

23 papers successfully read. About 49 % of the words are considered stop words; about 51 % remain as non-stop words.

The following table shows how many words and non-stop words each document has, sorted by number of non-stop words. The id is built from the file name plus a prefix: for full papers, it is the left-padded submission number and the prefix fp_.

type              words  non-stop words  % non-stop
Full paper         3589            1734        48.3
Full paper         7289            3437        47.2
Full paper         9312            4079        43.8
Full paper         6904            3826        55.4
Full paper         6989            3058        43.8
Full paper         7051            3734        53.0
Full paper         4716            2510        53.2
Full paper         9439            5449        57.7
Full paper         8463            4148        49.0
Full paper         7612            4220        55.4
Full paper         8629            4087        47.4
Full paper         4636            2473        53.3
Full paper         7489            3724        49.7
Full paper         7687            4062        52.8
Full paper         9111            4527        49.7
Full paper         6746            3194        47.3
Full paper         7432            4099        55.2
Full paper         6236            2931        47.0
Full paper        11050            5790        52.4
Full paper         9466            4666        49.3
Full paper         5844            3101        53.1
Full paper         6702            3621        54.0
Full paper         6162            3347        54.3
Full papers: 23  168554           85817        50.9
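The percentage column gives the share of non-stop words among all words. As a quick check of the arithmetic, the following Python sketch recomputes the totals row and the first row from the numbers in the table; the variable names are illustrative and not taken from the report's R code.

```python
# Totals taken from the last row of the table above.
total_words = 168554
non_stop_words = 85817

# Share of non-stop words, and the complementary share of stop words.
non_stop_share = round(100 * non_stop_words / total_words, 1)
stop_share = round(100 - non_stop_share, 1)
print(non_stop_share, stop_share)  # 50.9 49.1

# The same formula applied to the first row (3589 words, 1734 non-stop words).
first_row_share = round(100 * 1734 / 3589, 1)
print(first_row_share)  # 48.3
```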

Which papers include a “Data and Software Availability” section?

According to the AGILE Reproducible Paper Guidelines, all authors must add a Data and Software Availability section to their paper. The detection relies on the loaded full texts including stop words, because the section title itself contains a stop word.

22 papers have the section in question, that is 96 % of all submissions. Here are the statistics per submission type:

type        submissions  with DASA     %
Full paper           23         22  95.7
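A detection of this kind can be sketched with a regular expression. The report's actual pattern is in its R code; the Python below is only an illustration, and the pattern, the function name has_dasa, and the sample ids and texts are assumptions.

```python
import re

# Assumed pattern: tolerates varying whitespace and capitalisation in the title.
DASA_PATTERN = re.compile(r"data\s+and\s+software\s+availability", re.IGNORECASE)

def has_dasa(full_text):
    """True if the section title occurs anywhere in the unfiltered text."""
    return DASA_PATTERN.search(full_text) is not None

# Hypothetical ids and text snippets, for illustration only.
papers = {
    "fp_001": "5 Conclusion ... Data and Software Availability The code is on GitHub.",
    "fp_002": "5 Conclusion ... Data availability: data can be requested.",
    "fp_003": "6 DATA AND  SOFTWARE\nAVAILABILITY All scripts are archived.",
}

with_dasa = [paper_id for paper_id, text in papers.items() if has_dasa(text)]
share = round(100 * len(with_dasa) / len(papers), 1)
print(with_dasa, share)
```

The search has to run on the text before stop-word removal: once "and" is filtered out, the exact section title can no longer match. With the numbers from the table, the same share formula gives 100 * 22 / 23, which rounds to 95.7.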

Some papers might have slightly different terms because of human error or misinterpretation of the template or the guidelines.

0 papers lack a DASA section but contain possibly related sections. The table below would show excerpts of the similar patterns that were matched anywhere in the text, not just in headlines.

id type excerpt

Wordstem analysis

For the following table and figure, the word stems were extracted with a stemming algorithm from the package quanteda. The word cloud is based on 156 unique word stems that each occur at least 100 times; together they occur 28663 times, which is 33 % of all non-stop words.

place wordstem n # papers % papers
1 data 884 23 100
2 model 679 23 100
3 urban 586 20 87
4 spatial 571 23 100
5 map 564 20 87
6 locat 487 22 96
7 transport 469 11 48
8 citi 456 20 87
9 studi 418 22 96
10 analysi 400 23 100
11 inform 372 22 96
12 time 361 22 96
13 result 357 23 100
14 public 344 18 78
15 access 334 20 87
16 geograph 331 23 100
17 network 326 19 83
18 road 322 13 57
19 featur 321 16 70
20 set 296 19 83
Wordstem cloud of AGILE 2025 full paper submissions
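The columns of the word stem table (total count, number of papers, share of papers) can be derived as in the following sketch. It assumes the stems are already available per paper; the stemming itself is done with quanteda in the report and is not reproduced here. The function name stem_table and the toy data are hypothetical.

```python
from collections import Counter

def stem_table(stems_by_paper, top=20):
    """Return rows of (place, stem, n, papers, percent of papers)."""
    totals = Counter()       # how often each stem occurs overall
    papers_with = Counter()  # in how many papers each stem occurs
    for stems in stems_by_paper.values():
        totals.update(stems)
        papers_with.update(set(stems))  # count each paper at most once per stem
    n_papers = len(stems_by_paper)
    return [
        (place, stem, n, papers_with[stem], round(100 * papers_with[stem] / n_papers))
        for place, (stem, n) in enumerate(totals.most_common(top), start=1)
    ]

# Toy input with made-up ids and stems.
toy = {
    "fp_001": ["data", "model", "data"],
    "fp_002": ["data", "urban"],
}
table = stem_table(toy)
print(table[0])  # (1, 'data', 3, 2, 100)
```

For comparison with the table above, a stem found in 11 of 23 papers yields round(100 * 11 / 23), i.e. 48 %.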

Reproducibility reviews

About

The assignment of reviews is done via a privately shared spreadsheet, to handle potential non-public comments. The main outcome of the reviews is a report, which is published in individual OSF projects as components of the OSF project for the reproducibility reviews 2025. The report should be based on a template from this repository in report-template.

Reproducibility reviewer instructions

  1. Familiarise yourself with the AGILE Reproducibility Review Process; refresh your memory of the Reproducibility Reviewer Guidelines in the Reproducible Paper Guidelines; the following steps are just a tl;dr version
  2. Take a look at the review report templates in RMarkdown and Office (.docx & .odt) - even if you’re not using them, they give you guidance on structure and content; you are welcome to write your report with any tool that can produce a PDF
  3. Go to the shared spreadsheet reproducibility reviewers and find your assignments
  4. Access your reproducibility package from the shared cloud folder, which includes the following files named using the submission identifier, here using 000 as example:
    • 000.pdf - the submission manuscript (may be updated when authors upload new reports, ideally though you receive revisions directly from the authors)
    • 000_authors.html - an HTML file with author names and a handy link to write a message to all authors CC’ing the reproducibility committee chair(s)
    • 000_reviews.html - the scientific reviews with reviewer names redacted - useful to consider to be prepared for upcoming changes in the submission and to check for comments on data, code, software, or even reproducibility
  5. Contact the author(s), e.g., using one of the templates below; in the first email it makes sense to use all authors’ contact emails that are mentioned in the paper, even if there is one “corresponding author” listed, because we do not have the time to wait for someone who may no longer be available, and it also helps with getting attention; in future communications, you can reply to the person who got back to you and kindly ask them to keep everyone who needs to be involved in the loop (or in CC).
  6. Conduct your reproducibility review and write the report
    • Don’t forget to take a look at the scientific reviews for comments on reproducibility; do not worry about the science or read the full paper, unless it really interests you
    • Check the authors of the submission and their affiliations - is there a relation (e.g., former colleague, current supervisor) that may be seen as inappropriate for you as a reproducibility reviewer? Is there a conflict of interest? If so, please contact the reproducibility chair, and ask the committee members via email whether another reproducibility reviewer is available to switch assignments
    • If code is available on GitHub or GitLab, please fork the project into the Reproducible AGILE organisation on GitHub or the GitLab subgroup “reviews”, respectively; ask Daniel for the permissions for the organisations
    • If need be, limit the review scope, e.g. reproduce only a specific figure; the reproducibility review should not take you longer (not counting computation time) than a scientific review, and even computation times should not extend beyond a working day
  7. Send the report to the original authors of the paper and add the reproducibility chair in CC, see template below;
  8. Add a new component to the OSF project for 2025 reproducibility reviews OR create a new module on ResearchEquals
    • On OSF
      • Use the European storage location, “Frankfurt”
      • Name the component Reproducibility review of: <FULL PAPER TITLE HERE>
      • Use license “CC-BY 4.0 (Free)”
      • Keep the project private until the publication of the paper (we don’t want to announce anything that is not our place to announce)
      • Add all contributors to the review to the project (do not add all members of the committee as contributors)
      • In the project configuration:
        • disable the “Wiki”, unless you add content to it
        • set the category of repository to “Other”
    • On ResearchEquals
      • Use the type “Reproducibility Report”
      • Keep the project as a draft until the publication of the paper (we don’t want to announce anything that is not our place to announce)
      • Add the reproducibility report as the main file after adding the paper’s final citation
      • Add any additional files as “Supporting files” to the module
      • Add datasets with a DOI to the “Reference list”
      • Add the final paper to the “Reference list” using the DOI
      • Add all contributors to the review to the module as authors (need to have an account on ResearchEquals)
    • On both
      • Copy the summary from the report to the description field of the project
      • Add a link to the project/record in the master spreadsheet and note the final expected DOI
      • Wait for final paper citation from publisher and add it to the report
      • Add the expected DOI to your report and to the coordination spreadsheet; for OSF, append the project ID of the OSF project in capitals to 10.17605/OSF.IO/ to guess the future DOI; for ResearchEquals, use the DOI displayed in the module draft
  9. After papers are published:
    • Upload the report as a single PDF document and include in it the final citation for your report and the full reference to the (to be) published paper provided by the publisher; add supplemental material created by you
    • Upload, if suitable/applicable, also the original material (add LICENSE.md and licensing information in the OSF project description in that case) if confirmed by the author
    • Double check that only the contributing people are listed as bibliographic contributors on the project (committee chairs are welcome to be non-bibliographic administrators for organisational purposes)
    • Publish the component/module
    • On OSF
      • Create a DOI (double check if it is correct in the report)
      • Wait for a hint from the reproducibility editor to create an “Open-ended” registration to snapshot the status of the OSF component/project (your project likely still needs the CODECHECK configuration file, see below in the chair instructions)
    • On ResearchEquals
      • Make a note to revisit the module after the publication of the article so you can add the published paper to the module’s “Reference list”
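The rule from step 8 for guessing the future OSF DOI (append the project ID in capitals to 10.17605/OSF.IO/) can be written as a small helper. This is only a sketch: the function name expected_osf_doi and the project ID abc12 are made-up placeholders.

```python
OSF_DOI_PREFIX = "10.17605/OSF.IO/"

def expected_osf_doi(project_id):
    """Build the expected DOI from an OSF project ID, as described in step 8."""
    return OSF_DOI_PREFIX + project_id.strip().upper()

# "abc12" is a placeholder, not a real review project.
print(expected_osf_doi("abc12"))  # 10.17605/OSF.IO/ABC12
```

Since this is a guess, double check the DOI against the one OSF actually creates before publishing the report.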

Reproducibility chair instructions

  1. Create a repository like this one based on the previous year
  2. Screen the accepted papers
  3. Distribute the reproductions amongst the committee members
  4. Support during reproductions, give feedback, advise on issues, …
  5. After publication of the reports
    • Add reports on ResearchEquals to the collections Reproducible AGILE and CODECHECK
    • Coordinate on the creation of codecheck.yml configuration files
    • Add the reports to the CODECHECK register
    • After adding the reports to the CODECHECK register successfully, send a note to the reproducibility reviewers that they can create the Registration
    • “Archive” the projects on GitHub so that they become read-only (instructions for GitHub, instructions for GitLab)
    • Double check all OSF projects have a DOI and a registration

Email templates for contacting authors

Reminder DASA by reproducibility chair

Dear <AUTHORS>,

I'm contacting you as the corresponding author of the paper "<TITLE>" submitted to AGILE 2025.

In my screening of accepted papers, I saw that your submission does not include a Data and Software Availability ("DASA") section. Please note that a DASA section with precisely that name is mandatory. Furthermore, a successful reproduction of your workflow would be an advertisement for your paper.

Please provide the DASA section by the end of the week so we can start the reproducibility review.

Regards,
Carlos, Frank & Daniel

AGILE Reproducibility Committee Co-chairs 2025

Reminder DASA

Dear <AUTHORS>,

I'm contacting you as the corresponding author of the paper "<TITLE>" submitted to AGILE 2025.
I'm the reproducibility reviewer your paper has been assigned to.

The scientific reviewers have noted that your paper does not include a Data and Software Availability ("DASA") section.
Please note that a DASA section is mandatory and successful reproduction of your workflow would be an advertisement for your paper.

Please provide the DASA section by <DEADLINE> so we can start the reproducibility review.

Regards,
<NAME>

AGILE Reproducibility Committee 2025

Reminder DASA + synthetic data for proprietary data

Dear <AUTHORS>,

I'm contacting you as the corresponding author of the paper "<TITLE>" submitted to AGILE 2025.
I'm the reproducibility reviewer your paper has been assigned to.

The scientific reviewers <SELECT: have, have not> noted that your paper does not include a dedicated Data and Software Availability ("DASA") section. This section should provide a concise statement on whether and where data and software are available, or why they are not public. Please note that a DASA section is mandatory, even if data or code is not available.
Refer to the AGILE Reproducible Paper Guidelines (https://osf.io/cb7z8/) for detailed information and possible DASA section statements. Please don't hesitate to get in touch with me if you have any questions!

In your manuscript you state that both code and data cannot be shared due to licensing issues. Is it possible for you to provide a synthetic dataset or subset and the code in order for us to reproduce your methodology?

Kind regards,
<NAME>

AGILE Reproducibility Committee 2025

Initial contact

Dear <AUTHORS>,

Congratulations on the acceptance of your submission "<TITLE>" as an AGILE 2025 full paper.

I am assigned as reproducibility reviewer to your paper and would kindly ask you to share a non-anonymised version of your manuscript, or simply share the redacted URL from your manuscript. It would be great to get this as soon as possible to begin the evaluation of the reproducibility.

The reproducibility review is an optional but highly recommended review step before publication, because in case of a successful reproduction of your work, the paper receives an AGILE reproducibility badge (added to the actual publication on the publisher’s site) and chances are increased that readers will engage with (and cite) your work. All reproducibility reviews have to happen in a narrow time window because the conference proceedings have to be ready for the conference. For more information, I refer you to the reproducibility guidelines of the conference.
If you have any questions regarding AGILE's reproducibility review, please do not hesitate to ask!

Kind regards,
<NAME>

AGILE Reproducibility Committee 2025

Share report draft

Dear <AUTHORS>,

Congratulations on the acceptance of your submission "<TITLE>" as a full paper at the AGILE conference 2025.

As part of the Reproducible AGILE initiative (https://reproducible-agile.github.io/), I attempted to reproduce the results from your paper. Attached to this email you will find my report on your results. I welcome your feedback before I publish the report.
The reproducibility report will be published soon after the paper is published by Copernicus, so we can ensure proper linking of your work in the report (using a DOI) and vice versa.

[OPTIONAL:] Alongside the report I would like to publish an archive of the used data and script files, and the output files generated by myself. Note these would be published under a CC-BY license on OSF, though the original source and license are noted in the report.

Please don't hesitate to get in touch with me and Daniel Nüst (CC'ed), AGILE conference's Reproducibility Chair, if you have any questions. Please also include your coauthors in any further communication as you see fit.

Best regards,
<NAME>

AGILE Reproducibility Committee 2025

Report published

Dear <AUTHORS>,

Thank you for your participation in a real open science endeavour!

The reproducibility review report on your paper is now published at <DOI URL HERE>.

Please don't hesitate to get in touch with Daniel Nüst (CC'ed), AGILE conference's Reproducibility Chair, if you have any questions.

Best regards,
<NAME>

AGILE Reproducibility Committee 2025

Colophon

This document is licensed under a Creative Commons Attribution 4.0 International License. All contained code is licensed under the Apache License 2.0.

Runtime environment description:

## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8          
##  [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8      
##  [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8           LC_ADDRESS=en_US.UTF-8       
## [10] LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] tabulapdf_1.0.5-5   glue_1.8.0          rvest_1.0.4         xml2_1.3.7          httr_1.4.7         
##  [6] kableExtra_1.4.0    googlesheets4_1.1.1 googledrive_2.1.1   quanteda_4.2.0      here_1.0.1         
## [11] wordcloud_2.6       RColorBrewer_1.1-3  tidytext_0.4.2      lubridate_1.9.4     forcats_1.0.0      
## [16] dplyr_1.1.4         purrr_1.0.4         readr_2.1.5         tidyr_1.3.1         tibble_3.2.1       
## [21] ggplot2_3.5.1       tidyverse_2.0.0     stringr_1.5.1       pdftools_3.5.0     
## 
## loaded via a namespace (and not attached):
##  [1] fastmatch_1.1-6   gtable_0.3.6      xfun_0.51         bslib_0.9.0       rJava_1.0-11     
##  [6] gargle_1.5.2      lattice_0.22-6    tzdb_0.4.0        vctrs_0.6.5       tools_4.4.2      
## [11] generics_0.1.3    curl_6.2.1        janeaustenr_1.0.0 pkgconfig_2.0.3   tokenizers_0.3.0 
## [16] Matrix_1.7-1      lifecycle_1.0.4   compiler_4.4.2    munsell_0.5.1     htmltools_0.5.8.1
## [21] SnowballC_0.7.1   sass_0.4.9        yaml_2.3.10       pillar_1.10.1     jquerylib_0.1.4  
## [26] rsconnect_1.3.3   cachem_1.1.0      stopwords_2.3     tidyselect_1.2.1  digest_0.6.37    
## [31] stringi_1.8.4     rprojroot_2.0.4   fastmap_1.2.0     grid_4.4.2        colorspace_2.1-1 
## [36] cli_3.6.4         magrittr_2.0.3    withr_3.0.2       scales_1.3.0      timechange_0.3.0 
## [41] rmarkdown_2.29    qpdf_1.3.4        cellranger_1.1.0  png_0.1-8         askpass_1.2.1    
## [46] hms_1.1.3         evaluate_1.0.3    knitr_1.49        viridisLite_0.4.2 rlang_1.1.5      
## [51] Rcpp_1.0.14       selectr_0.4-2     svglite_2.1.3     rstudioapi_0.17.1 jsonlite_1.9.1   
## [56] R6_2.6.1          systemfonts_1.1.0 fs_1.6.5