This document includes scripts and text analysis to support the reproducibility review at the AGILE conference 2025, which is organised by the team of the Chair for Geoinformatics, TU Dresden, Germany.
Find out more online about reproducible
publications at AGILE and the review
process, and visit the Reproducible AGILE website: https://reproducible-agile.github.io/.
The code of this document is published on GitHub in the repository reproducible-agile/reviews-2025,
where you can inspect the R code in the file `agile-reproducibility-reviews.Rmd`
and find instructions for reproducing the workflow. The report parameter
`private_info` can be set to `yes` to show information that cannot be shared
publicly, such as author names, titles, or excerpts of submissions that were
not accepted, and to upload review files to private shares, which requires
authentication. The end of this document contains a hidden code snippet that
renders only the information accepted for publication as a GitHub page.
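
As a rough illustration of how a yes/no report parameter such as `private_info` can gate sensitive columns, consider the following minimal base R sketch. Only the parameter name and the value `yes` come from the text above; the helper `public_view()`, the column names, and the sample data are hypothetical and not taken from the repository.

```r
# Hypothetical stand-in for the R Markdown `params` list; a real report
# receives it from the YAML header or from rmarkdown::render(params = ...).
params <- list(private_info = "no")

# TRUE only when the parameter is explicitly set to "yes".
show_private <- identical(tolower(params$private_info), "yes")

# Drop the given private columns unless private output was requested.
public_view <- function(df, private_cols, show_private = FALSE) {
  if (show_private) {
    return(df)
  }
  df[, setdiff(names(df), private_cols), drop = FALSE]
}

# Made-up example data; the column names are assumptions for illustration.
submissions <- data.frame(
  id = c("fp_001", "fp_002"),
  type = "Full paper",
  authors = c("A. Author", "B. Author"),
  title = c("Title one", "Title two")
)

print(public_view(submissions, c("authors", "title"), show_private))
```

Rendering with private output could then look like `rmarkdown::render("agile-reproducibility-reviews.Rmd", params = list(private_info = "yes"))`, assuming the parameter is declared in the YAML header of the document.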
Retrieve all information about submissions from the EasyChair submissions system. The full submission information is not included in the public rendering of this report. Make sure that the shown columns in the submission table include the columns required in the code below.
type | n |
---|---|
Full paper | 23 |
The paper PDFs are downloaded from EasyChair directly using the links provided in the submission overview table.
The text is extracted from the PDFs and processed to create a tidy data
structure without stop words. The stop words include specific words that
might appear in the page header, abbreviations, and terms particular to
scientific articles, such as `figure`.
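
The idea of a tidy structure without stop words can be sketched in base R as follows. This is a simplified stand-in, not the report's actual code: the tokenizer, the tiny stop word list, and the sample texts are assumptions made for illustration.

```r
# Split raw text into lower-case word tokens (letters only, for simplicity).
tokenize <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z]+"))
  words[nzchar(words)]
}

# Assumed, deliberately tiny stop word list: general English words plus
# terms particular to scientific articles such as "figure".
stop_words <- c("the", "a", "of", "in", "is", "and", "figure", "table")

# One row per word occurrence: a tidy (long) structure keyed by paper id.
tidy_words <- function(texts) {
  tokens <- lapply(texts, tokenize)
  data.frame(
    id = rep(names(texts), lengths(tokens)),
    word = unlist(tokens, use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

# Made-up example texts.
texts <- c(
  fp_001 = "Figure 1 shows the road network of the city.",
  fp_002 = "The model is trained in a spatial data set."
)

all_words <- tidy_words(texts)
no_stop <- all_words[!all_words$word %in% stop_words, ]
print(no_stop)
```

Keeping one row per word occurrence makes later steps, such as counting words per paper or per stem, simple grouping operations.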
If there are issues with loading a PDF, you may try to convert it to
PostScript (`.ps`) and back, e.g., using a combination of
“Print to file > Postscript” in a PDF viewer and then `ps2pdf`
on the command line.
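
A scripted variant of this round trip might look like the sketch below. It replaces the manual "Print to file" step with Ghostscript's `pdf2ps`, which is a substitution and not the procedure described above. The function names and the `_fixed.pdf` naming scheme are hypothetical.

```r
# Derive an output name next to the input file (naming scheme is an assumption).
fixed_name <- function(pdf_in) {
  sub("\\.pdf$", "_fixed.pdf", pdf_in, ignore.case = TRUE)
}

# Convert PDF -> PostScript -> PDF with Ghostscript's wrapper scripts.
roundtrip_pdf <- function(pdf_in, pdf_out = fixed_name(pdf_in)) {
  tools <- Sys.which(c("pdf2ps", "ps2pdf"))
  if (any(tools == "")) {
    stop("pdf2ps and ps2pdf (Ghostscript) must be on the PATH")
  }
  ps_file <- tempfile(fileext = ".ps")
  on.exit(unlink(ps_file), add = TRUE)
  # system2() does not quote its args, so quote the file paths explicitly.
  status <- system2(tools[["pdf2ps"]], shQuote(c(pdf_in, ps_file)))
  if (status != 0) stop("pdf2ps failed for ", pdf_in)
  status <- system2(tools[["ps2pdf"]], shQuote(c(ps_file, pdf_out)))
  if (status != 0) stop("ps2pdf failed for ", pdf_in)
  pdf_out
}

print(fixed_name("paper.pdf"))
```

The converted file is written next to the original so that the unmodified submission is kept.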
23 papers successfully read. About 49 % of the words are considered stop words, i.e., about 51 % remain as non-stop words.
The following table shows how many words and non-stop words each
document has, sorted by number of non-stop words. The `id` is
built from the file name plus a prefix: for full papers, it is the
left-padded submission number and the prefix `fp_`.
type | words | non-stop words | % non-stop words |
---|---|---|---|
Full paper | 3589 | 1734 | 48.3 |
Full paper | 7289 | 3437 | 47.2 |
Full paper | 9312 | 4079 | 43.8 |
Full paper | 6904 | 3826 | 55.4 |
Full paper | 6989 | 3058 | 43.8 |
Full paper | 7051 | 3734 | 53.0 |
Full paper | 4716 | 2510 | 53.2 |
Full paper | 9439 | 5449 | 57.7 |
Full paper | 8463 | 4148 | 49.0 |
Full paper | 7612 | 4220 | 55.4 |
Full paper | 8629 | 4087 | 47.4 |
Full paper | 4636 | 2473 | 53.3 |
Full paper | 7489 | 3724 | 49.7 |
Full paper | 7687 | 4062 | 52.8 |
Full paper | 9111 | 4527 | 49.7 |
Full paper | 6746 | 3194 | 47.3 |
Full paper | 7432 | 4099 | 55.2 |
Full paper | 6236 | 2931 | 47.0 |
Full paper | 11050 | 5790 | 52.4 |
Full paper | 9466 | 4666 | 49.3 |
Full paper | 5844 | 3101 | 53.1 |
Full paper | 6702 | 3621 | 54.0 |
Full paper | 6162 | 3347 | 54.3 |
Full papers: 23 | 168554 | 85817 | 50.9 |
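
The id scheme and the percentage column of the table above can be reproduced with a few lines of base R. This is a minimal sketch: the padding width of 3 and the function names are assumptions, while the numbers passed to `pct_non_stop()` are the totals from the table.

```r
# Build an id such as "fp_007"; the padding width of 3 is an assumption.
make_id <- function(submission, prefix = "fp_", width = 3) {
  paste0(prefix, formatC(submission, width = width, flag = "0", format = "d"))
}

# Share of non-stop words in percent, rounded to one decimal.
pct_non_stop <- function(words, non_stop) {
  round(100 * non_stop / words, 1)
}

print(make_id(c(7, 42)))
print(pct_non_stop(168554, 85817))  # totals row of the table above
```

The percentage is the share of non-stop words among all words, so the share of stop words is its complement.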
According to the AGILE Reproducible Paper Guidelines, all authors must add a Data and Software Availability (DASA) section to their paper. The detection of this section naturally relies on the loaded texts including stop words.
22 papers have the section in question, that is 96 % of all submissions. Here are the statistics per submission type:
type | submissions | with DASA | % |
---|---|---|---|
Full paper | 23 | 22 | 95.7 |
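
Detecting the section can be approximated with a case-insensitive pattern match on the full text, as in the sketch below. The regular expression, the function name, and the sample texts are assumptions. The report's own pattern may be stricter or looser.

```r
# Case-insensitive pattern for the section name; tolerates "&" for "and"
# and line breaks between the words (an assumption about PDF extraction).
dasa_pattern <- "data\\s+(and|&)\\s+software\\s+availability"

has_dasa <- function(text) {
  grepl(dasa_pattern, text, ignore.case = TRUE, perl = TRUE)
}

# Made-up example texts.
texts <- c(
  fp_001 = "5 Conclusion. Data and Software Availability\nAll code is online.",
  fp_002 = "5 Conclusion. Acknowledgements. References.",
  fp_003 = "6 Data and\nSoftware availability The data are archived."
)

found <- has_dasa(texts)
summary_row <- data.frame(
  submissions = length(found),
  with_dasa = sum(found),
  pct = round(100 * mean(found), 1)
)
print(summary_row)
```

Because stop words such as "and" are part of the section name, the match has to run on the unfiltered text.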
Some papers might have slightly different terms because of human error or misinterpretation of the template or the guidelines.
0 papers lack a DASA section but contain possibly related sections. The following excerpts show the similar patterns, which were matched anywhere in the text, not just in headlines.
id | type | excerpt |
---|---|---|
NA | NA | NA |
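
Excerpts around similar wordings could be collected as sketched here. The alternative headings in `related_pattern`, the context width, and the function name are assumptions for illustration and not the patterns used in the report.

```r
# Assumed alternative headings; the patterns used by the report may differ.
related_pattern <-
  "(data|code|software)\\s+(and\\s+(code|software)\\s+)?availability"

# Return the first match with up to `context` characters on either side,
# or NA if the pattern does not occur in the text.
excerpt <- function(text, pattern = related_pattern, context = 20) {
  m <- regexpr(pattern, text, ignore.case = TRUE, perl = TRUE)
  pos <- as.integer(m)
  if (pos == -1) {
    return(NA_character_)
  }
  start <- max(1, pos - context)
  end <- min(nchar(text), pos + attr(m, "match.length") - 1 + context)
  substr(text, start, end)
}

print(excerpt("All scripts are online. Code availability: see the repository."))
print(excerpt("No such statement here."))
```

A text without any match yields `NA`, which is consistent with a table that contains only `NA` when no paper qualifies.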
For the following table and figure, the word stems were extracted
with a stemming algorithm from the package `quanteda`.
The word cloud is based on 156 unique words that each occur at least 100
times. Together they occur 28663 times, which is 33 % of all non-stop
words.
place | wordstem | n | # papers | % papers |
---|---|---|---|---|
1 | data | 884 | 23 | 100 |
2 | model | 679 | 23 | 100 |
3 | urban | 586 | 20 | 87 |
4 | spatial | 571 | 23 | 100 |
5 | map | 564 | 20 | 87 |
6 | locat | 487 | 22 | 96 |
7 | transport | 469 | 11 | 48 |
8 | citi | 456 | 20 | 87 |
9 | studi | 418 | 22 | 96 |
10 | analysi | 400 | 23 | 100 |
11 | inform | 372 | 22 | 96 |
12 | time | 361 | 22 | 96 |
13 | result | 357 | 23 | 100 |
14 | public | 344 | 18 | 78 |
15 | access | 334 | 20 | 87 |
16 | geograph | 331 | 23 | 100 |
17 | network | 326 | 19 | 83 |
18 | road | 322 | 13 | 57 |
19 | featur | 321 | 16 | 70 |
20 | set | 296 | 19 | 83 |
Wordstem cloud of AGILE 2025 full paper submissions
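
The columns of the word stem table (count, number of papers, share of papers) and the share covered by the word cloud can be derived from a long table of stems, as in this base R sketch. The stemming itself is not shown, and the toy data and the lowered frequency threshold are assumptions.

```r
# Long table of stemmed non-stop words, one row per occurrence (made-up data).
stems <- data.frame(
  id = c("fp_001", "fp_001", "fp_001", "fp_002", "fp_002", "fp_003"),
  wordstem = c("data", "data", "urban", "data", "model", "model"),
  stringsAsFactors = FALSE
)

n_papers <- length(unique(stems$id))

# Total occurrences per stem.
n <- table(stems$wordstem)

# Number of distinct papers in which each stem occurs.
papers <- tapply(stems$id, stems$wordstem, function(x) length(unique(x)))

top <- data.frame(
  wordstem = names(n),
  n = as.integer(n),
  papers = as.integer(papers[names(n)]),
  stringsAsFactors = FALSE
)
top$pct_papers <- round(100 * top$papers / n_papers)
top <- top[order(-top$n, top$wordstem), ]
top$place <- seq_len(nrow(top))

# Stems that pass a minimum frequency, and their share of all occurrences.
min_freq <- 2  # the report uses 100; lowered here for the toy data
frequent <- top[top$n >= min_freq, ]
share <- round(100 * sum(frequent$n) / nrow(stems))

print(top)
print(share)
```

With the figures reported above, the same share calculation corresponds to 28663 occurrences out of 85817 non-stop words.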
The assignment of reviews is done via a privately shared spreadsheet,
to handle potential non-public comments. The main outcome of the reviews
is a report, which is published in individual OSF projects as
components of the OSF project for the
reproducibility reviews 2025. The report should be based on a
template from this repository in `report-template`.
`000` as example:

- `000.pdf` - the submission manuscript (may be updated when authors upload new reports, ideally though you receive revisions directly from the authors)
- `000_authors.html` - an HTML file with author names and a handy link to write a message to all authors CC’ing the reproducibility committee chair(s)
- `000_reviews.html` - the scientific reviews with reviewer names redacted - useful to consider to be prepared for upcoming changes in the submission and to check for comments on data, code, software, or even reproducibility

`Reproducibility review of: <FULL PAPER TITLE HERE>`

`10.17605/OSF.IO/` to guess the future DOI; for ResearchEquals, use the DOI displayed in the module draft

`LICENSE.md` and licensing information in the OSF project description in that case) if confirmed by the author

`codecheck.yml` configuration files

This document is licensed under a Creative Commons Attribution 4.0 International License. All contained code is licensed under the Apache License 2.0.
Runtime environment description:
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
## [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8 LC_ADDRESS=en_US.UTF-8
## [10] LC_TELEPHONE=en_US.UTF-8 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] tabulapdf_1.0.5-5 glue_1.8.0 rvest_1.0.4 xml2_1.3.7 httr_1.4.7
## [6] kableExtra_1.4.0 googlesheets4_1.1.1 googledrive_2.1.1 quanteda_4.2.0 here_1.0.1
## [11] wordcloud_2.6 RColorBrewer_1.1-3 tidytext_0.4.2 lubridate_1.9.4 forcats_1.0.0
## [16] dplyr_1.1.4 purrr_1.0.4 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1
## [21] ggplot2_3.5.1 tidyverse_2.0.0 stringr_1.5.1 pdftools_3.5.0
##
## loaded via a namespace (and not attached):
## [1] fastmatch_1.1-6 gtable_0.3.6 xfun_0.51 bslib_0.9.0 rJava_1.0-11
## [6] gargle_1.5.2 lattice_0.22-6 tzdb_0.4.0 vctrs_0.6.5 tools_4.4.2
## [11] generics_0.1.3 curl_6.2.1 janeaustenr_1.0.0 pkgconfig_2.0.3 tokenizers_0.3.0
## [16] Matrix_1.7-1 lifecycle_1.0.4 compiler_4.4.2 munsell_0.5.1 htmltools_0.5.8.1
## [21] SnowballC_0.7.1 sass_0.4.9 yaml_2.3.10 pillar_1.10.1 jquerylib_0.1.4
## [26] rsconnect_1.3.3 cachem_1.1.0 stopwords_2.3 tidyselect_1.2.1 digest_0.6.37
## [31] stringi_1.8.4 rprojroot_2.0.4 fastmap_1.2.0 grid_4.4.2 colorspace_2.1-1
## [36] cli_3.6.4 magrittr_2.0.3 withr_3.0.2 scales_1.3.0 timechange_0.3.0
## [41] rmarkdown_2.29 qpdf_1.3.4 cellranger_1.1.0 png_0.1-8 askpass_1.2.1
## [46] hms_1.1.3 evaluate_1.0.3 knitr_1.49 viridisLite_0.4.2 rlang_1.1.5
## [51] Rcpp_1.0.14 selectr_0.4-2 svglite_2.1.3 rstudioapi_0.17.1 jsonlite_1.9.1
## [56] R6_2.6.1 systemfonts_1.1.0 fs_1.6.5
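
A runtime description in this format can be produced with base R's `sessionInfo()`. The sketch below also writes a plain-text copy. The file name and location are assumptions for illustration.

```r
# Capture the runtime environment: R version, platform, locale, packages.
si <- sessionInfo()
print(si)

# Optionally keep a plain-text copy, e.g. next to the rendered report
# (a temporary directory is used here; the file name is an assumption).
out_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(print(si)), out_file)
```

Storing the output alongside the report helps others compare package versions when a result cannot be reproduced.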