diff --git a/formatscaper/README.md b/formatscaper/README.md
index d5843151f93923a8ab6f369e6dcd5565fe4c0df0..2428cb3185502ef320dc5bd8d0459068327a7ebf 100644
--- a/formatscaper/README.md
+++ b/formatscaper/README.md
@@ -181,6 +181,34 @@ The results file (e.g. `results.yml`) contains information about each investigat
 Note that the contents of the ZIP archive are inspected as well, with `#` as the delimiter between the archive's filename and the contained file's name.
 
 
+## Generating an input file from Invenio
+
+The required information is relatively straightforward to generate using `invenio shell`:
+```python
+import yaml
+from invenio_rdm_records.proxies import current_rdm_records_service as svc
+
+# get all (published) records in the system
+rc = svc.record_cls
+recs = [rc(rm.data, model=rm) for rm in rc.model_cls.query.all()]
+
+# get the expected structure from the records
+record_files = [
+    {"record": r["id"], "filename": fn, "uri": entry.file.file.uri}
+    for r in recs
+    for fn, entry in r.files.entries.items()
+    if r.files.entries
+]
+
+# serialize the information as a YAML file
+with open("record-files.yml", "w") as f:
+    yaml.dump(record_files, f)
+```
+
+The above script lists every file associated with a published record; it does not consider unpublished drafts.
+Extending it to cover drafts as well is straightforward, though.
+
+
 ## Filtering results
 
 To filter results in the shell, you can use the command [`yq`](https://github.com/mikefarah/yq) (which is a [`jq`](https://github.com/jqlang/jq) wrapper for YAML documents).