The Daily Berlin


How Berlin's City Archives Ended Up Drowning in Duplicate Scans — and What Comes Next

A slow accumulation of digitisation contracts, shifting software standards, and underfunded coordination has left Berlin's public record system clogged with redundant image files, and the reckoning is now unavoidable.

By Berlin News Desk · Published 4 July 2026, 9:26 pm


Photo: Nikita Pishchugin on Pexels

Berlin's Landesarchiv holds roughly 130 linear kilometres of documents. A growing share of that material has now been digitised twice, sometimes three times, by different contractors working under different Senate department budgets — producing thousands of duplicate image files that consume server space, complicate search results, and cost money to store. The problem, years in the making, is finally being treated as a structural failure rather than an administrative quirk.

The timing matters. The SPD-led Senate coalition has pushed digital government services as a centrepiece of its second-term agenda, and the flagship Berlin Service Portal — relaunched in updated form in early 2025 — depends partly on clean, searchable archival records feeding into planning databases, housing registries, and citizenship documentation. Duplicates don't just waste storage. They create genuine errors downstream, when automated systems flag conflicting file IDs or serve outdated scan versions to civil servants and residents requesting certified copies.

How the Duplication Grew

The roots trace back to the early 2010s, when individual Senate departments — Stadtentwicklung, Innenverwaltung, and the district administrations in Mitte and Friedrichshain-Kreuzberg among them — began commissioning their own scanning projects without a shared file-naming convention or a central deduplication registry. The Landesarchiv on Eichborndamm in Reinickendorf, which formally absorbed many of those collections, inherited the inconsistencies along with the files.

Each digitisation wave came with its own technical standard. TIFF at 300 DPI was common in the first generation. Later contractors switched to 400 DPI, then added JPEG derivatives for web access. When the Bezirksamt Tempelhof-Schöneberg migrated its civil registry scans to the Senate's central storage platform in 2022, auditors found that roughly 18 percent of documents had already been scanned under an earlier Senatsverwaltung contract. Similar audits in Pankow and Lichtenberg produced comparable figures, according to internal briefing materials discussed at a Berliner Datenschutzbeauftragter working group that year.
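One consequence of these shifting standards is that two scans of the same page, made at different resolutions, will not match byte for byte. The sketch below illustrates a common workaround, an "average hash" that compares coarse brightness patterns rather than raw bytes. It is purely illustrative: the article does not say whether any Berlin system uses this technique, and the function names and toy pixel grids are assumptions made up for the example.

```python
def shrink(pixels, size=4):
    """Reduce a grayscale grid to size x size by averaging equal blocks."""
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    small = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [
                pixels[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=4):
    """One bit per cell: 1 if the cell is brighter than the overall mean."""
    cells = [value for row in shrink(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    return tuple(1 if value > mean else 0 for value in cells)


def hamming(a, b):
    """Number of positions at which two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy "scans": the same page at two resolutions, plus a different page.
low_res = [[20] * 4 + [220] * 4 for _ in range(8)]  # 8 x 8
high_res = [[v for v in row for _ in range(2)] for row in low_res for _ in range(2)]  # 16 x 16
other_page = [[20] * 8 for _ in range(4)] + [[220] * 8 for _ in range(4)]

same_page_distance = hamming(average_hash(low_res), average_hash(high_res))
different_page_distance = hamming(average_hash(low_res), average_hash(other_page))
```

A small distance suggests the same underlying page, while a large one suggests different pages. Real scans would need a proper image library and a tuned threshold.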

The IT-Dienstleistungszentrum Berlin (ITDZ Berlin), the state's central IT service provider, has been tasked since 2023 with consolidating the city's archival storage infrastructure. That work exposed the duplication problem at scale for the first time. ITDZ handles data services for around 80,000 Berlin public-sector employees, and its storage audit, completed in late 2024, identified duplicate image replacement as a prerequisite for the broader government cloud migration scheduled for 2026 and 2027.

What the Fix Actually Requires

Deduplication is not simply a matter of deleting obvious copies. Archivists at the Landesarchiv argue — correctly, by professional standards — that two scans of the same physical document are not necessarily identical records. One may have been produced under better lighting conditions, or may carry a later quality-control certification. Choosing which file to keep, and which metadata chain to preserve, requires human review in a significant proportion of cases. That review costs time, and time costs money the archive has not historically had in abundance.

The Landesarchiv's annual budget sits below €10 million, a figure that has drawn repeated criticism from the Berliner Historische Kommission and from researchers at the Freie Universität Berlin's historical institute, who rely on the archive for access to urban planning records from the Weimar and postwar periods. A dedicated duplicate-image-replacement programme would require additional staff, purpose-built software tools, and a fixed timetable — none of which are yet formally budgeted in the 2026 Haushalt.

What the Senate has confirmed is that ITDZ Berlin will deploy an automated hash-matching tool across the consolidated storage environment before the end of this year, flagging probable duplicates for human review. The Landesarchiv is expected to receive a supplementary allocation, with a figure to be confirmed in the autumn Haushaltsberatungen (budget deliberations). District archives in Mitte and Neukölln have already begun preliminary internal audits to reduce the volume of flagged files before the centralised sweep begins.

For anyone who has submitted a document request and received contradictory scan versions (a not uncommon experience at civil registry offices along Karl-Marx-Straße or at the Bürgeramt on Müllerstraße), a cleaner system should eventually mean faster, more reliable service. The timeline, however, remains tied to budget negotiations that won't conclude until October.
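For readers curious what hash matching involves, here is a minimal sketch, assuming the simplest variant: files are grouped by a SHA-256 digest of their bytes, and any group with more than one member is queued for human review. The article does not specify which hash or workflow the ITDZ tool uses, so the function names and sample file paths below are hypothetical.

```python
import hashlib
from collections import defaultdict


def sha256_of(data: bytes) -> str:
    """Hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def flag_probable_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names by content digest; return groups with 2+ members."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[sha256_of(data)].append(name)
    # Nothing is deleted here: each group is only a candidate for human review.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


# Hypothetical sample data standing in for scanned image files.
scans = {
    "mitte/plan_0001.tif": b"scan-bytes-A",
    "senat/contract_2014/0001.tif": b"scan-bytes-A",
    "pankow/register_0042.tif": b"scan-bytes-B",
}

review_queue = flag_probable_duplicates(scans)
```

An exact digest like this only catches byte-identical copies. Two scans of the same page made at different resolutions would hash differently, which is one reason flagged and unflagged files alike may still need an archivist's judgement.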


About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
