The Daily Berlin

Berlin's Digital Archives Are Full of Duplicate Images — and Officials, Experts and Archivists Are Finally Talking About the Cost

From the Landesarchiv to Tempelhof's redevelopment files, redundant digital imagery is quietly draining public storage budgets and slowing access to records that residents actually need.

By Berlin News Desk · Published 4 July 2026, 8:28 pm

Photo: Bithell, Jethro, 1878-1962 / Public domain (Wikimedia Commons)

Berlin's public sector is sitting on a storage problem that nobody wants to put a number on publicly — duplicate digital images embedded inside planning documents, construction permits and heritage records across dozens of city departments. Administrators and archivists began comparing notes formally in May 2026 at a working group convened by the Senatsverwaltung für Stadtentwicklung, Bauen und Wohnen. The meeting, held at the Senate building on Württembergische Straße in Wilmersdorf, brought together IT staff from six districts to discuss what insiders have been calling a quiet budget leak.

The timing is not accidental. Berlin's housing crisis has pushed planning departments to digitise backlogs at speed. The rush to scan and upload everything — from Mitte rezoning applications to Neukölln building inspection photographs — created the duplication problem in the first place. When staff scan physical files in bulk, the same architectural photograph or site survey image can appear dozens of times across separate document packages. Multiply that across four years of digitisation drives and the redundant data runs into terabytes.

What Experts Are Flagging

Archivists at the Landesarchiv Berlin, based on Eichborndamm in Reinickendorf, have been vocal within professional circles about the need for deduplication tools that work across linked databases rather than within single file systems. According to presentations given at the Deutscher Archivtag conference in late 2025, the problem with municipal records is that the same scan might exist simultaneously on a district-level Bezirksamt server and in the centralised Berlin Open Data portal. Each copy consumes separate licensed storage and appears separately in search results, which degrades the quality of public-facing document searches.

Digital preservation specialists have pointed to the European Commission's interoperability framework for public administrations, updated in 2024, as a benchmark Berlin has yet to fully implement. That framework recommends hash-based duplicate detection — essentially a fingerprinting method — as standard procedure before any scanned document is formally archived. Berlin's current procurement contracts with its primary document management vendor, signed in 2023, did not include that requirement.
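To make the fingerprinting idea concrete, here is a minimal Python sketch of hash-based duplicate detection. It is an illustration only, not the framework's prescribed method or any Berlin system: the choice of SHA-256, the function names and the sample storage paths are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that serves as the file's fingerprint."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(records):
    """Group storage locations by fingerprint.

    `records` is an iterable of (location, data) pairs. The result maps each
    fingerprint that occurs more than once to the locations holding that content.
    """
    by_digest = defaultdict(list)
    for location, data in records:
        by_digest[fingerprint(data)].append(location)
    return {d: locs for d, locs in by_digest.items() if len(locs) > 1}


# Hypothetical sample: the same scan stored on a district server and a central portal.
records = [
    ("bezirksamt-server/permit-001/site.jpg", b"site survey image bytes"),
    ("open-data-portal/permit-001/site.jpg", b"site survey image bytes"),
    ("bezirksamt-server/permit-002/facade.jpg", b"facade photograph bytes"),
]

duplicates = find_duplicates(records)
for digest, locations in duplicates.items():
    print(digest[:12], locations)
```

A cryptographic hash of this kind only matches byte-identical files. Two separate scans of the same physical photograph would normally produce different digests, so catching those would need a different technique, such as perceptual hashing.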

Technologists at CityLAB Berlin, the city's publicly funded innovation lab in the former Tempelhof Airport terminal building on Tempelhofer Damm, have been prototyping a lightweight deduplication layer that could sit between scanning stations and the central archive. CityLAB has described the project on its website as early-stage, with no confirmed rollout date. The lab's previous work on the Berliner Open Data infrastructure gives it credibility in this space, though bridging the gap between a prototype and a working municipal procurement process is a different challenge entirely.
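As a rough picture of what a layer between scanner and archive might do, the Python sketch below checks each incoming scan against an index of content digests and stores only content it has not seen before. CityLAB has not published its design, so the class name `DedupGate`, the in-memory index and the `doc-N` identifiers are all hypothetical.

```python
import hashlib


class DedupGate:
    """Hypothetical ingest gate: archive new content once, reference repeats."""

    def __init__(self):
        self._index = {}   # content digest -> archive id of the first stored copy
        self._next_id = 1

    def ingest(self, data: bytes):
        """Return (archive_id, stored), where stored is False for a repeat upload."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._index:
            # Content already archived: hand back the existing id, store nothing.
            return self._index[digest], False
        archive_id = f"doc-{self._next_id}"
        self._next_id += 1
        self._index[digest] = archive_id
        return archive_id, True


gate = DedupGate()
first = gate.ingest(b"scan A")    # new content, gets archived
repeat = gate.ingest(b"scan A")   # identical bytes, resolves to the first copy
other = gate.ingest(b"scan B")    # different content, archived separately
print(first, repeat, other)
```

A real deployment would need a persistent index shared across districts instead of a per-process dictionary, which is part of why moving from prototype to procurement is hard.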

Budget Pressure Sharpens the Debate

Berlin's 2026 budget is under pressure from multiple directions — debt service, BVG public transport expansion and an ambitious social housing programme. Storage is not a headline item, but IT administrators across the Bezirksämter have noted in internal communications reviewed by The Daily Berlin that cloud storage costs for public records rose materially between 2023 and 2025 as digitisation volumes increased. No official consolidated figure has been published for city-wide archival storage expenditure.

The SPD-led Senate coalition has committed in its current governing agreement to a fully paperless planning process by 2028. Achieving that without first cleaning up the duplicate data problem would, according to digital governance researchers at the Technische Universität Berlin in Charlottenburg, simply replicate existing inefficiencies at greater scale and cost. TU Berlin's information systems faculty has flagged this risk in published academic work, though it has not been asked by the Senate to conduct a formal audit.

What happens next depends on decisions expected before the summer recess. The Senatsverwaltung working group is due to deliver a recommendation on deduplication procurement by the end of August 2026. District IT leads from Friedrichshain-Kreuzberg and Pankow have both indicated, through published meeting minutes, that they want any solution to cover not just future scans but existing archives going back to 2020. That retroactive scope would significantly increase both the technical complexity and the contract value of whatever solution the Senate eventually approves.

About this article

Published by The Daily Berlin

This article was produced by The Daily Berlin editorial desk and covers news in Berlin. See our editorial standards for how we use AI.
