Berlin's Senate Department for Urban Development and Housing confirmed this week that an internal audit of its digital planning archive has uncovered tens of thousands of duplicate image files — scanned documents and photographs that were uploaded multiple times across different storage systems over roughly two decades of piecemeal digitisation. The problem is not unique to one department. The Senate Chancellery at Pariser Platz and the Landesarchiv Berlin on Eichborndamm in Reinickendorf are both understood to be affected by overlapping records, created partly because different departments used incompatible scanning protocols before the city standardised its document management system in 2019.
The timing matters. Berlin is midway through a broader push to consolidate its public digital infrastructure under ITDZ Berlin, the city's IT service provider, which manages cloud storage and data services for the city's roughly 130,000 public employees. Redundant files are not merely an administrative nuisance: they consume server capacity, inflate storage costs, and complicate legal obligations under Germany's Archivgesetz, the federal and state archiving law that requires authentic, singular copies of public records to be preserved and accessible.
What the Review Process Looks Like
Two options are now on the table inside the Senate administration. The first is automated deduplication software, which identifies identical or near-identical image files using hash comparison and perceptual matching algorithms. ITDZ Berlin has piloted such tools in a limited capacity since late 2024, and the results have been broadly positive for straightforward duplicates. The problem is near-duplicates — slightly differently scanned versions of the same physical document, or photographs taken seconds apart at a planning inspection. Automated tools frequently flag these incorrectly, and a wrongful deletion of an authentic archival record could violate the Archivgesetz.
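The difference between the two matching techniques can be illustrated with a minimal Python sketch. It does not describe the tools ITDZ Berlin has piloted. The tiny pixel grids, the function names and the simple "average hash" used here are assumptions chosen for illustration only.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Exact-match fingerprint: identical bytes give identical digests."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Toy perceptual hash over a small grayscale grid (values 0-255).

    Each bit is 1 if the pixel is brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Two hypothetical scans of the same page: the second is slightly brighter.
scan_a = [[10, 200], [220, 30]]
scan_b = [[14, 205], [224, 33]]

# The raw bytes differ, so hash comparison alone does not match the pair.
exact_match = sha256_digest(bytes(sum(scan_a, []))) == sha256_digest(
    bytes(sum(scan_b, []))
)

# The perceptual hashes agree, so the pair surfaces as a near-duplicate candidate.
distance = hamming_distance(average_hash(scan_a), average_hash(scan_b))
```

In this sketch the exact comparison fails while the perceptual distance is zero. A perceptual match of this kind marks a candidate pair and is not proof that one file is redundant, which is the case the article describes as legally risky to delete automatically.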
The second route is a supervised manual review, department by department, using archivists from the Landesarchiv and specialist staff seconded from the Senate. This approach is slower — estimates circulating inside the administration suggest a full review could take until the end of 2027 at current staffing levels — but it carries far fewer legal risks. The Landesarchiv on Eichborndamm already handles roughly 30 linear kilometres of physical records, and its digital section has been under-resourced for years.
Costs are contested. ITDZ Berlin's published service catalogue lists enterprise-grade deduplication licensing at €80,000 to €200,000 annually, depending on storage volume. A hybrid approach, in which automated tools flag clear duplicates and ambiguous cases are routed to human reviewers, is now the option favoured by at least one internal working group, according to documents published under Berlin's transparency framework on the Berlin Open Data portal.
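The routing logic behind a hybrid approach could look roughly like the following Python sketch. The field names, the labels and the review threshold of 8 are hypothetical and are not taken from the working group's documents.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A pair of files that might be duplicates (all fields are illustrative)."""
    file_a: str
    file_b: str
    same_bytes: bool          # True if an exact hash comparison matched
    perceptual_distance: int  # e.g. differing bits between perceptual hashes


def route(candidate: Candidate, review_threshold: int = 8) -> str:
    """Decide which queue a candidate pair goes to.

    Only byte-identical pairs are flagged automatically. Visually similar
    pairs go to a human reviewer, and everything else is left untouched.
    """
    if candidate.same_bytes:
        return "auto_flag"
    if candidate.perceptual_distance <= review_threshold:
        return "human_review"
    return "keep_both"


# Hypothetical file names, for illustration only.
pairs = [
    Candidate("plan_0001.tif", "plan_0001_copy.tif", True, 0),
    Candidate("site_photo_a.jpg", "site_photo_b.jpg", False, 5),
    Candidate("permit_1998.tif", "map_2003.tif", False, 40),
]
decisions = [route(p) for p in pairs]
```

Nothing in this sketch deletes a file. The automated path only flags, so an archivist can still confirm a removal before any record is touched.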
Politics and the Path Forward
The SPD-led Senate coalition has made digital modernisation a stated priority, but housing and transport spending — particularly the ongoing BVG investment programme to expand U-Bahn capacity on the U2 and U3 lines — has absorbed most of the discretionary budget discussion in the first half of 2026. Archive infrastructure rarely generates the same political urgency as rent caps or new school buildings in Neukölln or Lichtenberg.
That dynamic may be shifting. The Abgeordnetenhaus, Berlin's state parliament, is scheduled to hold a hearing on digital records management in September 2026, and at least three members of the relevant committee have put written questions to the Senate about the duplicate file backlog since May. The answers, when they come, will likely define the pace and funding of whatever cleanup follows.
For Berliners dealing with planning enquiries, such as residents in Friedrichshain waiting on building permit responses or small businesses in Mitte trying to retrieve historical property documents, the practical consequence is straightforward: expect delays while the audit continues. Anyone with a time-sensitive request should contact the relevant borough office directly rather than relying on the centralised digital portal, which remains unreliable for some records dated before 2005. The Senate spokesperson's office said responses to standard archive requests are currently running two to three weeks beyond normal processing times.