Namespace prefix a on extLst is not defined while manipulating Word files #1714

@tonyqus

Description

NPOI Version

  • 2.7.5
  • C# 12
  • Windows 11

File Type

  • Docx

Problem description

I use NPOI to manipulate Word documents. In particular, I replace paragraph text and delete and add runs in paragraphs. Most documents work fine, but occasionally the Word documents produced by my program are corrupted. If I rename the .docx file to .zip and open word/document.xml from the archive in a browser, I see the following error:

error on line [...] at column [...]: Namespace prefix a on extLst is not defined

Judging from the error message, the chunks of XML that cause the problem are those between <a:extLst> and </a:extLst>, for example:

<a:extLst><a:ext uri="{53640926-AAD7-44D8-BBD7-CCE9431645EC}"><a14:shadowObscured xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main" /></a:ext></a:extLst>

If I delete all such occurrences, the Word document can be opened again.
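
The parser error can be reproduced outside Word and the browser. The sketch below wraps the fragment quoted above in a hypothetical `<root>` element and parses it twice with Python's standard library: once with no `xmlns:a` declaration in scope, once with the prefix bound to the DrawingML main namespace. It only illustrates what the message means. That the `xmlns:a` declaration is missing from an ancestor element in the corrupted files is an assumption here, not something confirmed for NPOI.

```python
import xml.etree.ElementTree as ET

# Namespace that the "a" prefix conventionally maps to in DrawingML.
DRAWINGML_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"

# The fragment quoted in this issue, unchanged.
FRAGMENT = (
    '<a:extLst><a:ext uri="{53640926-AAD7-44D8-BBD7-CCE9431645EC}">'
    '<a14:shadowObscured '
    'xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main" />'
    '</a:ext></a:extLst>'
)


def wrap(fragment: str, declare_a: bool) -> str:
    """Put the fragment inside a hypothetical root, optionally binding 'a'."""
    decl = f' xmlns:a="{DRAWINGML_NS}"' if declare_a else ""
    return f"<root{decl}>{fragment}</root>"


def parses(xml_text: str) -> bool:
    """Return True if the text is namespace-well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses(wrap(FRAGMENT, declare_a=False)))  # False: prefix "a" is unbound
print(parses(wrap(FRAGMENT, declare_a=True)))   # True: prefix "a" is declared
```

If this matches what happens in the corrupted files, the `<a:extLst>` content itself is valid and only the declaration of the `a` prefix on an enclosing element is missing.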

Workaround

If I open document.xml in a text editor and remove every <a:extLst> ... </a:extLst> element, including all the text between the tags, the problem is fixed.
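
The manual workaround can be scripted. The following is a minimal sketch using only Python's standard library, not NPOI: it copies the .docx archive and strips every `<a:extLst>` element from word/document.xml on the way. The function names `strip_ext_lst` and `repair_docx` are made up for this example, and the code assumes that document.xml is UTF-8 encoded and that `<a:extLst>` elements are not nested inside one another.

```python
import re
import zipfile

# Matches a self-closing <a:extLst/> or a full <a:extLst>...</a:extLst> element.
# Assumption: extLst elements are never nested inside each other.
EXT_LST = re.compile(
    r"<a:extLst\b[^>]*/>|<a:extLst\b[^>]*>.*?</a:extLst>",
    re.DOTALL,
)


def strip_ext_lst(xml_text: str) -> str:
    """Remove every <a:extLst> element, including everything inside it."""
    return EXT_LST.sub("", xml_text)


def repair_docx(src_path: str, dst_path: str) -> None:
    """Copy a .docx, stripping <a:extLst> elements from word/document.xml."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                # Assumption: the part is UTF-8 encoded.
                data = strip_ext_lst(data.decode("utf-8")).encode("utf-8")
            # Passing the original ZipInfo keeps the name and compression type.
            dst.writestr(item, data)
```

A call such as `repair_docx("broken.docx", "fixed.docx")` (both file names are placeholders) writes a new file and leaves the original untouched. Like the manual edit, this discards whatever the extensions carried, so it is a stopgap, not a fix for the underlying bug.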

More information

Gemini PRO produced the following information:
This is a very specific piece of metadata introduced in newer versions of Microsoft Office (Word 2016 and later). It assigns a unique, persistent ID (the GUID inside id="{...}") to a graphical object, such as an image, shape, or text box. It helps modern versions of Word track that specific object during complex operations, such as co-authoring (when multiple people edit a doc at once) or track changes. It allows Word to say, "This image moved from page 1 to page 2" rather than "The image on page 1 was deleted and a new one appeared on page 2." This data is purely metadata for editing convenience in newer Word versions. It does not contain the image data itself, nor the positioning or formatting. If you delete these tags, Word will simply generate new IDs the next time it saves the file, or just ignore that they are missing.

Labels

  • bug
  • file_error
  • file format generation/writing issue
  • xwpf
