Unzipping, editing and zipping ODT documents in Python

This is a Python script created with a single purpose: to test unzipping an OpenOffice.org (LibreOffice) word processor .odt document, searching its contents for a certain text, replacing that text with a substitute, and finally zipping everything back together to form a new .odt document.
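
The same three steps can also be sketched in memory, without a temporary directory. This is only a minimal sketch, not the script from this post: the function name `replace_in_odt` and its parameters are made up for illustration, and it assumes the text to replace lives in `content.xml`, which is where an ODT document keeps its body.

```python
import zipfile


def replace_in_odt(src_path, dst_path, token, replacement):
    """Copy the ODT at src_path to dst_path, replacing token in content.xml.

    Hypothetical helper for illustration; assumes content.xml is UTF-8.
    """
    with zipfile.ZipFile(src_path) as src:
        with zipfile.ZipFile(dst_path, "w") as dst:
            # infolist() keeps the original order, so "mimetype" stays first
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename == "content.xml":
                    text = data.decode("utf-8")
                    data = text.replace(token, replacement).encode("utf-8")
                # Reuse the compression method each entry had in the source
                dst.writestr(info.filename, data, info.compress_type)
```

A call might look like `replace_in_odt("/tmp/in.odt", "/tmp/out.odt", "TOKEN", "TEST SUCCESSFUL")`, where `TOKEN` and `/tmp/out.odt` are placeholder values. Note that a token split across formatting tags inside `content.xml` will not be found by a plain string replace.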

If you copy and paste this code, remember that HTML formatting may corrupt it, so please check it for any introduced mistakes. Pay special attention to the indentation, which is significant in Python.


# Just a test script
# Demonstrates unzipping, editing and zipping of ODT documents in python
# The source ODT file "in.odt" must exist in "/tmp"
# Every occurrence of the string token in the ODT file is replaced with the string replacement

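import os
import zipfile
import fileinput

# Example values; everything except "/tmp/in.odt" is an assumption, adjust as needed
zipSourceFileURL = "/tmp/in.odt" # The source document
zipOutFileURL = "/tmp/out.odt" # The resulting document (assumed name)
tmpDirURL = "/tmp/odt_tmp" # Directory for the extracted files (assumed name)
xmlFileURL = os.path.join(tmpDirURL, "content.xml") # ODT keeps the document text in content.xml
token = "TOKEN" # The text to search for (assumed value)
replacement = "TEST SUCCESSFUL"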
# Unzip ODT
print " -- Extracting ---------------------"
print "%s -> %s" % (zipSourceFileURL, tmpDirURL)
zipdata = zipfile.ZipFile(zipSourceFileURL)
zipdata.extractall(tmpDirURL)
# Find and replace tokens
print " -- Replacing -------------"
print xmlFileURL
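for line in fileinput.input(xmlFileURL, inplace=1):
    # With inplace=1, print writes back into the file; the trailing comma
    # keeps print from adding a second newline after each line
    print line.replace(token, replacement),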
# Zip contents of the temporary directory to ODT
# Use file list from the original archive
# This preserves the file structure in the new Zip file
# Most importantly, "mimetype" must be the first file in the archive
print " -- Compressing --------------------"
print "%s -> %s" % (tmpDirURL , zipOutFileURL)
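with zipfile.ZipFile(zipOutFileURL, 'w') as outzip:
    zipinfos = zipdata.infolist()
    for zipinfo in zipinfos:
        fileName = zipinfo.filename # The name and path as stored in the archive
        fileURL = os.path.join(tmpDirURL, fileName) # The actual name and path
        # Add the file under its archive name, keeping its original compression
        outzip.write(fileURL, fileName, zipinfo.compress_type)
zipdata.close()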
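
Since the comments above stress that "mimetype" has to be the first file in the archive, it can be worth checking the result before opening it in the office suite. The following is a rough sketch of such a check, not part of the original script: the function name `check_odt_layout` is invented here, and it only tests a few layout rules (first entry, stored without compression, expected content for a text document, presence of `content.xml`).

```python
import zipfile

# Media type of an OpenDocument text document (.odt)
ODT_MIMETYPE = b"application/vnd.oasis.opendocument.text"


def check_odt_layout(path):
    """Return a list of layout problems found in the archive (empty if none).

    Hypothetical helper for illustration; it does not validate the XML itself.
    """
    problems = []
    with zipfile.ZipFile(path) as odt:
        infos = odt.infolist()
        if not infos or infos[0].filename != "mimetype":
            problems.append('"mimetype" is not the first entry')
        else:
            if infos[0].compress_type != zipfile.ZIP_STORED:
                problems.append('"mimetype" is compressed')
            if odt.read("mimetype") != ODT_MIMETYPE:
                problems.append("unexpected mimetype content")
        if "content.xml" not in odt.namelist():
            problems.append('"content.xml" is missing')
    return problems
```

Calling `check_odt_layout("/tmp/out.odt")` (the path is a placeholder) returns an empty list when none of these problems is found, which makes it easy to use in a quick assertion after the script has run.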

