epub-utils: EPUB Inspection and Manipulation#

epub-utils is a Python library and command-line tool for working with EPUB files. It provides both a programmatic API and a CLI for inspecting and parsing EPUB archives.

Note

epub-utils supports the EPUB 2.0.1 and EPUB 3.0+ specifications, which cover the vast majority of EPUB files in circulation.

Key Features#

Rich CLI
  • Syntax-highlighted XML output

  • Multiple output formats (XML, raw, key-value, plain text)

  • Comprehensive file inspection capabilities

Complete EPUB Support
  • Parse container.xml and package files

  • Extract and display table of contents

  • Access manifest and spine information

  • Retrieve document content by ID

Metadata Extraction
  • Dublin Core metadata support

  • EPUB-specific metadata fields

  • Key-value output for easy parsing

Python API
  • Clean, object-oriented interface

  • Lazy loading for performance

  • Comprehensive error handling
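
For readers new to the format, the sketch below shows what "parse container.xml and package files" involves at the archive level. It uses only the Python standard library and is not how epub-utils itself is implemented; the function names `find_package_path` and `read_dublin_core` are hypothetical. The file location and XML namespaces come from the EPUB specifications.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the EPUB container (OCF) and package (OPF) specifications.
NS = {
    "ocf": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def find_package_path(archive: zipfile.ZipFile) -> str:
    """Return the package document path listed in META-INF/container.xml."""
    container = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = container.find("ocf:rootfiles/ocf:rootfile", NS)
    if rootfile is None:
        raise ValueError("container.xml lists no rootfile")
    return rootfile.attrib["full-path"]


def read_dublin_core(archive: zipfile.ZipFile) -> dict:
    """Collect Dublin Core elements from the package metadata.

    Returns a dict mapping element names (e.g. 'title') to lists of values,
    because elements such as dc:creator may appear more than once.
    """
    package = ET.fromstring(archive.read(find_package_path(archive)))
    metadata = package.find("opf:metadata", NS)
    result = {}
    if metadata is None:
        return result
    prefix = "{" + NS["dc"] + "}"
    for element in metadata:
        if element.tag.startswith(prefix):
            name = element.tag[len(prefix):]
            result.setdefault(name, []).append((element.text or "").strip())
    return result
```

Opening a book with `zipfile.ZipFile("my-book.epub")` and passing it to `read_dublin_core` returns the raw Dublin Core values. In this sketch, the library's `Document` object plays the role of the archive handle and adds the table of contents, manifest, and spine on top.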

Quick Start#

Installation#

$ pip install epub-utils

Basic CLI Usage#

Inspect an EPUB file with a simple command:

# Display metadata with syntax highlighting
$ epub-utils my-book.epub metadata

# Show table of contents structure
$ epub-utils my-book.epub toc

# Get key-value metadata for scripting
$ epub-utils my-book.epub metadata --format kv
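
As a rough illustration of the scripting case, the key-value output can be consumed from Python. The sketch below assumes the output is one `key: value` pair per line; that layout is an assumption, not something confirmed here, so check the actual output and adjust the separator before relying on it. The helper names `parse_kv` and `read_metadata` are hypothetical.

```python
import subprocess


def parse_kv(text: str, separator: str = ":") -> dict:
    """Parse 'key<separator> value' lines into a dict of value lists.

    ASSUMPTION: one pair per line, split on the first separator only.
    Verify this against the real `--format kv` output.
    """
    pairs = {}
    for line in text.splitlines():
        key, found, value = line.partition(separator)
        if not found or not key.strip():
            continue  # skip blank lines and lines without a separator
        pairs.setdefault(key.strip(), []).append(value.strip())
    return pairs


def read_metadata(epub_path: str) -> dict:
    """Run the CLI command shown above and parse its standard output."""
    completed = subprocess.run(
        ["epub-utils", epub_path, "metadata", "--format", "kv"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return parse_kv(completed.stdout)
```

Values are kept in lists because a field such as the creator may occur more than once in a single book.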

Basic Python Usage#

from epub_utils import Document

# Load an EPUB document
doc = Document("path/to/book.epub")

# Access metadata easily
print(f"Title: {doc.package.metadata.title}")
print(f"Author: {doc.package.metadata.creator}")
print(f"Language: {doc.package.metadata.language}")

# Get table of contents
toc_xml = doc.toc.to_xml()
print(toc_xml)

Why epub-utils?#

epub-utils addresses a gap in the Python ecosystem for working with EPUB files: several libraries create EPUBs, but few focus on inspecting and analyzing them. Typical uses include:

Publishers and Authors
  • Validate EPUB structure and metadata before distribution

Digital Librarians
  • Batch process and analyze EPUB collections

Automation Scripts
  • Extract metadata for catalogs and databases

Debugging
  • Inspect malformed or problematic EPUB files

Learning
  • Understand EPUB structure and standards compliance
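
The batch-processing and catalog use cases might look like the sketch below. It relies only on the `Document(...).package.metadata` attributes shown in Basic Python Usage; the helper names, the CSV layout, and the broad `except Exception` are assumptions made for illustration, since the library's specific exception types are not described here.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable

FIELDS = ("title", "creator", "language")


def load_metadata(path: Path):
    """Return the metadata object via the API from Basic Python Usage."""
    # Imported lazily so this sketch can be loaded without epub-utils installed.
    from epub_utils import Document

    return Document(str(path)).package.metadata


def catalog_rows(paths: Iterable[Path], load: Callable = load_metadata):
    """Yield one dict per EPUB; files that fail to load get an 'error' value."""
    for path in paths:
        row = {"file": Path(path).name, "error": ""}
        try:
            metadata = load(path)
            for field in FIELDS:
                row[field] = getattr(metadata, field, "") or ""
        except Exception as exc:  # broad on purpose: exception types are not assumed
            row.update({field: "" for field in FIELDS})
            row["error"] = f"{type(exc).__name__}: {exc}"
        yield row


def write_catalog(folder: str, out_csv: str) -> None:
    """Write a CSV catalog for every .epub file directly inside folder."""
    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=("file", *FIELDS, "error"))
        writer.writeheader()
        writer.writerows(catalog_rows(sorted(Path(folder).glob("*.epub"))))
```

Recording failures in an `error` column instead of stopping keeps one malformed file from aborting a run over a large collection, and the same column doubles as a list of files worth inspecting by hand.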

License#

epub-utils is distributed under the Apache License 2.0.