CLI Reference#

This reference documents all available command-line options and commands for epub-utils.

Synopsis#

epub-utils [GLOBAL_OPTIONS] EPUB_FILE COMMAND [COMMAND_OPTIONS]

Global Options#

-h, --help: Show help message and exit
-v, --version: Show program version and exit
-pp, --pretty-print: Pretty-print XML output with proper indentation (applies to xml and raw formats only)

Commands#

All commands operate on an EPUB file and support the --format and --pretty-print options unless otherwise noted.

container#

Display the container.xml file contents.

Syntax:

epub-utils EPUB_FILE container [--format FORMAT] [--pretty-print]

Description: The container command shows the contents of META-INF/container.xml, which defines the location of the main package file within the EPUB.

Supported formats: xml (default), raw

Examples:

# Show container with syntax highlighting
epub-utils book.epub container

# Show raw container XML
epub-utils book.epub container --format raw

# Show container with pretty formatting
epub-utils book.epub container --pretty-print

# Combine both options
epub-utils book.epub container --format raw --pretty-print
epub-utils book.epub container --format raw

Sample output:

<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>

package#

Display the main package (OPF) file contents.

Syntax:

epub-utils EPUB_FILE package [--format FORMAT] [--pretty-print]

Description: The package command shows the complete OPF (Open Packaging Format) file, which contains metadata, manifest, and spine information.

Supported formats: xml (default), raw

Examples:

# Show package with syntax highlighting
epub-utils book.epub package

# Show raw package XML for processing
epub-utils book.epub package --format raw | xmllint --format -

# Show package with pretty formatting
epub-utils book.epub package --pretty-print

toc#

Display the table of contents file.

Syntax:

epub-utils EPUB_FILE toc [--format FORMAT] [--pretty-print] [--ncx | --nav]

Description: Shows the table of contents, which can be either an NCX file (EPUB 2.x) or a Navigation Document (EPUB 3.x). By default, automatically detects and uses the appropriate format for the EPUB version.

Options:

--ncx: Force retrieval of NCX file (EPUB 2 navigation control file). For EPUB 2, this is the same as the default behavior. For EPUB 3, this specifically accesses the NCX file if present for backward compatibility.
--nav: Force retrieval of Navigation Document (EPUB 3 navigation file). Only works with EPUB 3 documents that have a Navigation Document.

Note: The --ncx and --nav flags are mutually exclusive.

Supported formats: xml (default), raw

Examples:

# Show TOC with highlighting (auto-detect format)
epub-utils book.epub toc

# Extract navigation structure
epub-utils book.epub toc --format raw

# Show TOC with pretty formatting
epub-utils book.epub toc --pretty-print

# Force NCX format (EPUB 2 style)
epub-utils book.epub toc --ncx

# Force Navigation Document (EPUB 3 style)
epub-utils book.epub toc --nav

metadata#

Display metadata information from the package file.

Syntax:

epub-utils EPUB_FILE metadata [--format FORMAT] [--pretty-print]

Description: Extracts and displays Dublin Core and EPUB-specific metadata from the package file.

Supported formats: xml (default), raw, kv

Examples:

# Show formatted metadata
epub-utils book.epub metadata

# Get key-value pairs for scripting
epub-utils book.epub metadata --format kv

# Raw metadata XML
epub-utils book.epub metadata --format raw

# Show metadata with pretty formatting
epub-utils book.epub metadata --pretty-print

Key-value output format:

title: The Great Gatsby
creator: F. Scott Fitzgerald
language: en
identifier: urn:uuid:12345678-1234-1234-1234-123456789abc
publisher: Scribner
date: 2021-01-01
subject: Fiction, Classic Literature

manifest#

Display the manifest section from the package file.

Syntax:

epub-utils EPUB_FILE manifest [--format FORMAT] [--pretty-print]

Description: Shows the manifest, which lists all files included in the EPUB package with their IDs, file paths, and media types.

Supported formats: xml (default), raw

Examples:

# Show manifest with highlighting
epub-utils book.epub manifest

# Find all CSS files
epub-utils book.epub manifest --format raw | grep 'media-type="text/css"'

# Show manifest with pretty formatting
epub-utils book.epub manifest --pretty-print
epub-utils book.epub manifest --format raw | grep 'media-type="text/css"'

# Count content files
epub-utils book.epub manifest --format raw | grep -c 'application/xhtml+xml'

spine#

Display the spine section from the package file.

Syntax:

epub-utils EPUB_FILE spine [--format FORMAT] [--pretty-print]

Description: Shows the spine, which defines the default reading order of the book’s content.

Supported formats: xml (default), raw

Examples:

# Show spine with highlighting
epub-utils book.epub spine

# Extract reading order
epub-utils book.epub spine --format raw

# Show spine with pretty formatting
epub-utils book.epub spine --pretty-print

content#

Display the content of a document by its manifest item ID.

Syntax:

epub-utils EPUB_FILE content ITEM_ID [--format FORMAT] [--pretty-print]

Description: Extracts and displays the content of a specific document within the EPUB, identified by its manifest item ID.

Supported formats: xml (default), raw, plain

Arguments: - ITEM_ID: The ID of the item as defined in the manifest

Examples:

# Show content with syntax highlighting
epub-utils book.epub content chapter1

# Get raw HTML/XHTML
epub-utils book.epub content intro --format raw

# Extract plain text (no HTML tags)
epub-utils book.epub content chapter2 --format plain

# Show content with pretty formatting
epub-utils book.epub content chapter1 --pretty-print

Finding item IDs:

# First check the manifest for available IDs
epub-utils book.epub manifest | grep 'id='

# Then extract specific content
epub-utils book.epub content found_id --format plain

files#

List all files in the EPUB archive with metadata, or display content of a specific file.

Syntax:

epub-utils EPUB_FILE files [FILE_PATH] [--format FORMAT] [--pretty-print]

Description: When used without a file path, provides detailed information about all files contained within the EPUB archive, including sizes, compression ratios, and modification dates.

When used with a file path, displays the content of the specified file within the EPUB archive.

Supported formats:

For file listing: table (default), raw
For file content: raw, xml (default), plain, kv

Arguments: - FILE_PATH (optional): Path to a specific file within the EPUB archive

Supported formats: table (default), raw

Examples:

# List all files in table format (default)
epub-utils book.epub files

# Get simple file list
epub-utils book.epub files --format raw

# Count total files
epub-utils book.epub files --format raw | wc -l

# Display content of a specific XHTML file
epub-utils book.epub files OEBPS/chapter1.xhtml

# Display XHTML file in different formats
epub-utils book.epub files OEBPS/chapter1.xhtml --format raw
epub-utils book.epub files OEBPS/chapter1.xhtml --format xml --pretty-print
epub-utils book.epub files OEBPS/chapter1.xhtml --format plain

# Display non-XHTML files (CSS, etc.)
epub-utils book.epub files OEBPS/styles/main.css

Key differences from content command:

files uses file paths within the EPUB archive
content uses manifest item IDs
files can access any file, including CSS, XML, and image files
content only accesses files listed in the manifest

Sample table output:

File Information for book.epub
┌────────────────────────────────────────┬──────────┬──────────────┬─────────────────────┐
│ Path                                   │ Size     │ Compressed   │ Modified            │
├────────────────────────────────────────┼──────────┼──────────────┼─────────────────────┤
│ META-INF/container.xml                 │ 230 B    │ 140 B        │ 2021-01-01 10:00:00│
│ OEBPS/content.opf                      │ 2.1 KB   │ 856 B        │ 2021-01-01 10:00:00│
│ OEBPS/Text/chapter01.xhtml             │ 12.4 KB  │ 3.2 KB       │ 2021-01-01 10:00:00│
└────────────────────────────────────────┴──────────┴──────────────┴─────────────────────┘

Format Options#

Most commands support the --format and --pretty-print options to control output formatting:

xml (default for most commands): Syntax-highlighted, formatted XML output
raw: Unformatted content exactly as stored in the EPUB
kv (metadata command only): Key-value pairs suitable for shell scripting
plain (content command only): Plain text with HTML tags stripped
table (files command only): Formatted table with aligned columns

Pretty Print Option#

The --pretty-print (or -pp) option formats XML output with proper indentation and structure:

# Default output (with syntax highlighting but compact)
epub-utils book.epub metadata

# Pretty-printed output (with proper indentation)
epub-utils book.epub metadata --pretty-print

# Combine with raw format for clean, formatted XML
epub-utils book.epub package --format raw --pretty-print

Note: The pretty-print option applies to both xml and raw formats, but has no effect on kv, plain, or table formats.

Exit Codes#

epub-utils uses standard exit codes:

0: Success
1: General error (file not found, invalid EPUB, etc.)
2: Command line usage error

Examples can check exit codes for error handling:

if epub-utils book.epub metadata >/dev/null 2>&1; then
    echo "EPUB is valid"
else
    echo "EPUB has issues"
fi

Environment Variables#

epub-utils respects these environment variables:

NO_COLOR: Disable color output when set to any value
FORCE_COLOR: Force color output even when not outputting to a terminal

Examples:

# Disable colors
NO_COLOR=1 epub-utils book.epub metadata

# Force colors in pipes
FORCE_COLOR=1 epub-utils book.epub metadata | less -R

Common Usage Patterns#

Validation Workflow#

#!/bin/zsh
# validate-epub.sh - Basic EPUB validation

epub_file="$1"

echo "Validating: $epub_file"

# Check container
if ! epub-utils "$epub_file" container >/dev/null 2>&1; then
    echo "❌ Invalid container"
    exit 1
fi

# Check package
if ! epub-utils "$epub_file" package >/dev/null 2>&1; then
    echo "❌ Invalid package"
    exit 1
fi

# Check required metadata
metadata=$(epub-utils "$epub_file" metadata --format kv 2>/dev/null)
if ! echo "$metadata" | grep -q "^title:"; then
    echo "⚠️  Missing title"
fi

if ! echo "$metadata" | grep -q "^creator:"; then
    echo "⚠️  Missing author"
fi

echo "✅ EPUB structure is valid"

Metadata Extraction#

#!/bin/zsh
# extract-metadata.sh - Extract metadata to CSV

echo "filename,title,author,language,publisher" > metadata.csv

for epub in *.epub; do
    if [[ -f "$epub" ]]; then
        metadata=$(epub-utils "$epub" metadata --format kv 2>/dev/null)

        title=$(echo "$metadata" | grep "^title:" | cut -d' ' -f2- | tr ',' ';')
        author=$(echo "$metadata" | grep "^creator:" | cut -d' ' -f2- | tr ',' ';')
        language=$(echo "$metadata" | grep "^language:" | cut -d' ' -f2-)
        publisher=$(echo "$metadata" | grep "^publisher:" | cut -d' ' -f2- | tr ',' ';')

        echo "$epub,$title,$author,$language,$publisher" >> metadata.csv
    fi
done

Content Analysis#

#!/bin/zsh
# analyze-content.sh - Analyze EPUB content structure

epub_file="$1"

echo "Content Analysis for: $epub_file"
echo "=================================="

# Get content files from manifest
content_ids=$(epub-utils "$epub_file" manifest --format raw | \
             grep 'media-type="application/xhtml+xml"' | \
             sed 's/.*id="\([^"]*\)".*/\1/')

total_words=0

for content_id in $content_ids; do
    if word_count=$(epub-utils "$epub_file" content "$content_id" --format plain 2>/dev/null | wc -w); then
        echo "Content ID '$content_id': $word_count words"
        total_words=$((total_words + word_count))
    fi
done

echo "=================================="
echo "Total words: $total_words"

Error Handling#

Always handle errors when using epub-utils in scripts:

# Check if file exists first
if [[ ! -f "$epub_file" ]]; then
    echo "Error: File '$epub_file' not found" >&2
    exit 1
fi

# Capture and handle command errors
if ! output=$(epub-utils "$epub_file" metadata --format kv 2>&1); then
    echo "Error processing EPUB: $output" >&2
    exit 1
fi

# Check for specific issues
if [[ -z "$output" ]]; then
    echo "Warning: No metadata found" >&2
fi

Performance Tips#

Use raw format for large-scale processing to avoid syntax highlighting overhead
Pipe efficiently to avoid unnecessary intermediate files
Process files in parallel when handling many EPUBs
Cache results when running the same command multiple times

# Efficient parallel processing
find . -name "*.epub" | xargs -n 1 -P 4 -I {} \
    zsh -c 'echo "{}: $(epub-utils "{}" metadata --format kv | grep "^title:" | cut -d" " -f2-)"'

Troubleshooting#

Common Issues and Solutions#

“Invalid value for ‘PATH’: File does not exist”: Check the file path and ensure the EPUB file exists.
“ParseError: Unable to parse container.xml”: The EPUB file may be corrupted. Verify it’s a valid ZIP file.
“Content with id ‘X’ not found”: Check available content IDs using the manifest command first.
No color output: Ensure your terminal supports colors and check the NO_COLOR environment variable.
Large file performance: Use --format raw for better performance with large files.