Open Source Python Library for Converting PDF Files
Free Python API that allows developers to export, rotate, merge, and concatenate PDF files, and to extract data and elements from PDFs.
pdfrw is an open source, pure Python library that lets software developers read and write PDF files without installing any external software. The library is simple to use, and its source code is well documented and easy to understand. It includes proper Unicode support for text strings in PDFs, along with what the project describes as the fastest pure Python PDF parser.
pdfrw supports several important PDF operations, such as merging PDFs, modifying metadata, concatenating multiple PDFs, extracting images, printing PDFs, rotating pages, creating new PDFs, adding a watermark image, and more.
At A Glance
An overview of pdfrw features.
- Create PDF
- Edit PDF
- Splitting PDFs
- Merging PDFs
- Rotating PDFs
- Concatenating PDFs
- Embedding hyperlinks
- Insert circles
- Add complex shapes
- Unicode support
- Data extraction
- Text kerning
- Font embedding
- Encrypt PDF
- PDF form
- Embedding images
pdfrw is tested with Python 2.6, 2.7, 3.3, 3.4, 3.5, and 3.6.
- Python 2.6 & higher
Getting Started with pdfrw
pdfrw requires Python 2.6, 2.7, 3.3, 3.4, 3.5, or 3.6. You can install it with pip using the following command.
Install pdfrw via pip
python -m pip install pdfrw
Create PDF Documents via Python Library
The pdfrw library lets software developers create PDF documents inside their own Python applications with just a couple of lines of code. It also supports accessing and modifying existing PDF files. You can insert new pages, as well as graphics components or text elements, into an existing PDF. pdfrw can find the pages in the PDF files you read in and write a set of pages back out to a new PDF file.
Reading PDF Files via Python
The pdfrw library lets software developers access and read different parts of PDF documents inside Python applications, and it gives easy access to the entire document. The library supports retrieving file information, size, and more. The reader exposes a special attribute named pages, which lets users list all the pages of a PDF document. It also lets you extract a document information object that you can use to pull out details such as the author and title.
Adding or Modifying Metadata
pdfrw allows software developers to add or modify the metadata of PDF files inside their own Python applications. You can alter a single metadata item in a PDF and write the result to a new PDF. You can also take multiple files and concatenate them, adding your own metadata to the output PDF file.
Splitting PDF Documents
pdfrw allows software developers to programmatically split PDF documents inside their applications. A user may need to extract a specific part of a PDF book, or divide it into multiple PDFs instead of storing everything in one file. This is easy with pdfrw: you just need to provide the input PDF file path, the number of pages that you want to extract, and the output path.