Free C PDF Library for Reading & Writing PDF Files

A simple, open source, C-based PDF library that lets you read, write, and manipulate PDF documents. It supports image-to-PDF conversion, content extraction, stream-based PDF reading, and more inside C apps.

What is PDFio Library?

PDFio is a lightweight, open-source C library designed specifically for reading and writing PDF files without the complexity of rendering or viewing. Created by Michael R Sweet and licensed under the Apache License Version 2.0, the library gives developers a straightforward approach to PDF manipulation that focuses on content creation, extraction, and transformation rather than visual presentation. It is easy to work with and can serve as the basis for many PDF processing tasks, such as page extraction and merging, automated reports from databases, watermarking and stamping, certificate and diploma creation, font embedding for archival, metadata modification, and more.

PDFio focuses on the essential I/O operations, making it a good fit for applications that need to extract data from PDFs or create new ones without the overhead of more complex libraries. The library supports conversion features that enable software developers to convert Markdown to PDF, image to PDF, and HTML to PDF, as well as generate barcodes in PDFs inside C applications. Because it does not render pages, it is well suited to server-side applications, command-line tools, and embedded systems where PDF creation or processing is needed without visual display. Whether you're building document management systems, creating automated report generators, or developing tools for PDF analysis, PDFio offers the core functionality needed for PDF handling in C applications.


Getting Started with PDFio

The recommended way to install PDFio is via NuGet. Use the following command to install it.

Install PDFio via NuGet

Install-Package pdfio_native 

Install PDFio via GitHub

git clone https://github.com/michaelrsweet/pdfio.git

You can also install it manually by downloading the latest release files directly from the GitHub repository.

Creating a PDF from Scratch via C

The open source PDFio library includes complete support for creating a new PDF document from scratch inside C applications. With just a few lines of C code, software developers can add new pages, draw text and simple shapes, add images (JPEG and PNG formats), apply colors and transformations, build tables and structured layouts, and much more. Here is a simple example that demonstrates how to create a new PDF document with a custom page size and content using C code.

How to Create a Simple PDF Document with Custom Page Size & Content via C Library?

#include <stdio.h>
#include <pdfio.h>
#include <pdfio-content.h>

int main(void) {
    pdfio_file_t *pdf;
    pdfio_dict_t *dict;
    pdfio_stream_t *page;
    pdfio_obj_t *font;
    pdfio_rect_t media_box = {0.0, 0.0, 612.0, 792.0};  // US Letter
    pdfio_rect_t crop_box = {36.0, 36.0, 576.0, 756.0}; // 0.5" margins
    
    // Create PDF file
    pdf = pdfioFileCreate("output.pdf", "2.0", 
                          &media_box, &crop_box,
                          NULL, NULL);
    
    if (!pdf) {
        fprintf(stderr, "Unable to create PDF\n");
        return 1;
    }
    
    // Create a font object (Helvetica)
    font = pdfioFileCreateFontObjFromBase(pdf, "Helvetica");
    
    // Create page dictionary
    dict = pdfioDictCreate(pdf);
    pdfioPageDictAddFont(dict, "F1", font);
    
    // Create the page
    page = pdfioFileCreatePage(pdf, dict);
    
    // Draw text
    pdfioContentSetFillColorDeviceGray(page, 0.0);  // Black color
    pdfioContentTextBegin(page);
    pdfioContentSetTextFont(page, "F1", 24.0);
    pdfioContentTextMoveTo(page, 100.0, 700.0);
    pdfioContentTextShow(page, false, "Hello, PDFio!");
    pdfioContentTextEnd(page);
    
    // Close page and file
    pdfioStreamClose(page);
    pdfioFileClose(pdf);
    
    printf("PDF created successfully!\n");
    return 0;
}
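
The media box and crop box in the example above are given in PDF points (72 per inch). For paper sizes specified in millimetres or inches, a small conversion helper can keep the numbers readable. The following is a minimal sketch, not part of PDFio: `rect_t` is a hypothetical stand-in for `pdfio_rect_t` so the code compiles without the library, and the helper names `mm_to_points`, `inches_to_points`, `media_box_from_mm`, and `inset_box` are assumptions made for illustration.

```c
/* PDF user space defaults to 72 units ("points") per inch. */
#define POINTS_PER_INCH 72.0
#define MM_PER_INCH     25.4

/* Hypothetical stand-in for pdfio_rect_t (same x1/y1/x2/y2 layout). */
typedef struct {
    double x1, y1, x2, y2;
} rect_t;

/* Convert millimetres to PDF points. */
static double mm_to_points(double mm) {
    return mm * POINTS_PER_INCH / MM_PER_INCH;
}

/* Convert inches to PDF points. */
static double inches_to_points(double inches) {
    return inches * POINTS_PER_INCH;
}

/* Build a media box with its origin at (0, 0) from a paper size in mm. */
static rect_t media_box_from_mm(double width_mm, double height_mm) {
    rect_t box;

    box.x1 = 0.0;
    box.y1 = 0.0;
    box.x2 = mm_to_points(width_mm);
    box.y2 = mm_to_points(height_mm);
    return box;
}

/* Shrink a box by the same margin (in points) on all four sides. */
static rect_t inset_box(rect_t box, double margin) {
    rect_t inner;

    inner.x1 = box.x1 + margin;
    inner.y1 = box.y1 + margin;
    inner.x2 = box.x2 - margin;
    inner.y2 = box.y2 - margin;
    return inner;
}
```

For instance, `media_box_from_mm(210.0, 297.0)` gives an A4 box of roughly 595.28 x 841.89 points, and insetting the US Letter box by `inches_to_points(0.5)` reproduces the 36-point crop box used above.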

Image to PDF Conversion via C Library

Converting images to PDF format is one of the most common document processing tasks. The PDFio library makes this straightforward while providing fine control over page sizing, image placement, scaling, and metadata. Software developers can build a complete image-to-PDF converter that handles both single images and batch conversions. The following example demonstrates how developers can convert a single image to a PDF inside C apps.

How to Convert Single Image to a PDF via C Library?

#include <stdio.h>
#include <pdfio.h>
#include <pdfio-content.h>

bool create_pdf_from_image(const char *image_path, 
                           const char *pdf_path,
                           const char *title) {
    pdfio_file_t *pdf;
    pdfio_obj_t *image;
    pdfio_dict_t *page_dict;
    pdfio_stream_t *page_stream;
    pdfio_rect_t media_box;
    double img_width, img_height;
    double page_width, page_height;
    
    // Create the output PDF first; the image object must belong to a PDF file.
    // This default US Letter media box is overridden for the page below.
    pdfio_rect_t temp_box = {0, 0, 612, 792};
    pdf = pdfioFileCreate(pdf_path, "2.0", &temp_box, NULL, NULL, NULL);
    
    if (!pdf) {
        fprintf(stderr, "Error: Cannot create PDF file\n");
        return false;
    }
    
    // Set PDF metadata
    if (title) {
        pdfioFileSetTitle(pdf, title);
    }
    pdfioFileSetAuthor(pdf, "PDFio Image Converter");
    pdfioFileSetCreator(pdf, "image2pdf v1.0");
    
    // Load and embed the image
    image = pdfioFileCreateImageObjFromFile(pdf, image_path, true);
    
    if (!image) {
        fprintf(stderr, "Error: Cannot load image file: %s\n", image_path);
        pdfioFileClose(pdf);
        return false;
    }
    
    // Get image dimensions in pixels
    img_width = pdfioImageGetWidth(image);
    img_height = pdfioImageGetHeight(image);
    
    printf("Image dimensions: %.0f x %.0f pixels\n", img_width, img_height);
    
    // Create page sized to image (assuming 72 DPI)
    // This creates a PDF page that exactly fits the image
    page_width = img_width;
    page_height = img_height;
    
    media_box.x1 = 0.0;
    media_box.y1 = 0.0;
    media_box.x2 = page_width;
    media_box.y2 = page_height;
    
    // Create page dictionary and add image resource
    page_dict = pdfioDictCreate(pdf);
    pdfioDictSetRect(page_dict, "MediaBox", &media_box);
    pdfioPageDictAddImage(page_dict, "IM1", image);
    
    // Create the page
    page_stream = pdfioFileCreatePage(pdf, page_dict);
    
    // Draw image at full size (no scaling)
    pdfioContentDrawImage(page_stream, "IM1", 0, 0, 
                         page_width, page_height);
    
    // Close and save
    pdfioStreamClose(page_stream);
    pdfioFileClose(pdf);
    
    printf("Successfully created PDF: %s\n", pdf_path);
    return true;
}

int main(int argc, char *argv[]) {
    if (argc < 3) {
        fprintf(stderr, "Usage: %s <image_file> <output_pdf> [title]\n", 
                argv[0]);
        return 1;
    }
    
    const char *title = (argc > 3) ? argv[3] : NULL;
    
    if (!create_pdf_from_image(argv[1], argv[2], title)) {
        return 1;
    }
    
    return 0;
}
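
The example above sizes the page to the image and draws it without scaling. When the image should instead fit a fixed page size, the placement can be computed separately and the results passed as the x, y, width, and height arguments of the draw call. The following is a minimal sketch under that assumption: `placement_t` and `fit_image_to_page` are hypothetical names, not part of PDFio, and the code uses plain arithmetic only.

```c
/* Hypothetical result type: where and how large to draw the image. */
typedef struct {
    double x, y, width, height;
} placement_t;

/*
 * Scale an image to fit inside a page (minus a uniform margin) while
 * keeping its aspect ratio, then centre it on the page. All values are
 * in PDF points. Returns an all-zero placement for invalid input.
 */
static placement_t fit_image_to_page(double img_width, double img_height,
                                     double page_width, double page_height,
                                     double margin) {
    placement_t p = {0.0, 0.0, 0.0, 0.0};
    double avail_w = page_width - 2.0 * margin;
    double avail_h = page_height - 2.0 * margin;
    double scale_x, scale_y, scale;

    if (img_width <= 0.0 || img_height <= 0.0 ||
        avail_w <= 0.0 || avail_h <= 0.0)
        return p;  /* nothing sensible to place */

    /* The smaller ratio is the largest scale that fits both dimensions. */
    scale_x = avail_w / img_width;
    scale_y = avail_h / img_height;
    scale = scale_x < scale_y ? scale_x : scale_y;

    p.width = img_width * scale;
    p.height = img_height * scale;
    p.x = (page_width - p.width) / 2.0;
    p.y = (page_height - p.height) / 2.0;
    return p;
}
```

For a 1080 x 1080 image on a US Letter page (612 x 792) with 36-point margins, this yields a 540 x 540 drawing area positioned at (36, 126). Taking the minimum of the two ratios is what preserves the aspect ratio; using each ratio on its own axis would stretch the image.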
 

Read PDF & Extract Text via C Library

One of the most common tasks is reading and extracting text from a PDF file. The open source PDFio library makes this process intuitive by allowing developers to iterate through pages and their content streams. The library provides full support for extracting useful information from existing PDFs, such as simple text, metadata, embedded fonts, numbers, images, graphics, names, and more. The following example demonstrates how to open an existing PDF and extract basic information from it using the C library.

How to Extract Simple Text from a PDF File via C Library?

#include <stdio.h>
#include <string.h>
#include <pdfio.h>

int main(void) {
    pdfio_file_t *pdf;
    pdfio_obj_t *page;
    pdfio_stream_t *st;
    char token[1024];
    size_t i, num_streams;
    bool first = true;

    // Open the PDF file for reading (no password or error callbacks)
    pdf = pdfioFileOpen("example.pdf", NULL, NULL, NULL, NULL);
    if (!pdf) {
        fprintf(stderr, "Failed to open example.pdf\n");
        return 1;
    }

    printf("PDF has %lu pages.\n", (unsigned long)pdfioFileGetNumPages(pdf));

    // Get the first page (page indexes start at 0)
    page = pdfioFileGetPage(pdf, 0);
    if (!page) {
        fprintf(stderr, "Failed to read page 1\n");
        pdfioFileClose(pdf);
        return 1;
    }

    // A page can have several content streams; read each one token by token
    num_streams = pdfioPageGetNumStreams(page);
    for (i = 0; i < num_streams; i++) {
        st = pdfioPageOpenStream(page, i, true);  // true = decode the stream
        if (!st)
            continue;

        while (pdfioStreamGetToken(st, token, sizeof(token))) {
            if (token[0] == '(') {
                // Literal string; the text follows the leading '('
                if (first)
                    first = false;
                else if (token[1] != ' ')
                    putchar(' ');
                fputs(token + 1, stdout);
            } else if (!strcmp(token, "Td") || !strcmp(token, "TD") ||
                       !strcmp(token, "T*") || !strcmp(token, "'") ||
                       !strcmp(token, "\"")) {
                // Operators that move to the next line of text
                putchar('\n');
                first = true;
            }
        }

        pdfioStreamClose(st);
    }

    // Clean up
    pdfioFileClose(pdf);
    return 0;
}

Efficient, Stream-Based Reading

PDFio is designed for efficiency. When reading, it doesn't load the entire file into memory at once. Instead, the application pulls data from a page's content stream one token at a time, so a file can be processed in chunks. This matters when handling very large PDFs that would otherwise exhaust system memory. The pdfioStreamGetToken loop in the example above illustrates this stream-based model: it reads each content stream sequentially and handles every token as it arrives. Note that this simple approach only recovers text stored as literal strings in a standard 8-bit encoding; PDFs that use hex strings or custom font encodings need additional decoding.
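
The pull model described above can be illustrated without PDFio at all. The following is a minimal sketch that tokenizes an in-memory content stream on demand; `token_reader_t`, `next_token`, and `extract_text` are hypothetical names invented for this illustration, and the tokenizer is deliberately simplified (no escape sequences, nested parentheses, or hex strings). It is not how PDFio is implemented.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader state: a cursor over an in-memory content stream. */
typedef struct {
    const char *pos;
} token_reader_t;

/*
 * Pull the next token into buf. Literal strings "(...)" keep their leading
 * '(' and drop the closing ')'; other tokens are split on whitespace.
 * Returns false once the input is exhausted.
 */
static bool next_token(token_reader_t *r, char *buf, size_t bufsize) {
    size_t len = 0;

    if (bufsize == 0)
        return false;

    while (*r->pos && isspace((unsigned char)*r->pos))
        r->pos++;
    if (!*r->pos)
        return false;

    if (*r->pos == '(') {
        /* Copy everything up to (but not including) the closing ')'. */
        while (*r->pos && *r->pos != ')') {
            if (len + 1 < bufsize)
                buf[len++] = *r->pos;
            r->pos++;
        }
        if (*r->pos == ')')
            r->pos++;
    } else {
        /* Copy an operator or operand up to whitespace or a string start. */
        while (*r->pos && !isspace((unsigned char)*r->pos) && *r->pos != '(') {
            if (len + 1 < bufsize)
                buf[len++] = *r->pos;
            r->pos++;
        }
    }

    buf[len] = '\0';
    return true;
}

/* Join the text of every literal string with spaces; returns the count. */
static size_t extract_text(const char *stream, char *out, size_t outsize) {
    token_reader_t reader = { stream };
    char token[256];
    size_t count = 0;

    if (outsize == 0)
        return 0;
    out[0] = '\0';

    while (next_token(&reader, token, sizeof(token))) {
        if (token[0] != '(')
            continue;  /* skip operators and numeric operands */
        if (count > 0)
            strncat(out, " ", outsize - strlen(out) - 1);
        strncat(out, token + 1, outsize - strlen(out) - 1);
        count++;
    }
    return count;
}
```

Calling `extract_text` on the stream `BT /F1 24 Tf 100 700 Td (Hello, PDFio!) Tj ET` yields the text `Hello, PDFio!`. The caller drives the loop and only one token is held at a time, which is the property that keeps memory use flat regardless of stream size.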