Apache POI XWPF
Java API for Word OOXML Documents
Open Source solution to Create, Read, Edit and Convert Microsoft Word DOCX files in Java applications.
What is Apache POI XWPF?
Apache POI XWPF provides the functionality to read & write Microsoft Word 2007 DOCX file format. XWPF has a fairly stable core API, providing access to the main parts of a Word DOCX file. It can be used for basic & specific text extraction, manipulation of header & footer, text manipulation & styling features.
Apache POI XWPF is more renowned for Microsoft Word file generation & document editing, formatting of text & paragraphs, image insertion, table creation & parsing, mail merge features, management of form elements, and much more.
Getting Started with Apache POI XWPF
First of all, you need to have the Java Development Kit (JDK) installed on your system. If you already have it then proceed to the Apache POI's download page to get the latest stable release in an archive. Extract the contents of the ZIP file in any directory from where the required libraries can be linked to your Java program. That is all!
Referencing Apache POI in your Maven-based Java project is even simpler. All you need is to add the following dependency in your pom.xml and let your IDE fetch and reference the Apache POI Jar files.
Apache POI Maven Dependency
<!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>4.1.0</version>
</dependency>
Generate & Edit Word Documents using Java API
Apache POI XWPF enables the software programmers to create new Word Documents in DOCX file format. Developers can also load an existing Microsoft Word DOCX file to edit it according to their application needs. It allows you to add new paragraphs, insert text, apply text alignment & borders, change text styling, and more.
Generate a DOCX file from scratch
// initialize a blank document
XWPFDocument document = new XWPFDocument();
// create a new file
FileOutputStream out = new FileOutputStream(new File("document.docx"));
// create a new paragraph paragraph
XWPFParagraph paragraph = document.createParagraph();
XWPFRun run = paragraph.createRun();
run.setText("File Format Developer Guide - " +
"Learn about computer files that you come across in " +
"your daily work at: www.fileformat.com ");
document.write(out);
out.close();
Add Paragraph, Image & Table to Word Documents
Apache POI XWPF allows the developers to add paragraphs & images to Word documents. The API also provides the feature to add tables to DOCX documents while making it possible to create simple and nested tables with user-defined data.
Create a new DOCX file with a table
// initialize a blank document
XWPFDocument document = new XWPFDocument();
// create a new file
FileOutputStream out = new FileOutputStream(new File("table.docx"));
// create a new table
XWPFTable table = document.createTable();
// create first row
XWPFTableRow tableRowOne = table.getRow(0);
tableRowOne.getCell(0).setText("Serial No");
tableRowOne.addNewTableCell().setText("Products");
tableRowOne.addNewTableCell().setText("Formats");
// create second row
XWPFTableRow tableRowTwo = table.createRow();
tableRowTwo.getCell(0).setText("1");
tableRowTwo.getCell(1).setText("Apache POI XWPF");
tableRowTwo.getCell(2).setText("DOCX, HTML, FO, TXT, PDF");
// create third row
XWPFTableRow tableRowThree = table.createRow();
tableRowThree.getCell(0).setText("2");
tableRowThree.getCell(1).setText("Apache POI HWPF");
tableRowThree.getCell(2).setText("DOC, HTML, FO, TXT");
document.write(out);
out.close();
Extract Text from Word OOXML Document
Apache POI XWPF provides the specialized class to extract data from Microsoft Word DOCX documents with just a few lines of code. In the same way, it can also extract headings, footnotes, table data, and so on from a Word file.
Extract text from a Word file
// load DOCX file
FileInputStream fis = new FileInputStream("document.docx");
// open file
XWPFDocument file = new XWPFDocument(OPCPackage.open(fis));
// read text
XWPFWordExtractor ext = new XWPFWordExtractor(file);
// display text
System.out.println(ext.getText());
Add Custom Header & Footer to DOCX Documents
Header and footer are an important part of the Word document as those usually contain extra information such as dates, page numbers, author's name, and footnotes, which help in keeping longer documents organized and easier to read. Apache POI XWPF allows Java developers to add custom headers and footers to Word documents.
Manage Custom Header & Footer in Word DOCX File
public class HeaderFooterTable {
public static void main(String[] args) throws IOException {
try (XWPFDocument doc = new XWPFDocument()) {
// Create a header with a 1 row, 3 column table
XWPFHeader hdr = doc.createHeader(HeaderFooterType.DEFAULT);
XWPFTable tbl = hdr.createTable(1, 3);
// Set the padding around text in the cells to 1/10th of an inch
int pad = (int) (.1 * 1440);
tbl.setCellMargins(pad, pad, pad, pad);
// Set table width to 6.5 inches in 1440ths of a point
tbl.setWidth((int) (6.5 * 1440));
CTTbl ctTbl = tbl.getCTTbl();
CTTblPr ctTblPr = ctTbl.addNewTblPr();
CTTblLayoutType layoutType = ctTblPr.addNewTblLayout();
layoutType.setType(STTblLayoutType.FIXED);
BigInteger w = new BigInteger("3120");
CTTblGrid grid = ctTbl.addNewTblGrid();
for (int i = 0; i < 3; i++) {
CTTblGridCol gridCol = grid.addNewGridCol();
gridCol.setW(w);
}
// Add paragraphs to the cells
XWPFTableRow row = tbl.getRow(0);
XWPFTableCell cell = row.getCell(0);
XWPFParagraph p = cell.getParagraphArray(0);
XWPFRun r = p.createRun();
r.setText("header left cell");
cell = row.getCell(1);
p = cell.getParagraphArray(0);
r = p.createRun();
r.setText("header center cell");
cell = row.getCell(2);
p = cell.getParagraphArray(0);
r = p.createRun();
r.setText("header right cell");
// Create a footer with a Paragraph
XWPFFooter ftr = doc.createFooter(HeaderFooterType.DEFAULT);
p = ftr.createParagraph();
r = p.createRun();
r.setText("footer text");
try (OutputStream os = new FileOutputStream(new File("headertable.docx"))) {
doc.write(os);
}
}
}
}