Saturday, October 18, 2025

Java Program to Extract Paragraphs From a Word Document

This article demonstrates how to extract paragraphs from a Word document using the getParagraphs() method of the XWPFDocument class provided by the Apache POI package. Apache POI is a project developed and maintained by the Apache Software Foundation that provides Java libraries for reading and writing Microsoft Office files.

To extract paragraphs from a Word file, the Apache POI OOXML library must be available on the classpath, along with the libraries it depends on:

poi-ooxml.jar

Approach

  1. Formulate the path of the Word document.
  2. Create a FileInputStream for that path and an XWPFDocument object from the stream.
  3. Retrieve the list of paragraphs using the getParagraphs() method.
  4. Iterate through the list of paragraphs and print the text of each one.
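
The first step of the approach can be sketched with the standard java.nio.file API. This is a minimal illustration, not part of Apache POI: the class name DocxPathDemo and the method resolveDocument are hypothetical, and the file name WordFile.docx is taken from the example in this article. Checking that the file exists before opening it gives a clearer message than the exception FileInputStream would throw.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class for illustration; not part of Apache POI.
class DocxPathDemo {

    // Builds the absolute, normalized path <baseDir>/<fileName>
    // using the platform's own separator.
    static Path resolveDocument(String baseDir, String fileName) {
        return Paths.get(baseDir).resolve(fileName).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // "user.dir" is the current working directory of the JVM.
        Path docx = resolveDocument(System.getProperty("user.dir"), "WordFile.docx");

        // Report whether the document is present before trying to open it.
        if (Files.isRegularFile(docx)) {
            System.out.println("Found document: " + docx);
        } else {
            System.out.println("No document at: " + docx);
        }
    }
}
```

The resulting Path can be handed to FileInputStream through docx.toFile() or docx.toString().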

Implementation

  • Step 1: Getting the path of the current working directory, where the Word document is located.
  • Step 2: Creating a FileInputStream for the above-specified path.
  • Step 3: Creating a document object for the Word document.
  • Step 4: Using the getParagraphs() method to retrieve the list of paragraphs from the Word file.
  • Step 5: Iterating through the list of paragraphs.
  • Step 6: Printing the text of each paragraph.
  • Step 7: Closing the opened resources.
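
Word documents may contain empty paragraphs that are used only for spacing, and Steps 5 and 6 print those as blank text. The sketch below shows one way to skip them. It assumes the paragraph texts have already been collected into a List<String> (for example from getText() on each paragraph); the class ParagraphTextFilter and the method nonBlank are hypothetical names, not part of Apache POI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class for illustration; not part of Apache POI.
class ParagraphTextFilter {

    // Returns the texts that contain at least one non-whitespace
    // character, preserving their original order.
    static List<String> nonBlank(List<String> paragraphTexts) {
        List<String> kept = new ArrayList<>();
        for (String text : paragraphTexts) {
            if (text != null && !text.trim().isEmpty()) {
                kept.add(text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in data; in the real program these strings would come
        // from the paragraphs of the Word document.
        List<String> texts =
            Arrays.asList("First paragraph", "", "   ", "Second paragraph");

        for (String text : nonBlank(texts)) {
            System.out.println(text);
        }
    }
}
```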

Sample Input

The content of the Word document (WordFile.docx) is as follows:

[Image: content of the sample Word document]

Example

// Java program to extract paragraphs from a Word document

// Importing IO classes for basic file handling
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;

// Importing Apache POI classes
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

// Main class to extract paragraphs from a Word document
public class GFG {

    // Main driver method
    public static void main(String[] args) throws IOException
    {

        // Step 1: Getting path of the current working
        // directory where the Word document is located
        String path = System.getProperty("user.dir");
        path = path + File.separator + "WordFile.docx";

        // Steps 2 and 3: Opening an input stream on the file
        // and creating a document object from it.
        // Step 7 is handled by try-with-resources, which
        // closes the document and the stream even if an
        // exception is thrown.
        try (FileInputStream fin = new FileInputStream(path);
             XWPFDocument document = new XWPFDocument(fin)) {

            // Step 4: Using the getParagraphs() method to
            // retrieve the list of paragraphs from the Word
            // file.
            List<XWPFParagraph> paragraphs
                = document.getParagraphs();

            // Step 5: Iterating through the list of paragraphs
            for (XWPFParagraph para : paragraphs) {

                // Step 6: Printing the text of the paragraph,
                // followed by a blank line
                System.out.println(para.getText() + "\n");
            }
        }
    }
}
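
To see roughly what getParagraphs() abstracts away, the sketch below reads paragraph text using only the JDK. It relies on the fact that a .docx file is a ZIP archive whose main body is stored in word/document.xml, with paragraphs as w:p elements and text as w:t elements. This is a swapped-in technique for illustration, not how Apache POI is implemented, and the class name RawDocxParagraphs is hypothetical. As a simplification it collects every w:p element in the file, including those inside tables, and ignores tabs, line breaks and other non-text content.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical class for illustration; not part of Apache POI.
class RawDocxParagraphs {

    // WordprocessingML namespace of the w:p and w:t elements.
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of every w:p element in word/document.xml.
    static List<String> read(String docxPath) throws Exception {
        List<String> result = new ArrayList<>();
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("word/document.xml not found in " + docxPath);
            }

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DTDs so that untrusted files cannot trigger
            // external entity resolution.
            factory.setFeature(
                "http://apache.org/xml/features/disallow-doctype-decl", true);

            try (InputStream in = zip.getInputStream(entry)) {
                Document xml = factory.newDocumentBuilder().parse(in);
                NodeList paragraphs = xml.getElementsByTagNameNS(W_NS, "p");
                for (int i = 0; i < paragraphs.getLength(); i++) {
                    Element p = (Element) paragraphs.item(i);
                    // Concatenate the text runs of this paragraph.
                    NodeList texts = p.getElementsByTagNameNS(W_NS, "t");
                    StringBuilder sb = new StringBuilder();
                    for (int j = 0; j < texts.getLength(); j++) {
                        sb.append(texts.item(j).getTextContent());
                    }
                    result.add(sb.toString());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // The path is passed as the first command-line argument.
        for (String text : read(args[0])) {
            System.out.println(text + "\n");
        }
    }
}
```

For real documents the Apache POI version above remains the better choice, since it also handles the parts of the format that this sketch skips.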


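Output

The program prints the text of each paragraph in WordFile.docx, with a blank line after each paragraph.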

Dominic