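This program extracts an image from a PDF using Java by rendering a page of the document and saving the result as an image file. It depends on an external JAR file (Apache PDFBox), which must be added to the classpath so that its classes can be imported. The implementation is given below.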
Algorithm:
- Use the Apache PDFBox library to read the PDF.
- Load the existing PDF document using file I/O.
- Create an object of the PDFRenderer class.
- Render a page of the PDF document into a BufferedImage.
- Write the rendered image to a new file.
- Close the document.
Note: The external PDFBox JAR files must be downloaded and added to the classpath before this program can be compiled and run. For more details on the module, refer to the Apache PDFBox documentation.
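Because the rendering step turns a page into a raster image, the size of the output depends on the resolution used. The following is a rough sketch of that arithmetic, assuming that PDF page sizes are measured in points (1/72 inch) and that PDFRenderer.renderImage(pageIndex) renders at 72 DPI. The class RenderSize, the method toPixels, and the A4 page size are hypothetical choices made for illustration, and PDFBox's own rounding may differ by a pixel.

```java
// Hypothetical helper, not part of PDFBox: estimates the pixel size of a
// rendered page from its size in PDF points (1 point = 1/72 inch).
class RenderSize {
    static final double POINTS_PER_INCH = 72.0;

    // Converts a length in points to whole pixels at the given resolution.
    static int toPixels(double points, double dpi) {
        return (int) Math.floor(points * dpi / POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        // Assumed page size: A4 rounded to 595 x 842 points.
        double widthPt = 595;
        double heightPt = 842;

        for (double dpi : new double[] {72, 150, 300}) {
            System.out.println((int) dpi + " DPI -> "
                + toPixels(widthPt, dpi) + " x "
                + toPixels(heightPt, dpi) + " px");
        }
    }
}
```

If the default output looks too coarse, PDFBox also provides renderImageWithDPI(pageIndex, dpi), which can be used in place of renderImage(pageIndex) to request a higher resolution.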
Implementation:
Java
// Extracting Images from a PDF using Java
import java.io.File;
import java.awt.image.BufferedImage;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

class GFG {
    public static void main(String[] args) throws Exception
    {
        // Load the existing PDF document using file I/O
        // (PDDocument.load is the PDFBox 2.x API;
        // PDFBox 3.x uses Loader.loadPDF instead)
        File newFile = new File("C:/Documents/Lazyroar.pdf");
        PDDocument pdfDocument = PDDocument.load(newFile);

        // Instantiate the PDFRenderer class,
        // i.e. create its object
        PDFRenderer pdfRenderer = new PDFRenderer(pdfDocument);

        // Render the first page (index 0) of the
        // PDF document into a BufferedImage
        BufferedImage img = pdfRenderer.renderImage(0);

        // Write the rendered image to a new file; the
        // format name "PNG" matches the .png extension
        ImageIO.write(img, "PNG",
                      new File("C:/Documents/Lazyroar.png"));
        System.out.println("Image has been extracted successfully");

        // Close the PDF document
        pdfDocument.close();
    }
}
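One detail of the writing step is worth checking: ImageIO.write chooses the encoder from its format-name argument, not from the extension of the target file, so the two should agree. The sketch below illustrates this with the standard library only, using a blank BufferedImage as a stand-in for the rendered page. The class ImageFormatCheck and its methods are hypothetical names introduced for this example.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

class ImageFormatCheck {
    // Encodes the image in memory using the given ImageIO format name.
    // Returns null if no writer is registered for that format.
    static byte[] encode(BufferedImage img, String formatName) {
        // Keep the encoded data in memory instead of a temporary cache file
        ImageIO.setUseCache(false);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            boolean written = ImageIO.write(img, formatName, out);
            return written ? out.toByteArray() : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // PNG data starts with the bytes 0x89 'P' 'N' 'G'.
    static boolean looksLikePng(byte[] data) {
        return data != null && data.length >= 4
            && (data[0] & 0xFF) == 0x89
            && data[1] == 'P' && data[2] == 'N' && data[3] == 'G';
    }

    public static void main(String[] args) {
        // Blank RGB image standing in for the rendered PDF page
        BufferedImage img =
            new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);

        System.out.println("\"PNG\" produces PNG data: "
            + looksLikePng(encode(img, "PNG")));
        System.out.println("\"JPEG\" produces PNG data: "
            + looksLikePng(encode(img, "JPEG")));
        System.out.println("Unknown format was written: "
            + (encode(img, "no-such-format") != null));
    }
}
```

ImageIO.write also returns false when no writer exists for the requested format, so checking its return value is a cheap way to notice that nothing was saved.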
PDF before execution:
Image after extraction: