Package org.apache.pdfbox.multipdf
Class Splitter
java.lang.Object
org.apache.pdfbox.multipdf.Splitter
Split a document into several other documents.
-
Nested Class Summary
Nested Classes -
Field Summary
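Fields
Modifier and Type | Field
private Map<COSDictionary, COSDictionary> | annotDictMap
private PDDocument | currentDestinationDocument
private int | currentPageNumber
private List<PDDocument> | destinationDocuments
private Map<PDPageDestination, PDPage> | destToFixMap
private int | endPage
private static final org.apache.commons.logging.Log | LOG
private Map<COSDictionary, COSDictionary> | pageDictMap
private PDDocument | sourceDocument
private int | splitLength
private int | startPage
private Map<COSDictionary, COSDictionary> | structDictMap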
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
private void | cloneIDTree(PDStructureTreeRoot srcStructTree, PDStructureTreeRoot destStructTree) |
private void | cloneRoleMap(PDStructureTreeRoot srcStructTree, PDStructureTreeRoot destStructTree) |
private void | cloneStructureTree(PDDocument destinationDocument) | Clone the structure tree from the source to the current destination document.
private void | cloneTreeElement(Map<Integer, COSObjectable> srcNumberTreeAsMap, Map<Integer, COSObjectable> dstNumberTreeAsMap, int sp) |
protected PDDocument | createNewDocument() | Create a new document to write the split contents to.
private void | createNewDocumentIfNecessary() | Helper method for creating new documents at the appropriate pages.
private void | fixDestinations(PDDocument destinationDocument) | Replace the page destinations, if the source and destination pages are in the target document.
protected final PDDocument | getDestinationDocument() | The current destination PDF document.
protected final PDDocument | getSourceDocument() | The source PDF document.
RandomAccessStreamCache.StreamCacheCreateFunction | getStreamCacheCreateFunction() |
private void | processAnnotations(PDPage imported) | Clone all annotations because of changes possibly made, and because the structure tree is cloned.
protected void | processPage(PDPage page) | Interface to start processing a new page.
private void | processPages() | Interface method to handle the start of the page processing.
private void | processResources(PDResources res, Map<Integer, COSObjectable> srcNumberTreeAsMap, Map<Integer, COSObjectable> dstNumberTreeAsMap, Set<COSDictionary> visited) |
void | setEndPage(int end) | This will set the end page.
void | setSplitAtPage(int split) | This will tell the splitting algorithm where to split the pages.
void | setStartPage(int start) | This will set the start page.
void | setStreamCacheCreateFunction(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction) | Set the current function to be used to create an instance of stream cache.
List<PDDocument> | split(PDDocument document) | This will take a document and split into several other documents.
protected boolean | splitAtPage(int pageNumber) | Check if it is necessary to create a new document.
-
Field Details
-
LOG
private static final org.apache.commons.logging.Log LOG -
sourceDocument
-
currentDestinationDocument
-
splitLength
private int splitLength -
startPage
private int startPage -
endPage
private int endPage -
destinationDocuments
-
pageDictMap
-
structDictMap
-
annotDictMap
-
destToFixMap
-
idSet
-
roleSet
-
currentPageNumber
private int currentPageNumber -
streamCacheCreateFunction
-
-
Constructor Details
-
Splitter
public Splitter()
-
-
Method Details
-
getStreamCacheCreateFunction
- Returns:
- the current function to be used to create an instance of stream cache.
-
setStreamCacheCreateFunction
public void setStreamCacheCreateFunction(RandomAccessStreamCache.StreamCacheCreateFunction streamCacheCreateFunction)
Set the current function to be used to create an instance of stream cache.
- Parameters:
streamCacheCreateFunction - the current function to be used to create an instance of stream cache.
-
split
This will take a document and split it into several other documents.
- Parameters:
document - The document to split.
- Returns:
- A list of all the split documents. These should all be saved before closing any documents, including the source document. Any further operations should be made after reloading them, to avoid problems due to resource sharing. For the same reason, they should not be saved with encryption.
- Throws:
IOException - If there is an I/O error.
-
fixDestinations
Replace the page destinations, if the source and destination pages are in the target document. This must be called after all pages (and their annotations) are processed.
- Parameters:
destinationDocument -
-
cloneStructureTree
Clone the structure tree from the source to the current destination document.
- Parameters:
destinationDocument -
- Throws:
IOException
-
cloneIDTree
private void cloneIDTree(PDStructureTreeRoot srcStructTree, PDStructureTreeRoot destStructTree) throws IOException
- Throws:
IOException
-
cloneRoleMap
-
cloneTreeElement
private void cloneTreeElement(Map<Integer, COSObjectable> srcNumberTreeAsMap, Map<Integer, COSObjectable> dstNumberTreeAsMap, int sp) -
processResources
private void processResources(PDResources res, Map<Integer, COSObjectable> srcNumberTreeAsMap, Map<Integer, COSObjectable> dstNumberTreeAsMap, Set<COSDictionary> visited) throws IOException
- Throws:
IOException
-
setSplitAtPage
public void setSplitAtPage(int split)
This will tell the splitting algorithm where to split the pages. The default is 1, so every page becomes a new document. If it is 2, each document contains 2 pages. A source document with 5 pages would then be split into 3 new documents: 2 documents containing 2 pages each and 1 document containing one page.
- Parameters:
split - The number of pages each split document should contain.
- Throws:
IllegalArgumentException - if the number of pages is smaller than one.
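The page-count arithmetic described above can be checked with a small standalone model. This is only a sketch: the class and method names (SplitSizes, chunkSizes) are hypothetical, the code does not call PDFBox, and it assumes the whole source document is split from its first page.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone model of the split arithmetic; hypothetical names, no PDFBox calls.
class SplitSizes {
    // Returns the page count of each output document for a source with
    // totalPages pages that is split every splitLength pages.
    static List<Integer> chunkSizes(int totalPages, int splitLength) {
        if (splitLength < 1) {
            // Mirrors the documented IllegalArgumentException for values below one.
            throw new IllegalArgumentException("split length must be at least 1");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = totalPages; remaining > 0; remaining -= splitLength) {
            // The last document receives whatever pages are left over.
            sizes.add(Math.min(splitLength, remaining));
        }
        return sizes;
    }

    public static void main(String[] args) {
        // The 5-page source split at 2 pages, as in the description above.
        System.out.println(chunkSizes(5, 2));
    }
}
```

With the default split length of 1, the same model yields one single-page entry per source page.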
-
setStartPage
public void setStartPage(int start)
This will set the start page.
- Parameters:
start - the 1-based start page
- Throws:
IllegalArgumentException - if the start page is smaller than one.
-
setEndPage
public void setEndPage(int end)
This will set the end page.
- Parameters:
end - the 1-based end page
- Throws:
IllegalArgumentException - if the end page is smaller than one.
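As a rough illustration of how 1-based start and end pages select a range, the following standalone sketch lists the pages that would be considered. It does not call PDFBox. The names (PageRange, selectedPages) are hypothetical, and it assumes both bounds are inclusive and that an end page beyond the last page is simply clamped; neither assumption is stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone model of a 1-based page range; hypothetical names, no PDFBox calls.
class PageRange {
    // Returns the 1-based page numbers in [start, end], limited to totalPages.
    // Assumption: both bounds are inclusive.
    static List<Integer> selectedPages(int totalPages, int start, int end) {
        if (start < 1) {
            throw new IllegalArgumentException("start page must be at least 1");
        }
        if (end < 1) {
            throw new IllegalArgumentException("end page must be at least 1");
        }
        List<Integer> pages = new ArrayList<>();
        for (int page = start; page <= Math.min(end, totalPages); page++) {
            pages.add(page);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Pages 3 through 5 of a 10-page document.
        System.out.println(selectedPages(10, 3, 5));
    }
}
```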
-
processPages
Interface method to handle the start of the page processing.
- Throws:
IOException - If an IO error occurs.
-
createNewDocumentIfNecessary
Helper method for creating new documents at the appropriate pages.
- Throws:
IOException - If there is an error creating the new document.
-
splitAtPage
protected boolean splitAtPage(int pageNumber)
Check if it is necessary to create a new document. By default a split occurs at every page. To split based on more complex logic, override this method. For example:
    protected boolean splitAtPage(int pageNumber) {
        // split only at pages with prime numbers
        return isPrime(pageNumber);
    }
- Parameters:
pageNumber - the 0-based page number to be checked as splitting page
- Returns:
- true if a new document should be created.
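The override example calls isPrime, which is not part of PDFBox or the JDK. A minimal sketch of such a helper, under the assumption that it receives the same 0-based page number as splitAtPage, might look like this (the class name PrimePages is hypothetical):

```java
// Hypothetical helper for the splitAtPage override example; not part of PDFBox.
class PrimePages {
    // Trial division: returns true when n is a prime number.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Lists the 0-based page numbers for which the override would return true.
        for (int pageNumber = 0; pageNumber < 12; pageNumber++) {
            if (isPrime(pageNumber)) {
                System.out.println("split at page number " + pageNumber);
            }
        }
    }
}
```

Because the page number is 0-based, a subclass that wants to split on prime 1-based page numbers would need to pass pageNumber + 1 instead.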
-
createNewDocument
Create a new document to write the split contents to.
- Returns:
- the newly created PDDocument.
- Throws:
IOException - If there is a problem creating the new document.
-
processPage
Interface to start processing a new page.
- Parameters:
page - The page that is about to get processed.
- Throws:
IOException - If there is an error creating the new document.
-
processAnnotations
Clone all annotations because of changes possibly made, and because the structure tree is cloned.
- Parameters:
imported -
- Throws:
IOException
-
getSourceDocument
The source PDF document.
- Returns:
- the PDF to be split
-
getDestinationDocument
The current destination PDF document.
- Returns:
- the current destination PDF
-