Class PDFStreamParser

java.lang.Object
org.apache.pdfbox.pdfparser.BaseParser
org.apache.pdfbox.pdfparser.PDFStreamParser

public class PDFStreamParser extends BaseParser
Parses a PDF byte stream and extracts its tokens (operands and operators).
Author:
Ben Litchfield
  • Constructor Details

    • PDFStreamParser

      public PDFStreamParser(InputStream stream, RandomAccess raf, boolean forceParsing) throws IOException
      Constructor that takes a stream to parse.
      Parameters:
      stream - The stream to read data from.
      raf - The random access file.
      forceParsing - if true, skip malformed or otherwise unparseable input where possible
      Throws:
      IOException - If there is an error reading from the stream.
      Since:
      Apache PDFBox 1.3.0
    • PDFStreamParser

      public PDFStreamParser(InputStream stream, RandomAccess raf) throws IOException
      Constructor that takes a stream to parse.
      Parameters:
      stream - The stream to read data from.
      raf - The random access file.
      Throws:
      IOException - If there is an error reading from the stream.
    • PDFStreamParser

      public PDFStreamParser(PDStream stream) throws IOException
      Constructor that takes a PDStream to parse.
      Parameters:
      stream - The stream to parse.
      Throws:
      IOException - If there is an error initializing the stream.
    • PDFStreamParser

      public PDFStreamParser(COSStream stream, boolean forceParsing) throws IOException
      Constructor that takes a COSStream to parse.
      Parameters:
      stream - The stream to parse.
      forceParsing - if true, skip malformed or otherwise unparseable input where possible
      Throws:
      IOException - If there is an error initializing the stream.
      Since:
      Apache PDFBox 1.3.0
    • PDFStreamParser

      public PDFStreamParser(COSStream stream) throws IOException
      Constructor that takes a COSStream to parse.
      Parameters:
      stream - The stream to parse.
      Throws:
      IOException - If there is an error initializing the stream.
  • Method Details

    • parse

      public void parse() throws IOException
      Parses all tokens in the stream, then closes the stream when parsing is finished.
      Throws:
      IOException - If there is an error while parsing the stream.
    • getTokens

      public List<Object> getTokens()
      Returns the tokens that were parsed from the stream.
      Returns:
      All of the tokens in the stream.
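      The returned list is flat. PDF content streams use postfix notation, so the operands of an instruction come before its operator. The sketch below groups such a flat list into instructions. It is illustrative only and does not use PDFBox: Op, Instruction, and group are hypothetical names, and it assumes operator tokens can be told apart from operands by type (in PDFBox they would presumably be PDFOperator instances, which this page does not state).

```java
import java.util.ArrayList;
import java.util.List;

public class TokenGrouping {
    /** Hypothetical stand-in for an operator token; not a PDFBox class. */
    public static final class Op {
        public final String name;
        public Op(String name) { this.name = name; }
    }

    /** One operator together with the operands that preceded it. */
    public static final class Instruction {
        public final String operator;
        public final List<Object> operands;
        public Instruction(String operator, List<Object> operands) {
            this.operator = operator;
            this.operands = operands;
        }
    }

    /** Groups a flat postfix token list into operator/operand instructions. */
    public static List<Instruction> group(List<Object> tokens) {
        List<Instruction> result = new ArrayList<>();
        List<Object> pending = new ArrayList<>();
        for (Object token : tokens) {
            if (token instanceof Op) {
                // An operator closes the current instruction.
                result.add(new Instruction(((Op) token).name, pending));
                pending = new ArrayList<>();
            } else {
                // Anything else is an operand of the next operator.
                pending.add(token);
            }
        }
        // Trailing operands with no operator are dropped.
        return result;
    }
}
```

      With real parser output, the instanceof check would test for the library's operator type instead of the hypothetical Op.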
    • close

      public void close() throws IOException
      Closes the underlying pdfSource object.
      Throws:
      IOException - If there is an error releasing resources.
    • getTokenIterator

      public Iterator<Object> getTokenIterator()
      Returns an iterator that parses the stream one token at a time.
      Returns:
      An iterator that yields the tokens one after the other.
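      Parsing one token at a time means the next token is read only when the caller asks for it. The sketch below shows that on-demand pattern with standard-library classes only. It is not the PDFBox implementation: LazyTokenIterator is a hypothetical name, and a whitespace-splitting Scanner stands in for the real PDF tokenizer.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Scanner;

public class LazyTokenIterator implements Iterator<Object> {
    private final Scanner source; // hypothetical stand-in for the PDF source
    private Object next;          // token read ahead, or null if none is buffered

    public LazyTokenIterator(String text) {
        this.source = new Scanner(text);
    }

    @Override
    public boolean hasNext() {
        // Read exactly one token, and only when the caller asks for it.
        if (next == null && source.hasNext()) {
            next = source.next();
        }
        return next != null;
    }

    @Override
    public Object next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        Object token = next;
        next = null; // hand the token out once
        return token;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException("tokens are read-only");
    }
}
```

      A caller can stop iterating early, in which case the rest of the input is never tokenized.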
    • readOperator

      protected String readOperator() throws IOException
      Reads an operator from the stream.
      Returns:
      The operator that was read from the stream.
      Throws:
      IOException - If there is an error reading from the stream.
    • clearResources

      public void clearResources()
      Releases all resources in use.
      Overrides:
      clearResources in class BaseParser