iText is a free and open source library for creating and manipulating PDF files in Java. It has been ported to the .NET Framework under the name iTextSharp, which is written in C# and has a separate codebase that is kept in sync with iText releases. iText focuses on automation: it is meant to be called from your own code rather than used interactively.
When to use iText?
Typically, iText is used in projects that have one of the following requirements:
- The content isn’t available in advance: it’s calculated based on user input or real-time database information.
- The PDF files can’t be produced manually due to the massive volume of content: a large number of pages or documents.
- Documents need to be created in unattended mode, in a batch process.
- The content needs to be customized or personalized; for instance, the name of the end user has to be stamped on a number of pages.
iText is a PDF library
iText is an API that was developed to allow developers to do the following (and much more):
- Generate documents and reports based on data from an XML file or a database
- Create maps and books, exploiting numerous interactive features available in PDF
- Add bookmarks, page numbers, watermarks, and other features to existing PDF documents
- Split or concatenate pages from existing PDF files
- Fill out interactive forms
- Serve dynamically generated or manipulated PDF documents to a web browser
iText is not an end-user tool. You have to build iText into your own applications so that you can automate the PDF creation and manipulation process.
Creating PDFs
Using the Document and the PdfWriter class, you can create PDF documents from scratch, with content taken from a database, an XML file, or any other data source. You can do this in three different ways:
- Using high-level objects such as Chunk, Phrase, Paragraph, List, and so on. These objects are often referred to as iText’s basic building blocks.
- Using low-level functionality. This is done with PdfContentByte, a class whose methods map to the operators and operands available in Adobe’s imaging model. This class also has numerous convenience methods to draw arcs, circles, rectangles, and text at absolute positions.
- Using PdfGraphics2D, which is iText’s implementation of the abstract Graphics2D class in Java (not available in iTextSharp).
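The first two approaches can be combined in one document. The following is a minimal sketch, assuming iText 5.x (the com.itextpdf.text packages) is on the classpath; the class name CreatePdfSketch, the file name, and the coordinates are illustrative choices, not part of iText.

```java
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Element;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.Phrase;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfWriter;

import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical example class; assumes iText 5.x package names.
public class CreatePdfSketch {

    public static void create(String dest) throws IOException, DocumentException {
        Document document = new Document(); // default page size
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(dest));
        document.open();

        // High-level building block: iText decides where the text goes.
        document.add(new Paragraph("Hello from a high-level Paragraph."));

        // Low-level access: draw a rectangle and place text at absolute coordinates.
        PdfContentByte canvas = writer.getDirectContent();
        canvas.rectangle(36, 36, 200, 100); // x, y, width, height in points
        canvas.stroke();
        ColumnText.showTextAligned(canvas, Element.ALIGN_LEFT,
                new Phrase("Text at an absolute position"), 46, 80, 0);

        document.close(); // finishes the PDF and closes the output stream
    }

    public static void main(String[] args) throws Exception {
        create("created.pdf");
    }
}
```

Running main should leave a one-page created.pdf in the working directory, with the paragraph near the top of the page and the rectangle near the bottom left corner.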
iText ships with a plethora of classes that support encryption, different image types, color spaces, and fonts. There’s functionality to enhance the accessibility of the PDF file, support for the integration of Flash apps into the PDF, and so on.
iText can convert an XML or an HTML file to PDF, but only on a very basic level. Converting documents from one format to another is outside the scope of iText. And no: iText does not convert Word documents to PDF!
Updating PDFs
You always need a PdfReader instance to access an existing document. You can use this reader in combination with PdfStamper to stamp extra content on the existing PDF document: page numbers, a watermark, annotations, and so on. PdfStamper is also the class you’ll use to fill out interactive forms. iText has almost complete support for AcroForms, but as soon as you have a form involving the XML Forms Architecture, the possibilities are limited.
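As an illustration of the PdfReader plus PdfStamper combination, here is a minimal sketch that stamps a page counter on every page. It assumes iText 5.x on the classpath; the class name StampSketch, the footer position, and the form field name in the comment are hypothetical.

```java
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Element;
import com.itextpdf.text.Phrase;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical example class; assumes iText 5.x package names.
public class StampSketch {

    public static void stampPageNumbers(String src, String dest)
            throws IOException, DocumentException {
        PdfReader reader = new PdfReader(src);
        PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
        int total = reader.getNumberOfPages();

        for (int page = 1; page <= total; page++) { // page numbers start at 1
            // The "over" layer is drawn on top of the existing page content.
            PdfContentByte over = stamper.getOverContent(page);
            float centerX = reader.getPageSize(page).getWidth() / 2;
            ColumnText.showTextAligned(over, Element.ALIGN_CENTER,
                    new Phrase(String.format("Page %d of %d", page, total)),
                    centerX, 20, 0);
        }

        // For an AcroForm, fields could be filled here instead, for example:
        // stamper.getAcroFields().setField("name", "Jane Doe"); // hypothetical field name

        stamper.close(); // writes the stamped document to dest
        reader.close();
    }
}
```

The source file is left untouched; the stamped result is written to a separate destination file.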
You can split and merge PDF documents with PdfCopy, PdfSmartCopy, PdfCopyFields, and even using PdfImportedPage objects in combination with PdfWriter or PdfStamper.
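Merging with PdfCopy might look like the following minimal sketch. It assumes iText 5.x on the classpath; the class name MergeSketch and its merge method are illustrative. Splitting would follow the same pattern, copying only a subset of the pages into each output file.

```java
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.pdf.PdfCopy;
import com.itextpdf.text.pdf.PdfReader;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

// Hypothetical example class; assumes iText 5.x package names.
public class MergeSketch {

    public static void merge(List<String> sources, String dest)
            throws IOException, DocumentException {
        Document document = new Document();
        PdfCopy copy = new PdfCopy(document, new FileOutputStream(dest));
        document.open();

        for (String src : sources) {
            PdfReader reader = new PdfReader(src);
            int pages = reader.getNumberOfPages();
            for (int page = 1; page <= pages; page++) {
                // Import each page of the source and append it to the output.
                copy.addPage(copy.getImportedPage(reader, page));
            }
            copy.freeReader(reader); // release the reader's resources once its pages are copied
            reader.close();
        }

        document.close(); // finishes the merged file
    }
}
```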
Converting a PDF document to another format is outside the scope of iText, but you can convert a PDF to XML if the PDF was tagged and contains a structure tree. Depending on how the PDF was created, you can also extract plain text from a page.
iText can also be used to sign existing PDF documents, as well as to encrypt them.
Reading PDFs
iText isn’t a PDF viewer: it can’t convert a PDF to an image, and it can’t be used to print a PDF. However, the PdfReader class gives you access to the objects that form a PDF document and to the content stream of each page. This content stream can be parsed, and if the content wasn’t added as rasterized text, you can convert a page to plain text. Note that iText doesn’t do OCR.
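Plain-text extraction can be sketched as follows, assuming iText 5.x and its com.itextpdf.text.pdf.parser package are on the classpath; the class name ExtractTextSketch is illustrative. How readable the result is depends on how the PDF was produced, and scanned pages will yield no text.

```java
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

import java.io.IOException;

// Hypothetical example class; assumes iText 5.x package names.
public class ExtractTextSketch {

    public static String extract(String src) throws IOException {
        PdfReader reader = new PdfReader(src);
        try {
            StringBuilder text = new StringBuilder();
            for (int page = 1; page <= reader.getNumberOfPages(); page++) {
                // Parses the page's content stream and returns the text it finds.
                text.append(PdfTextExtractor.getTextFromPage(reader, page)).append('\n');
            }
            return text.toString();
        } finally {
            reader.close();
        }
    }
}
```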