PDF::API2 - Create, modify, and examine PDF files
use PDF::API2;
# Create a blank PDF file
$pdf = PDF::API2->new();
# Open an existing PDF file
$pdf = PDF::API2->open('some.pdf');
# Add a blank page
$page = $pdf->page();
# Retrieve an existing page
$page = $pdf->open_page($page_number);
# Set the page size
$page->size('Letter');
# Add a built-in font to the PDF
$font = $pdf->font('Helvetica-Bold');
# Add an external TrueType font to the PDF
$font = $pdf->font('/path/to/font.ttf');
# Add some text to the page
$text = $page->text();
$text->font($font, 20);
$text->position(200, 700);
$text->text('Hello World!');
# Save the PDF
$pdf->save('/path/to/new.pdf');
my $pdf = PDF::API2->new(%options);
Create a new PDF.
The following options are available:
- file
If you will be saving the PDF to disk and already know the filename, you can
include it here to open the file for writing immediately. file may also be
a filehandle.
- compress
By default, most of the PDF will be compressed to save space. To turn this off
(generally only useful for testing or debugging), set compress to 0.
my $pdf = PDF::API2->open('/path/to/file.pdf', %options);
Open an existing PDF file.
The following option is available:
- compress
By default, most of the PDF will be compressed to save space. To turn this off
(generally only useful for testing or debugging), set compress to 0.
$pdf->save('/path/to/file.pdf');
Write the PDF to disk and close the file. A filename is optional if one was
specified while opening or creating the PDF.
As a side effect, the document structure is removed from memory when the file is
saved, so it will no longer be usable.
$pdf->close();
Close an open file (if relevant) and remove the object structure from memory.
PDF::API2 contains circular references, so this call is necessary in
long-running processes to keep from running out of memory.
This will be called automatically when you save or stringify a PDF.
You should only need to call it explicitly if you are reading PDF
files and not writing them.
my $pdf = PDF::API2->from_string($pdf_string, %options);
Read a PDF document contained in a string.
The following option is available:
- compress
By default, most of the PDF will be compressed to save space. To turn this off
(generally only useful for testing or debugging), set compress to 0.
my $string = $pdf->to_string();
Return the PDF document as a string.
As a side effect, the document structure is removed from memory when the string
is created, so it will no longer be usable.
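As a rough illustration of the round trip between to_string and from_string, the sketch below builds a small document in memory, serializes it, and parses the result again. It assumes a PDF::API2 release recent enough to provide the method names documented here; the variable names are arbitrary.

```perl
use strict;
use warnings;
use PDF::API2;

# Build a two-page document entirely in memory.
my $pdf = PDF::API2->new();
$pdf->page() for 1 .. 2;

# to_string() serializes the PDF and releases the object structure,
# so $pdf must not be used after this call.
my $bytes = $pdf->to_string();

# Parse the serialized bytes back into a fresh object.
my $copy  = PDF::API2->from_string($bytes);
my $count = $copy->page_count();
print "Pages after round trip: $count\n";

# This object is only read, never saved, so release it explicitly.
$copy->close();
```

Because the copy is never saved or stringified, close is called by hand, as recommended for read-only use.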
$title = $pdf->title();
$pdf = $pdf->title($title);
Get/set/clear the document's title.
$author = $pdf->author();
$pdf = $pdf->author($author);
Get/set/clear the name of the person who created the document.
$subject = $pdf->subject();
$pdf = $pdf->subject($subject);
Get/set/clear the subject of the document.
$keywords = $pdf->keywords();
$pdf = $pdf->keywords($keywords);
Get/set/clear a space-separated string of keywords associated with the document.
$creator = $pdf->creator();
$pdf = $pdf->creator($creator);
Get/set/clear the name of the product that created the document prior to its
conversion to PDF.
$producer = $pdf->producer();
$pdf = $pdf->producer($producer);
Get/set/clear the name of the product that converted the original document to
PDF.
PDF::API2 fills in this field when creating a PDF.
$date = $pdf->created();
$pdf = $pdf->created($date);
Get/set/clear the document's creation date.
The date format is D:YYYYMMDDHHmmSSOHH'mm, where D: is a static prefix
identifying the string as a PDF date. The date may be truncated at any point
after the year. O is one of +, -, or Z, with the following HH'mm
representing an offset from UTC.
When setting the date, D: will be prepended automatically if omitted.
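One way to produce a string in this format is with strftime from the core POSIX module. The helper name pdf_date_utc below is hypothetical (it is not part of PDF::API2), and the sketch assumes that a UTC timestamp ending at the Z marker, with the trailing offset truncated, is acceptable for your purposes.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format an epoch time as a PDF date string in UTC.
sub pdf_date_utc {
    my ($epoch) = @_;

    # gmtime() yields UTC fields, so the offset marker is a literal "Z".
    return strftime("D:%Y%m%d%H%M%SZ", gmtime($epoch));
}

my $date = pdf_date_utc(0);
print "$date\n";    # prints D:19700101000000Z
```

The result could then be passed to the date accessors, for example $pdf->created(pdf_date_utc(time)).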
$date = $pdf->modified();
$pdf = $pdf->modified($date);
Get/set/clear the document's modification date. The date format is as described
in created above.
# Get all keys and values
%info = $pdf->info_metadata();
# Get the value of one key
$value = $pdf->info_metadata($key);
# Set the value of one key
$pdf = $pdf->info_metadata($key, $value);
Get/set/clear a key in the document's information dictionary. The standard keys
(title, author, etc.) have their own accessors, so this is primarily intended
for interacting with custom metadata.
Pass undef as the value in order to remove the key from the dictionary.
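A minimal sketch of setting, reading, and removing a custom key follows. The key name Department and its value are invented for illustration, and the sketch assumes a PDF::API2 release that provides info_metadata and the title accessor as documented here.

```perl
use strict;
use warnings;
use PDF::API2;

my $pdf = PDF::API2->new();

# Standard keys have dedicated accessors.
$pdf->title('Quarterly Report');

# 'Department' is a made-up custom key used only for illustration.
$pdf->info_metadata('Department', 'Finance');
my $department = $pdf->info_metadata('Department');

# Passing undef as the value removes the key again.
$pdf->info_metadata('Department', undef);
my %info = $pdf->info_metadata();
my $still_present = exists $info{'Department'} ? 1 : 0;

print "Department was: $department\n";
print "Present after removal: $still_present\n";

$pdf->close();
```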
$xml = $pdf->xml_metadata();
$pdf = $pdf->xml_metadata($xml);
Get/set the document's XML metadata stream.
$version = $pdf->version($new_version);
Get/set the PDF version (e.g. 1.4).
$boolean = $pdf->is_encrypted();
Returns true if the opened PDF is encrypted.
$outline = $pdf->outline();
Creates (if needed) and returns the document's outline tree, which is also known
as its bookmarks or the table of contents, depending on the PDF reader.
To examine or modify the outline tree, see the PDF::API2::Outline manpage.
$pdf = $pdf->open_action($page, $location, @args);
Set the destination in the PDF that should be displayed when the document is
opened.
$page may be either a page number or a page object. The other parameters are
as described in the PDF::API2::NamedDestination manpage.
$layout = $pdf->page_layout();
$pdf = $pdf->page_layout($layout);
Get/set the page layout that should be used when the PDF is opened.
$layout is one of the following:
- single_page (or undef)
Display one page at a time.
- one_column
Display the pages in one column (a.k.a. continuous).
- two_column_left
Display the pages in two columns, with odd-numbered pages on the left.
- two_column_right
Display the pages in two columns, with odd-numbered pages on the right.
- two_page_left
Display two pages at a time, with odd-numbered pages on the left.
- two_page_right
Display two pages at a time, with odd-numbered pages on the right.
# Get
$mode = $pdf->page_mode();
# Set
$pdf = $pdf->page_mode($mode);
Get/set the page mode, which describes how the PDF should be displayed when
opened.
$mode is one of the following:
# Get
%preferences = $pdf->viewer_preferences();
# Set
$pdf = $pdf->viewer_preferences(%preferences);
Get or set PDF viewer preferences, as described in
the PDF::API2::ViewerPreferences manpage.
# Add a page to the end of the document
$page = $pdf->page();
# Insert a page before the specified page number
$page = $pdf->page($page_number);
Returns a new page object. By default, the page is added to the end
of the document. If you include an existing page number, the new page
will be inserted in that position, pushing existing pages back.
If $page_number is -1, the new page is inserted as the second-last page; if
$page_number is 0, the new page is inserted as the last page.
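The insertion behavior can be sketched as follows. This is only an illustration under the assumption that page and page_count behave as described here; the variable names are arbitrary.

```perl
use strict;
use warnings;
use PDF::API2;

my $pdf = PDF::API2->new();

# Start with three pages appended in order.
$pdf->page() for 1 .. 3;

# Insert a new page at position 1; the existing pages are pushed back.
my $cover = $pdf->page(1);

my $count = $pdf->page_count();
print "Total pages: $count\n";

$pdf->close();
```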
$page = $pdf->open_page($page_number);
Returns the PDF::API2::Page object for page $page_number, if it exists.
If $page_number is 0 or -1, it will return the last page in the document.
$page = $pdf->import_page($source_pdf, $source_page_num, $target_page_num);
Imports a page from $source_pdf and adds it to the specified position in
$pdf.
If $source_page_num or $target_page_num is 0 or -1, the last page in the
document is used.
Note: If you pass a page object instead of a page number for
$target_page_num, the contents of the page will be merged into the existing
page.
Example:
my $pdf = PDF::API2->new();
my $source = PDF::API2->open('source.pdf');
# Add page 2 from the source PDF as page 1 of the new PDF
my $page = $pdf->import_page($source, 2);
$pdf->save('sample.pdf');
Note: You can only import a page from an existing PDF file.
$xobject = $pdf->embed_page($source_pdf, $source_page_number);
Returns a Form XObject created by extracting the specified page from
$source_pdf.
This is useful if you want to place the imported page onto another page at a
different position or scale (e.g. two-up, four-up, etc.).
If $source_page_number is 0 or -1, it will return the last page in the document.
Example:
my $pdf = PDF::API2->new();
my $source = PDF::API2->open('source.pdf');
my $page = $pdf->page();
# Import Page 2 from the source PDF
my $object = $pdf->embed_page($source, 2);
# Add it to the new PDF's first page at 1/2 scale
my ($x, $y) = (0, 0);
$page->object($object, $x, $y, 0.5);
$pdf->save('sample.pdf');
Note: You can only import a page from an existing PDF file.
$integer = $pdf->page_count();
Return the number of pages in the document.
$pdf = $pdf->page_labels($page_number, %options);
Describes how pages should be numbered beginning at the specified page number.
# Generate a 30-page PDF
my $pdf = PDF::API2->new();
$pdf->page() for 1..30;
# Number pages i to v, 1 to 20, and A-1 to A-5, respectively
$pdf->page_labels(1, style => 'roman');
$pdf->page_labels(6, style => 'decimal');
$pdf->page_labels(26, style => 'decimal', prefix => 'A-');
$pdf->save('sample.pdf');
The following options are available:
# Set
$pdf->default_page_size($size);
# Get
@rectangle = $pdf->default_page_size();
Set the default physical size for pages in the PDF. If called without
arguments, return the coordinates of the rectangle describing the default
physical page size.
See Page Sizes in the PDF::API2::Page manpage for possible values.
# Set
$pdf->default_page_boundaries(%boundaries);
# Get
%boundaries = $pdf->default_page_boundaries();
Set default prepress page boundaries for pages in the PDF. If called without
arguments, returns the coordinates of the rectangles describing each of the
supported page boundaries.
See the equivalent page_boundaries method in the PDF::API2::Page manpage for details.
my $font = $pdf->font($name, %options);
Add a font to the PDF. Returns the font object, to be used by
the PDF::API2::Content manpage.
The font $name is either the name of one of the standard 14 fonts (e.g. Helvetica) or
the path to a font file.
my $pdf = PDF::API2->new();
my $font1 = $pdf->font('Helvetica-Bold');
my $font2 = $pdf->font('/path/to/ComicSans.ttf');
my $page = $pdf->page();
my $content = $page->text();
$content->position(1 * 72, 9 * 72);
$content->font($font1, 24);
$content->text('Hello, World!');
$content->position(0, -36);
$content->font($font2, 12);
$content->text('This is some sample text.');
$pdf->save('sample.pdf');
The path can be omitted if the font file is in the current directory or one of
the directories returned by font_path.
TrueType (ttf/otf), Adobe PostScript Type 1 (pfa/pfb), and Adobe Glyph Bitmap
Distribution Format (bdf) fonts are supported.
The following %options are available:
- format
The font format is normally detected automatically based on the file's
extension. If you're using a font with an atypical extension, you can set
format to one of truetype (TrueType or OpenType), type1 (PostScript
Type 1), or bitmap (Adobe Bitmap).
- kerning
Kerning (automatic adjustment of space between pairs of characters) is enabled
by default if the font includes this information. Set this option to false to
disable.
- afm_file (PostScript Type 1 fonts only)
Specifies the location of the font metrics file.
- pfm_file (PostScript Type 1 fonts only)
Specifies the location of the printer font metrics file. This option overrides
the -encode option.
- embed (TrueType fonts only)
Fonts are embedded in the PDF by default, which is required to ensure that they
can be viewed properly on a device that doesn't have the font installed. Set
this option to false to prevent the font from being embedded.
$font = $pdf->synthetic_font($base_font, %options);
Create and return a new synthetic font object. See
the PDF::API2::Resource::Font::SynFont manpage for details.
@directories = PDF::API2->font_path();
Return the list of directories that will be searched (in order) in addition to
the current directory when you add a font to a PDF without including the full
path to the font file.
@directories = PDF::API2->add_to_font_path('/my/fonts', '/path/to/fonts');
Add one or more directories to the list of paths to be searched for font files.
Returns the font search path.
@directories = PDF::API2->set_font_path('/my/fonts', '/path/to/fonts');
Replace the existing font search path. This should only be necessary if you
need to remove a directory from the path for some reason, or if you need to
reorder the list.
Returns the font search path.
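The following sketch shows one way the font search path might be extended and inspected. The directory /opt/example/fonts is a placeholder chosen for illustration, not a location that PDF::API2 knows about.

```perl
use strict;
use warnings;
use PDF::API2;

# Placeholder directory used only for illustration.
my $extra = '/opt/example/fonts';

# add_to_font_path() returns the updated search path.
my @path = PDF::API2->add_to_font_path($extra);

# Count how many entries in the returned path match the new directory.
my $registered = grep { $_ eq $extra } @path;

print "Font search path:\n";
print "  $_\n" for @path;
```

Once a directory is on the search path, a font file stored there can be passed to font by file name alone.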
$object = $pdf->image($file, %options);
Import a supported image type and return an object that can be placed as part of
a page's content:
my $pdf = PDF::API2->new();
my $page = $pdf->page();
my $image = $pdf->image('/path/to/image.jpg');
$page->object($image, 100, 100);
$pdf->save('sample.pdf');
$file may be a file name, a filehandle, or a GD::Image object.
See place in the PDF::API2::Content manpage for details about placing images on a page
once they're imported.
The image format is normally detected automatically based on the file's
extension. If passed a filehandle, image formats GIF, JPEG, and PNG will be
detected based on the file's header.
If the file has an atypical extension or the filehandle is for a different kind
of image, you can set the format option to one of the supported types:
gif, jpeg, png, pnm, or tiff.
Note: PNG images that include an alpha (transparency) channel go through a
relatively slow process of splitting the image into separate RGB and alpha
components as is required by images in PDFs. If you're having performance
issues, install PDF::API2::XS or Image::PNG::Libpng to speed this process up by
an order of magnitude; either module will be used automatically if available.
$object = $pdf->barcode($format, $code, %options);
Generate and return a barcode that can be placed as part of a page's content:
my $pdf = PDF::API2->new();
my $page = $pdf->page();
my $barcode = $pdf->barcode('ean13', '0123456789012');
$page->object($barcode, 100, 100);
$pdf->save('sample.pdf');
$format can be one of codabar, code128, code39 (a.k.a. 3 of 9),
ean128, ean13, or itf (a.k.a. interleaved 2 of 5).
$code is the value to be encoded. Start and stop characters are only
required when they're not static (e.g. for Codabar).
The following options are available:
- bar_width
The width of the smallest bar or space in points (72 points = 1 inch).
If you're following a specification that gives bar width in mils (thousandths of
an inch), use this conversion: $points = $mils / 1000 * 72.
- bar_height
The base height of the barcode in points.
- bar_extend
If present, bars for non-printing characters (e.g. start and stop characters)
will be extended downward by this many points, and printing characters will be
shown below their respective bars.
This is enabled by default for EAN-13 barcodes.
- caption
If present, this value will be printed, centered, beneath the barcode, and
should be a human-readable representation of the barcode.
- font
A font object (created by font) that will be used to print the caption, or
the printable characters when bar_extend is set.
Helvetica will be used by default.
- font_size
The size of the font used for printing the caption or printable characters.
The default will be calculated based on the barcode size, if bar_extend is
set, or 10 otherwise.
- quiet_zone
A margin, in points, that will be placed before the left and bottom edges of the
barcode (including the caption, if present). This is used to help barcode
scanners tell where the barcode begins and ends.
The default is the width of one encoded character.
- bar_overflow
Shrinks the horizontal width of bars by this amount in points to account for ink
spread when printing.
The default is 0.01 points.
- color
Draw bars using this color, which may be any value accepted by
fillcolor in the PDF::API2::Content manpage.
The default is black.
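The mils-to-points conversion given for bar_width can be wrapped in a small helper. The function name mils_to_points is hypothetical (it is not part of PDF::API2), and the 13 mil value is just a sample input.

```perl
use strict;
use warnings;

# Hypothetical helper: convert mils (thousandths of an inch) to points.
sub mils_to_points {
    my ($mils) = @_;
    return $mils / 1000 * 72;    # 72 points = 1 inch
}

# Sample input: a bar width of 13 mils.
my $bar_width = mils_to_points(13);
printf "bar_width in points: %.3f\n", $bar_width;    # prints 0.936
```

The resulting value could then be supplied as the bar_width option, e.g. $pdf->barcode($format, $code, bar_width => $bar_width).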
$colorspace = $pdf->colorspace($type, @arguments);
Colorspaces can be added to a PDF to either specifically control the output
color on a particular device (spot colors, device colors) or to save space by
limiting the available colors to a defined color palette (web-safe palette, ACT
file).
Once added to the PDF, they can be used in place of regular hex codes or named
colors:
my $pdf = PDF::API2->new();
my $page = $pdf->page();
my $content = $page->graphics();
# Add colorspaces for a spot color and the web-safe color palette
my $spot = $pdf->colorspace('spot', 'PANTONE Red 032 C', '#EF3340');
my $web = $pdf->colorspace('web');
# Fill using the spot color with 100% coverage
$content->fill_color($spot, 1.0);
# Stroke using the first color of the web-safe palette
$content->stroke_color($web, 0);
# Add a rectangle to the page
$content->rectangle(100, 100, 200, 200);
$content->paint();
$pdf->save('sample.pdf');
The following types of colorspaces are supported:
- spot
my $spot = $pdf->colorspace('spot', $tint, $alt_color);
Spot colors are used to instruct a device (usually a printer) to use or emulate
a particular ink color ($tint) for parts of the document. An $alt_color
is provided for devices (e.g. PDF viewers) that don't know how to produce the
named color. It can either be an approximation of the color in RGB, CMYK, or
HSV formats, or a wildly different color (e.g. 100% magenta, %0F00) to make
it clear if the spot color isn't being used as expected.
- web
my $web = $pdf->colorspace('web');
The web-safe color palette is a historical collection of colors that was used
when many display devices only supported 256 colors.
- act
my $act = $pdf->colorspace('act', $filename);
An Adobe Color Table (ACT) file provides a custom palette of colors that can be
referenced by PDF graphics and text drawing commands.
- device
my $devicen = $pdf->colorspace('device', @colorspaces);
A device-specific colorspace allows for precise color output on a given device
(typically a printing press), bypassing the normal color interpretation
performed by raster image processors (RIPs).
Device colorspaces are also needed if you want to blend spot colors:
my $pdf = PDF::API2->new();
my $page = $pdf->page();
my $content = $page->graphics();
# Create a two-color device colorspace
my $yellow = $pdf->colorspace('spot', 'Yellow', '%00F0');
my $spot = $pdf->colorspace('spot', 'PANTONE Red 032 C', '#EF3340');
my $device = $pdf->colorspace('device', $yellow, $spot);
# Fill using a blend of 25% yellow and 75% spot color
$content->fill_color($device, 0.25, 0.75);
# Stroke using 100% spot color
$content->stroke_color($device, 0, 1);
# Add a rectangle to the page
$content->rectangle(100, 100, 200, 200);
$content->paint();
$pdf->save('sample.pdf');
$resource = $pdf->egstate();
Creates and returns a new extended graphics state object, described in
the PDF::API2::ExtGState manpage.
Code written using PDF::API2 should continue to work unchanged for the life of
most long-term-stable (LTS) server distributions. Specifically, it should
continue working for versions of Perl that were released within the
past five years (the typical support window for LTS releases) plus six months
(allowing plenty of time for package freezes prior to release).
In PDF::API2, method names, options, and functionality change over time.
Functionality that's documented (not just in source code comments) should
continue working for the same time period of five years and six months, though
deprecation warnings may be added. There may be exceptions if your code happens
to rely on bugs that get fixed, including when a method in PDF::API2 is changed
to more closely follow the PDF specification.
Occasional breaking changes may be unavoidable or deemed small enough in scope
to be worth the benefit of making the change instead of keeping the old
behavior. These will be noted in the Changes file as items beginning with the
phrase "Breaking Change".
Undocumented features, unreleased code, features marked as experimental, and
underlying data structures may change at any time. An exception is for features
that were previously released and documented, which should continue to work for
the above time period after the documentation is removed.
Before migrating to a new LTS server version, it's recommended that you upgrade
to the latest version of PDF::API2, use warnings, and check your server logs
for deprecation messages after exercising your code. Once these are resolved,
it should be safe to use future PDF::API2 releases during that LTS support
window.
If your code uses a PDF::API2 method that isn't documented here, it has probably
been deprecated. Search for it in the Migration section below to find its
replacement.
Use this section to bring your existing code up to date with current method
names and options. If you're not getting a deprecation warning, this is
optional, but still recommended.
For example, in cases where a method was simply renamed, the old name will be
set up as an alias for the new one, which can be maintained indefinitely. The
main benefit of switching to the new name is to make it easier to find the
appropriate documentation when you need it.
- new(-compress => 0)
- new(-file => $filename)
Remove the hyphen from the option names.
- new() with any options other than compress or file
Replace with calls to INTERACTIVE FEATURE METHODS. See the deprecated
preferences method for particular option names.
- finishobjects
- saveas
- update
Replace with save.
- end
- release
Replace with close.
- open_scalar
- openScalar
Replace with from_string.
- stringify
Replace with to_string.
- info
Each of the hash keys now has its own accessor. See METADATA METHODS.
For custom keys or if you prefer to give the key names as variables (e.g. as
part of a loop), use info_metadata.
- infoMetaAttributes
Use info_metadata without arguments to get a list of currently-set keys in
the Info dictionary (including any custom keys). This is slightly different
behavior from calling infoMetaAttributes without arguments, which always
returns the standard key names and any defined custom key names, whether or not
they're present in the PDF.
Calling infoMetaAttributes with arguments defines the list of Info keys that
are supported by the deprecated info method. You can now just call
info_metadata with a standard or custom key and value.
- xmpMetadata
Replace with xml_metadata. Note that, when called with an argument,
xml_metadata returns the PDF object rather than the value, to line up with
most other PDF::API2 accessors.
- isEncrypted
Replace with is_encrypted.
- outlines
Replace with outline.
- preferences
This functionality has been split into a few methods, aligning more closely with
the underlying PDF structure. See the documentation for each of the methods for
revised option names.
- -fullscreen, -thumbs, -outlines
Call page_mode.
- -singlepage, -onecolumn, -twocolumnleft, -twocolumnright
Call page_layout.
- -hidetoolbar, -hidemenubar, -hidewindowui, -fitwindow, -centerwindow,
-displaytitle, -righttoleft, -afterfullscreenthumbs, -afterfullscreenoutlines,
-printscalingnone, -simplex, -duplexfliplongedge, -duplexflipshortedge
Call viewer_preferences.
- -firstpage
Call open_action.
- openpage
Replace with open_page.
- importpage
Replace with import_page.
- importPageIntoForm
Replace with embed_page.
- pages
Replace with page_count.
- pageLabel
Replace with page_labels. Remove hyphens from the argument names. Add
style => 'decimal' if there wasn't a -style argument.
- mediabox
- cropbox
- bleedbox
- trimbox
- artbox
Replace with default_page_boundaries. If using page size aliases
(e.g. "letter" or "A4"), check to ensure that the alias is still supported
(you'll get an error if it isn't).
- synfont
Replace with synthetic_font.
- addFontDirs
Replace with add_to_font_path.
- corefont
Replace with font. Note that font requires that the font name be an
exact, case-sensitive match. The full list can be found in
STANDARD FONTS in the PDF::API2::Resource::Font::CoreFont manpage.
- ttfont
Replace with font. Replace -noembed => 1 with embed => 0.
- bdfont
Replace with font.
- psfont
Replace with font. Rename options -afmfile and -pfmfile to
afm_file and pfm_file.
Note that Adobe has announced that their products no longer support PostScript
Type 1 fonts, effective early 2023. They recommend using TrueType or OpenType
fonts instead.
- cjkfont
- unifont
These are old methods from back when Unicode was still new and poorly supported.
Replace them with calls to font using a TrueType or OpenType font that has
the characters you need.
If you're successfully using one of these two methods and feel they shouldn't be
deprecated, please contact me with your use case.
- image_gd
- image_gif
- image_jpeg
- image_png
- image_pnm
- image_tiff
Replace with image.
- xo_code128
- xo_codabar
- xo_2of5int
- xo_3of9
- xo_ean13
Replace with barcode. Replace arguments as follows:
- colorspace_act
- colorspace_web
- colorspace_separation
- colorspace_devicen
Replace with colorspace.
- colorspace_hue
This is deprecated because I wasn't able to find a corresponding standard.
Please contact me if you're using it, to avoid having it be removed in a future
release.
- default
The optional changes in default behavior have all been deprecated.
Replace pageencaps with calls to save and restore when embedding or
superimposing a page onto another, if needed.
nounrotate and copyannots will continue to work until better options are
available, but should not be used in new code.
PDF::API2 is developed and maintained by Steve Simms, with patches from numerous
contributors who are credited in the Changes file.
It was originally written by Alfred Reibenschuh, extending code written
by Martin Hosken.
This program is free software: you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License as published by the Free
Software Foundation, either version 2.1 of the License, or (at your option) any
later version.
This library is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.