Creating PDFs
Trying to shove tables
onto a sheet of paper since '88
Scott Steinbeck
- Software Developer
- 20+ years of experience
- Father
- Hardware Tinkerer
Hobbies
- Coding in my free time
- Overcommitting myself
- Automating everything
- IoT Development
- Teaching Coding/Electronics
- Contributing to open source
Creating Excel
@cfsimplicity Julian Halliwell
Back to the show
State of the cfdocument
If only they had more experience with PDFs...
Reality...
Missing, limited, or undocumented support for...
Modern CSS support
Limited Header/Footer Customization
JS Support
Page Break Issues
Table Border Styles
Custom Font Support
Context-aware chapter titles
Page numbering
Bookmarks
Table of Contents
Internal Links
Image Paths
Repeating Table Headers
Orientation change within the PDF
Alternative solutions that provide additional features
- wkhtmltopdf (executable)
- PhantomJS (headless browser, executable). PhantomJS development is suspended until further notice.
- Puppeteer (executable -> Puppeteer CLI)
- Playwright (written for Node, wrapped for Java)
- html2canvas (https://html2canvas.hertzen.com/): JavaScript library that renders HTML into a canvas element on the fly and converts it to an image
Notable mentions
- Apache PDFBox (not from Ortus)
- Prince PDF ($500 / server licence)
Introducing Flying Saucer
Lucee 5.3+
Introducing Flying Saucer
Bryce (1999)
Searching for docs....
HTML/CSS Support
The Adobe CFML cfdocument tag can render HTML that supports the following standards:
HTML 4.01
XML 1.0
DOM Level 1 and 2
CSS1 and CSS2 (For more information, see the "Supported CSS styles" section).
The Lucee CFML cfdocument tag can render HTML that supports the following standards:
HTML 4.01
XML 1.0
DOM Level 1 and 2
CSS 2.1 (CSS 3 support in development)
The magic is found in the CSS support
Orientation change within the PDF
/* Use A4 paper */
@page { size: A4 }
/* Use A4 paper in landscape orientation */
@page { size: A4 landscape }
/* These two custom sizes are equivalent */
@page { size: 30cm 40cm }
@page { size: 40cm 30cm landscape }
/* Use square paper, this sets width and height */
@page { size: 30cm }
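A minimal sketch of how these size rules might be combined with a named page to switch orientation partway through a document. The page name `wide` and the class `.wide-section` are hypothetical, chosen only for illustration; the `page` property is the same mechanism used with `@page docpage` in the page numbering example.

```css
/* Default pages: portrait A4 */
@page {
  size: A4 portrait;
}

/* Named page (hypothetical name "wide"): landscape A4 */
@page wide {
  size: A4 landscape;
}

/* Any element assigned to the named page starts on a new landscape page;
   content after it returns to the default portrait page. */
.wide-section {
  page: wide;
}
```

Wrapping an oversized table in an element with the `.wide-section` class would then place just that table on landscape pages.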
Header/Footer Customization
Name | Default text-align | Default vertical-align | In figure |
---|---|---|---|
@top | center | middle | yellow |
@bottom | center | middle | yellow |
@left | center | middle | red |
@right | center | middle | red |
@top-left | left | middle | green |
@page {
  @bottom {
    content: counter(page);
  }
}
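As a rough illustration of how several margin boxes can be filled at once, the sketch below puts a fixed title in the top-left box and a page number in the bottom-right box. The title text, margin, and font sizes are placeholder assumptions, not values from the talk.

```css
@page {
  margin: 2cm;

  /* Left-aligned by default, per the table above */
  @top-left {
    content: "Quarterly Report"; /* placeholder text */
    font-size: 9pt;
  }

  /* Current page number in the bottom-right corner */
  @bottom-right {
    content: "Page " counter(page);
    font-size: 9pt;
  }
}
```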
Page Break (before, after, inside)
page-break-before:
auto | always | avoid
page-break-after:
auto | always | avoid
page-break-inside:
auto | avoid
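A short sketch of how the three properties are typically combined. The selectors (`h1.chapter`, `.keep-together`) are hypothetical and would need to match the markup actually being rendered.

```css
/* Start every chapter on a fresh page */
h1.chapter {
  page-break-before: always;
}

/* Try not to leave a heading stranded at the bottom of a page */
h2, h3 {
  page-break-after: avoid;
}

/* Try to keep table rows and marked blocks in one piece */
tr, .keep-together {
  page-break-inside: avoid;
}
```

Note that `avoid` is a request rather than a guarantee: a block taller than the page still has to break somewhere.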
Table Styles
table {
  border-collapse: collapse;
  table-layout: fixed;
}
border-collapse: collapse merges adjacent cell borders into a single line; table-layout: fixed sizes columns from the table and column widths instead of from the cell contents
Custom Font Support
<!--- fontDirectory points at the folder that contains Calibri.ttf --->
<cfdocument fontDirectory="path/to/fonts">
  <style type="text/css">
    body {
      font-family: Calibri, sans-serif;
    }
  </style>
</cfdocument>
Note: The modern (Flying Saucer) engine matches fonts by the font family name stored in the .ttf file, and the match is case-sensitive.
Note 2: Foreign-language fonts may cause problems.
Context-aware chapter titles
Copying content from the document
Generated content in page regions may contain text content copied from the document using the string-set property:
@page {
  @top {
    content: string(doctitle);
  }
}
h1 { string-set: doctitle content(); }
Page numbering
@page docpage {
  size: 8.5in 11in;
  margin: .4in .5in .75in .5in;
  @bottom-left {
    font-family: Arial, Helvetica, sans-serif;
    font-size: 11px;
    content: "<cfoutput>#prc.invoiceTitle#</cfoutput>";
  }
  @bottom-right {
    font-family: Arial, Helvetica, sans-serif;
    font-size: 11px;
    content: "Page " counter(page) " of " counter(pages);
  }
}
.normalpage {
  page: docpage;
}
Bookmarks
document bookmark=true {}
<a href="#anchor-id">What is ColdFusion (CFML)</a>
<h2 id="anchor-id" class="h1 page">
What is ColdFusion (CFML)
</h2>
Table of Contents
ul.toc a::after {
content: leader(".") target-counter(attr(href), page);
}
Image Paths
document localurl=true
<img
align="center"
style="max-width:100%"
src="/path/to/local_file.png"
/>
<img
align="center"
style="max-width:100%"
src="http(s)://yourimageurl.png"
/>
With localUrl=true, ColdFusion retrieves image files directly from the local drive rather than over HTTP, HTTPS, or a proxy.
Repeating Table Headers
table {
  -fs-table-paginate: paginate;
}
modifies the table layout algorithm to repeat table headers and footers on subsequent pages
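A slightly fuller sketch of a paginated report table, assuming the markup uses `<thead>` for the header rows that should repeat. The class name `report` is hypothetical.

```css
/* Opt this table in to Flying Saucer's paginated layout */
table.report {
  -fs-table-paginate: paginate;
  width: 100%;
}

/* Header rows must live in <thead> to be repeated on each page */
table.report thead {
  display: table-header-group;
}

/* Try to keep each row on a single page */
table.report tr {
  page-break-inside: avoid;
}
```

As the property table in the next section notes, pagination does not shrink a table that is wider than the page, so column widths still need to fit the page size.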
More custom Flying Saucer CSS properties
Property | Description |
---|---|
-fs-keep-with-inline | Tries to avoid breaking a block in such a way that only padding and borders appear on a page. |
-fs-page-sequence | Limits the scope of the page and pages counters to a portion of the document. |
-fs-font-metric-src | Use inside a font-face rule when the font you want to embed in the PDF has a custom font metrics file. |
-fs-pdf-font-embed | Use with the value embed inside a font-face rule to have Flying Saucer embed a font file within a PDF document, avoiding the need to call the addFont() method of the FontResolver class. |
-fs-pdf-font-encoding | Use inside a font-face rule to specify the encoding for a custom font you are embedding inside a PDF; takes the name of the encoding as its value. |
-fs-table-cell-colspan | Whole number. Replaces use of the legacy colspan attribute for table cells. |
-fs-table-cell-rowspan | Whole number. Replaces use of the legacy rowspan attribute for table cells. |
-fs-table-paginate | When used with the value paginate, modifies the table layout algorithm to repeat table headers and footers on subsequent pages and improve the appearance of cells that break across pages (for example by closing and reopening borders), but that's all it does. If a table's minimum width is wider than the page, it will be chopped off. |
-fs-text-decoration-extent | Either line (default) or block. Controls how text decorations are drawn on a block-level element. With line, the spec-compliant behavior is used: the text decoration is drawn across the line box. With block, the text decoration is drawn across the entire content area of the block. |
https://flyingsaucerproject.github.io/flyingsaucer/r8/guide/users-guide-R8.html
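To make the two font-related properties concrete, here is a hedged sketch of a font-face rule that asks Flying Saucer to embed a font. The font file path is a placeholder, and `Identity-H` is assumed here as the encoding name because it is the one commonly used for Unicode TrueType fonts; check both against your own setup.

```css
@font-face {
  font-family: "Calibri";
  src: url("fonts/Calibri.ttf");      /* placeholder path */
  -fs-pdf-font-embed: embed;          /* embed the font file in the PDF */
  -fs-pdf-font-encoding: Identity-H;  /* assumed encoding for Unicode text */
}

body {
  font-family: "Calibri", sans-serif;
}
```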
BONUS CODE HIGHLIGHTING
Putting it all together
New Challenge
Original Export
New Export
How do we convert it?
AST
Abstract Syntax Tree
CommandBox jq (JMESPath)
AST Explorer
https://astexplorer.net/
What do we do with Markdown?
Search for a Java library to convert it
moduleSettings = {
    cbmarkdown = {
        // Looks for www or emails and converts them to links
        autoLinkUrls : true,
        // Creates anchor links for headings
        anchorLinks : true,
        // Set the anchor id
        anchorSetId : true,
        // Set the anchor id but also the name
        achorSetName : true,
        // Do we create the anchor for the full header or just before it. True is wrap, false is just create anchor tag
        anchorWrapText : false,
        // The class(es) to apply to the anchor
        anchorClass : "anchor",
        // raw html prefix. Added before heading text, wrapped or unwrapped
        anchorPrefix : "",
        // raw html suffix. Added before heading text, wrapped or unwrapped
        anchorSuffix : "",
        // Enable youtube embedded link transformer
        enableYouTubeTransformer : false,
        // override HTML to use for wrapping style.
        codeStyleHTMLOpen : '<code class="code inline">',
        // override HTML to use for wrapping style.
        codeStyleHTMLClose : '</code>',
        // add a class prefix to the "fenced" code blocks, i.e. ```js. Useful for supporting various syntax highlighters.
        fencedCodeLanguageClassPrefix : "brush",
        // Table options
        tableOptions : {
            // Treat consecutive pipes at the end of a column as defining spanning column.
            columnSpans : true,
            // Whether table body columns should be at least the number of header columns.
            appendMissingColumns : true,
            // Whether to discard body columns that are beyond what is defined in the header
            discardExtraColumns : true,
            // Class name to use on tables
            className : "table",
            // When true only tables whose header lines contain the same number of columns as the separator line will be recognized
            headerSeparationColumnMatch : true
        }
    } // end markdown settings
};
Is all parsing done with an AST?
vscode-cfml
{
  "name": "CFML",
  "scopeName": "embedding.cfml",
  "patterns": [
    {
      "begin": "(?i)(?=^\\s*(/\\*|//|import\\b|(component|abstract\\s*component|final\\s*component|interface)(\\s+|{)))",
      "contentName": "source.cfml.script",
      "end": "(?=not)possible",
      "patterns": [
        {
          "include": "#source-cfml-script"
        }
      ]
    },
    {
      "include": "#source-cfml-tag-comments"
    },
    {
      "begin": "(?i)(?=<cf(component|interface)\\b)",
      "contentName": "source.cfml",
      "end": "(?=not)possible",
      "patterns": [
        {
          "include": "#cfcomponent"
        },
        {
          "include": "#cfinterface"
        }
      ]
    },
    {
      "begin": "(?=\\S)",
      "contentName": "text.html.cfml",
Is all parsing done with an AST?
cfml-parser
var braceOpen = 0;
var semi = 0;
var quotePos = 0;
var eqPos = 0;
var lineEnd = 0;
var inString = false;
var stringOpenChar = "";
var currentStatement = "";
var currentStatementStart = 1;
var commentStatement = "";
var sb = createObject("java", "java.lang.StringBuilder");
//parsing a cfscript tag uses startPosition and endPosition
if (arguments.startPosition != 0 && arguments.endPosition != 0) {
    pos = arguments.startPosition;
    contentLength = arguments.endPosition;
}
while (pos <= contentLength) {
    c = mid(content, pos, 1);
    if (c == "'" || c == """") {
        if (inString && stringOpenChar == c) {
            if (mid(content, pos, 2) != c&c) {
                inString = false; //end string
            } else {
                //escaped string open char
                sb.append(c);
                sb.append(c);
                pos = pos+2;
                continue;
            }
        } else if (!inString) {
            inString = true;
            stringOpenChar = c;
        }
        sb.append(c);
    } else if (!inString) {
        if (c == "/" && mid(content, pos, 2) == "/*") {
            //currentState = this.STATE.COMMENT;
            commentStatement = new Comment(name="/*", startPosition=pos, parent=parent, file=arguments.file);
            if (!isSimpleValue(parent)) {
                parent.addChild(commentStatement);
            }
            endPos = find("*/", content, pos+3);
            if (endPos == 0) {
                //end of doc
                endPos = contentLength;
            }
            commentStatement.setEndPosition(endPos);
Advanced PDF Generation + Building a gitbook Markdown conversion process
By uniquetrio2000
Take a deep dive into the undocumented features of Flying Saucer (the engine behind cfdocument) in pursuit of creating a PDF with page numbering, context-aware chapter headings, table page breaks, and more. Then we will spend some time digging into Markdown conversion using the Java library Flexmark to handle many aspects of Markdown conversion, including tables, auto-linking, embedded YouTube, etc.