PDF Blog Contents Summary
This PDF Blog page contains extracts from our regular series of PDF Security Newsletters, starting with articles issued from 2022 onward.
- Delivering accessible online publications for Engineering students in schools and colleges
- Applications of Artificial Intelligence (AI) to Publishing and Training Services
- Avoiding PDF malware and phishing
- Should we all be using a small footprint secure PDF reader? Avoiding PDF malware and phishing
- PDF Knowledgebase
- Secure electronic publishing - ePUB or PDF or Web?
- Trainers, Academics and Engineers adopt cost-effective electronic self-publishing solutions
- Changing pages in PDF files (Page Insert, Delete and Extract)
- Running iOS apps on Mac computers
- Accessibility, Language and Text to Speech
- Major new release of Javelin3 secure PDF reader for MS Windows
- Personalization of PDFs
- Issuing PDFs that Expire
- PDF and secure PDF Questions and Answers: see our FAQs page
Avoiding PDF malware and phishing
Over recent years a wide range of concerns about the security of "standard" PDFs has been raised. Literally billions of PDF files are created and distributed every year and almost all are safe to view and use. However, a significant number of PDF files pose direct or indirect security risks, so in this article we look at the main ones circulating today and how to avoid being caught out.
In addition to this "PDF Security blog" page, our separate PDF Wiki Knowledgebase resource is designed to make information about PDF files and their usage readily available in a clear, user-friendly manner. There is also an emphasis on issues of PDF security in the context of content and copyright protection, whether for private individuals, businesses or governmental organizations. The knowledgebase also provides details on all Drumlin, Javelin and Webdoxx software and services.
Secure electronic publishing - ePUB or PDF or Web?
Overview: If publications are designed to be read from cover to cover, and contain little or no formatting or images, and color is not important, then ePUB is probably the best option. This is the usual choice for novels and similar books. However, if the publications contain anything more complicated, or were originally designed for print publication, or as training resource materials, then PDF may be a better choice. In either case, security of copyrighted content is paramount for most publishers.
For open-access publication of magazines, catalogs and comics, with optional access control (e.g., for subscription services), web-based publication may be the best option. Journal publications, newspapers and newsletters are also often best handled via web-based display, providing maximum reach with minimal support. Controls over copying, downloading and printing are all standard features that provide basic content security.
PDFs: One of the most important features of PDF documents is that they are defined by a page-based model - this describes how individual pages in the document are made up, in terms of the text, the fonts used, graphical objects, interactive elements and possibly other features associated with the page. This page-based model means that when you look at a PDF page on-screen or on printed output, it should always look the same and as specified by the designer.
Page-based ePUB3 and HTML5: Although standard ePUB and HTML are designed for flowable text (see further, below), the more recent variants can be used to present page-based information. These features enable fixed layouts to be used, often with color images and, in some cases, with interactive and multimedia elements. This is ideal for graphic novels, comics, magazines and books with a lot of color content (e.g., photographs, technical drawings etc.).
Flowable ePUB and HTML: Standard ePUB and HTML formats are effectively a linear stream of items, one after another, with limited "layout" elements. A major aim of the ePUB format is to allow the text to be the dominant element, re-sizable and re-flowable, ignoring the page concept and focusing on the size and orientation of the device on which it is viewed. ePUB, and its variants and versions, is the most widely used format for reading eBooks and similar documents on mobile devices, including of course Amazon Kindle, Nook and other specialized ebook reader devices.
As an indication of what this means for an existing, well-structured and formatted source document (e.g., created in Word or InDesign) see this article by Reka Oroszi. In the article the author explains that for standard ePUB documents to work properly you need to start by:
1. Removing all text formatting (styles); then
2. Removing all numbered lists, bulleted lists, page numbers, tables, tables of contents, forced line breaks, double spaces, double paragraphs, page breaks, tabs, image wrapping, footnotes and direct links to third party webstores!
Only then can you re-introduce a basic level of formatting - ideal for works of fiction and publications with very limited format requirements, but not for more complicated documents. For the latter, PDF format with Digital Rights Management (DRM) protection, or a matching web-based (HTML5) implementation is what is needed.
Our offline and web-based services support secure document publishing of both ePUB and PDF source documents. In addition, we can assist in creating suitable files from sources in other formats (e.g., Word, PowerPoint, InDesign etc.) and can advise on print-production and distribution if that forms part of the requirement.
Changing pages in PDF files
PDF modifications: In general, special PDF editing software is required for deleting pages, inserting pages and extracting pages. These facilities are now provided as standard in the latest version of our free Javelin3 for Windows PDF reader.
To use these new facilities, open any standard PDF in Javelin3 for Windows (latest versions) and right-click on a page to see the new options. These include: Delete ... , for deleting the current page or a selection of pages; Insert PDF document, enabling an entire PDF document from disk to be inserted into the current document AFTER the current page; and Extract, where one or more pages from the current PDF document can be extracted to a new, separate PDF file. After any changes have been made to the current PDF document, a prompt on leaving the document asks whether the changes should be saved or discarded. If they are discarded, the source PDF remains unchanged.
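The three page operations can be modelled on a simple list of pages. The following is a minimal Python sketch of the behaviour described above, not Javelin3's implementation; the function names and the use of plain strings to stand for pages are assumptions made for illustration.

```python
from typing import List, Sequence


def delete_pages(pages: Sequence[str], start: int, end: int) -> List[str]:
    """Return a copy of pages without the 1-based inclusive range start..end."""
    if not 1 <= start <= end <= len(pages):
        raise ValueError("page range is outside the document")
    return list(pages[:start - 1]) + list(pages[end:])


def insert_document(pages: Sequence[str], current_page: int,
                    other: Sequence[str]) -> List[str]:
    """Return a copy with every page of other placed AFTER current_page (1-based)."""
    if not 1 <= current_page <= len(pages):
        raise ValueError("current page is outside the document")
    return list(pages[:current_page]) + list(other) + list(pages[current_page:])


def extract_pages(pages: Sequence[str], start: int, end: int) -> List[str]:
    """Return pages start..end (1-based, inclusive) as a new, separate document."""
    if not 1 <= start <= end <= len(pages):
        raise ValueError("page range is outside the document")
    return list(pages[start - 1:end])


source = ["p1", "p2", "p3", "p4"]

# Delete pages 2-3, then insert a two-page document after page 1.
edited = insert_document(delete_pages(source, 2, 3), 1, ["a1", "a2"])

# Extract pages 2-3 of the original into a separate document.
extracted = extract_pages(source, 2, 3)

print(edited)
print(extracted)
print(source)
```

Each function returns a new list rather than altering its input, which mirrors the save-or-discard behaviour: until the edited result is saved, the source document is unchanged.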
Running iOS apps on Mac computers
"iPhone apps and iPad apps are available without modification on the Mac App Store on Apple silicon Macs. These apps can be optimized to work with keyboards, windows, and touch-input gestures by using existing capabilities that are already available to iPhone and iPad apps." (Apple Inc, 2023)
However, when such apps are run in this way they lack some of the security features that real iPad and iPhone devices provide. Javelin secure PDF reader for iOS, and apps based on Javelin, use the deviceID to identify the specific device on which access to secured documents is permitted. The deviceID provided by Apple in this case is a 'virtual' deviceID, not the unique physical ID of the device. This is believed to have been introduced some years ago for US government security reasons. As long as the virtual deviceID does not change, Javelin readers will retain the details of documents and their authorization status enabled on that device. However, this is not the case for emulations of iOS devices, such as those on recent Apple macOS M1 chipset devices, on which the virtual deviceID changes each time the app is run, so these do not provide the necessary security framework for the encrypted PDFs. The same is true if the iOS operating system is replaced following a fault with the device. For this reason users will not now find Javelin and related apps on the macOS App Store, but only on the main Apple iOS App Store.
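The effect of an unstable device identifier can be illustrated with a small model. This Python sketch assumes that authorizations are stored in a table keyed by device ID; the class, its methods and the sample values are hypothetical and do not describe how Javelin readers are actually implemented.

```python
import uuid
from typing import Dict, Set


class AuthorizationStore:
    """Toy model: document authorizations remembered per device ID."""

    def __init__(self) -> None:
        self._by_device: Dict[str, Set[str]] = {}

    def authorize(self, device_id: str, document: str) -> None:
        self._by_device.setdefault(device_id, set()).add(document)

    def is_authorized(self, device_id: str, document: str) -> bool:
        return document in self._by_device.get(device_id, set())


store = AuthorizationStore()

# Case 1: the same ID is reported on every run, so the record is found again.
stable_id = "device-1234"  # placeholder value
store.authorize(stable_id, "manual.pdf")
stable_ok = store.is_authorized(stable_id, "manual.pdf")

# Case 2: a fresh ID is reported on each run, so the earlier record is never found.
first_run_id = str(uuid.uuid4())
store.authorize(first_run_id, "manual.pdf")
second_run_id = str(uuid.uuid4())
changing_ok = store.is_authorized(second_run_id, "manual.pdf")

print(stable_ok, changing_ok)
```

In the second case the document was authorized, but the lookup on the next run uses a different key, so the authorization appears to be missing.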
Accessibility, Language and Text to Speech (TTS)
Overview: One of the most effective ways of providing accessibility to text-based documents is to enable them for text-to-speech (TTS) technology. Also known as "Speech Synthesis", TTS enables selected blocks of text or entire pages to be read aloud in a meaningful manner. Typically, this uses built-in features of the operating system (OS) for the device in question, and these vary in quality and functionality.
The "voice" used for TTS on different operating systems is selectable, not only by type or name but also by language. For example, reading aloud a Spanish textbook using an English voice not only sounds wrong, but it can also be impossible to understand! The solution is to ensure the language pack for the preferred language is installed on the target device and then selected as the default language and voice for TTS.
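The language-matching step can be sketched as follows. This is a minimal Python illustration, assuming each installed voice is described by a name and a language code; the Voice type, the pick_voice function and the sample voice list are hypothetical and are not part of any operating system API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Voice:
    name: str
    language: str  # language code such as "es" or "en" (assumed convention)


def pick_voice(installed: List[Voice], document_language: str) -> Optional[Voice]:
    """Return the first installed voice whose language matches, else None."""
    wanted = document_language.lower()
    for voice in installed:
        if voice.language.lower() == wanted:
            return voice
    return None  # no match: a voice for that language still needs installing


# Illustrative list of installed voices.
installed_voices = [Voice("Daniel", "en"), Voice("Jorge", "es")]

spanish = pick_voice(installed_voices, "es")
french = pick_voice(installed_voices, "fr")

print(spanish.name if spanish else "install a Spanish voice first")
print(french.name if french else "install a French voice first")
```

Returning None when nothing matches, rather than falling back to another language, reflects the point made above: a voice in the wrong language is worse than prompting the user to install the right one.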
A summary of the facilities for each main end-user OS is provided below:
macOS v13 and later - The Settings (System Preferences) facility, Accessibility/Speech option is where you will find the System Voice selector, for example "Jorge" for a Spanish, male voice speaker. Use the Customize option to select a language and speaker that is not present in the default list. Read aloud is enabled via a key sequence, e.g., Option+Esc (the Option key is often labelled "Alt"). In Javelin3 for macOS, select a block of text and press the key sequence for the text to be read aloud in the chosen default voice
Android v9 and later - on Android devices the Settings facility, General Management/Language and Input/Speech section is where TTS is defined. Select the language to use, touch the PLAY button to check the setting, and then exit. Then run Javelin3 for Android, open the document to be read aloud, and either select text for highlighting/reading aloud from the reader toolbar (highlight icon), or use the 3-dot icon on the toolbar to select the Read Page option. The text will be read using the voice just selected. The voice used may revert to the device regional settings voice next time the Javelin3 app is used, so if this occurs, simply repeat the select and PLAY check described above and it will spring into life again!
Windows 10 and 11 - the Windows Settings facility, Language option is where this is defined. Once the default language for speech is defined, selected text or pages will be spoken with the chosen voice, e.g., "Helena". If in doubt, check the Microsoft Help facilities. In Javelin3 for Windows, select a block of text and right-click to see the read aloud options (Read Selected Text by ..., Read Selected Text, or Read page) and it will be spoken with your selected voice. If a block of text is not selected, then only the Read Page option will be available. Read aloud ends when the end of the chosen text or page has been reached, or via the right-click option, "Stop speaking". The Tools menu, Settings option allows the default voice to be selected. Owing to a bug in current versions of Windows 10 and 11, only a subset of available voices for a selected language may be chosen
iOS - as with macOS, TTS is enabled via the Settings facility, Accessibility/Spoken Content options. Selection of text to be read aloud is essential for TTS to work as expected on iOS devices. Currently Javelin for iOS does not support this functionality, but it may be enabled in the future
WEB - Several web-based services use text-to-speech for audiobook provision, although professional human-read audiobooks work much better, particularly for longer non-technical texts and works of fiction
Javelin3 secure PDF reader - Major new release, v3.1
In response to a number of requests, Javelin3 for Windows has been revamped, with many enhancements, as illustrated and summarized below:
- Multi-language speech synthesis support (see above)
- Revised Home page display, with modern layout - as illustrated
- New direct Home page links to the User Guide, Help facilities and Web Resources
- New option to immediately open the last document viewed, rather than the Home page, speeding up access to documents
- Page window size and display settings, and the last page viewed, are retained even if the program is exited abnormally rather than after closing a document or using the File menu, Quit option
- Menu display issue when Windows Scaling is not 100% is now resolved (note that scaling is not recommended as it can result in poorer quality display of text and graphics)
- White Label option extended - ask us for more details
- Updated documentation and video demos
- Page modification: Insert PDF file, Delete pages and Extract PDF pages
Personalization of PDFs
Sometimes it is desirable to send someone a PDF document that is personalized in order to make it clear that it is specific to that individual, to discourage or prevent its use by others. One way to do this is to edit the source PDF so that every page includes information about the user (e.g., their full name, email address, organizational affiliation etc.) as a form of "Stamp" or Watermark (see further, our PDF Wiki article on this topic). In addition, the PDF filename can be changed to include the user's name or other user-specific identification. These changes can be applied to any PDF, with added standard PDF security or stronger DRM-enforced security. Facilities to provide such functionality are included as standard in our DrumlinPublisher software. All existing subscribing publishers can now use this facility.
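The stamp text and personalized filename described above might be assembled as in the following Python sketch. It only builds the strings; applying the stamp to each page is left to the publishing software. The function names, the stamp layout and the filename pattern are assumptions for illustration, not the format used by DrumlinPublisher.

```python
import re
from pathlib import Path


def stamp_text(full_name: str, email: str, organization: str = "") -> str:
    """Build a one-line stamp identifying the intended reader."""
    parts = [full_name.strip(), email.strip(), organization.strip()]
    return "Licensed to: " + " | ".join(p for p in parts if p)


def personalized_filename(source: str, full_name: str) -> str:
    """Append a filesystem-safe form of the user's name to the file stem."""
    safe = re.sub(r"[^A-Za-z0-9]+", "_", full_name).strip("_")
    path = Path(source)
    return f"{path.stem}_{safe}{path.suffix}"


stamp = stamp_text("Jane Doe", "jane@example.com", "Example College")
filename = personalized_filename("course_notes.pdf", "Jane Doe")

print(stamp)
print(filename)
```

Empty fields are dropped from the stamp, so a user with no organizational affiliation still gets a clean line of text.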
Issuing PDFs that Expire
The PDF standard does not include facilities to enable a PDF to be issued that will no longer be readable after a specified date. However, there are three ways in which a PDF can be expired on a specific date and/or time:
- using a Digital Rights Management (DRM) service. This type of service does provide for reliable and secure date expiry, but typically requires subscription to a suitable DRM service and use of a proprietary PDF reader. Our Drumlin DRM service and Javelin PDF readers provide this kind of facility as standard
- using a web-based secure PDF display service (like our Webdoxx service) that protects against PDF downloading or provides the PDF content in a form that looks exactly like the source PDF but the PDF itself is not hosted on the server used. In this case the PDF can be expired automatically or simply removed from the server. With this approach the end user does not require any special software to view it - just an HTML5 compliant web browser