PDF Security - the Essentials for Electronic Publishing
PDF Security is a complex topic, with three main strands: (i) Authentication; (ii) Content protection; and (iii) Digital Rights Management. In this section we discuss each briefly, but there is a vast amount of information available on all these topics. Copyright protection of PDFs requires use of a Digital Rights Management (DRM) service.
Authentication is the process of determining whether a document is from the person or organization it claims to be from and/or is correctly signed by them. The basic ideas are summarized by Adobe as follows (with Digital Rights Management systems the issue of authentication can largely be ignored, as the source and distribution of secured files is centrally managed and controlled):
PDF supports two kinds of digital signatures: approval signatures and certification signatures. Any number of approval signatures may be applied to a PDF document but only one certifying signature may be applied and it must be the first digital signature. Approval signatures are used in the same manner as the ink on paper signatures we are all familiar with. Certification signatures are considered a part of creating the PDF file so only occur once at the beginning.
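The ordering rule quoted above (any number of approval signatures, but at most one certification signature, which must come first) can be expressed as a simple check. The Python sketch below is illustrative only: the Signature type and the validate_signature_order function are hypothetical names introduced here, not part of any Adobe API, and no cryptographic verification is performed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signature:
    signer: str
    kind: str  # assumed to be either "certification" or "approval"


def validate_signature_order(signatures: List[Signature]) -> bool:
    """Return True if a certification signature appears only in first position.

    Because a certification signature is only allowed at index 0,
    this also guarantees that there is at most one of them.
    """
    for index, sig in enumerate(signatures):
        if sig.kind == "certification" and index != 0:
            return False
    return True
```

A list that starts with a certification signature and continues with approval signatures passes the check; a certification signature anywhere else fails it.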
The screenshot below provides an example of the use of Digital Signatures with the Adobe software - the provision of such information regarding signatures is not implemented in all PDF readers, so for such files use of Adobe's reader is recommended. Also notice that the first signature shown here is recorded as being by DocuSign, i.e. via a third party document signing service.
Digital certification is slightly different from applying signatures: it involves the use of an independent certification authority. Adobe approves the following certification authorities: Entrust, GlobalSign, OpenTrust and Symantec/Verisign. The screenshot below illustrates the prompt provided when using the signing tool in Adobe's Certificate Signing option:
Content protection is the most familiar option for users of PDFs. It provides for two forms of document content protection. The first is to apply a password restriction such that when a user attempts to open the document a password is requested before the content is displayed. One reason for providing such a facility in the early days was to try and limit access to PDF files on PCs that were shared or left unattended. In large measure this facility is now redundant and is not recommended for use.
The principal form of content protection provided within current standards and Adobe's implementations involves encryption of the PDF file using a user-supplied password. The password, known only to the person securing the file, is entered and selected features of PDF are then made unavailable to a person to whom the file is sent. A sample screenshot of the Adobe Acrobat security settings screen is shown below. The upper entry is for the Open password protection described above, whilst the lower section specifies the security password and permission settings required. As can be seen, the default setting is to not allow copying of the text or printing. HOWEVER, even if you set these controls they may be ignored if the resulting PDF file is opened with a non-Adobe PDF reader that ignores such settings.
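One way to see why a reader can ignore these controls is that the permission settings are recorded as flags which the viewing software is expected to honor. The Python sketch below models that idea. The bit values for print, modify and copy are assumed here from the permission flags defined in the PDF specification; the two can_copy functions are hypothetical stand-ins for a compliant and a non-compliant reader, not real reader code.

```python
from enum import IntFlag


class Permission(IntFlag):
    # Bit values assumed from the PDF specification's permission flags
    # (bits are numbered from 1, starting at the low-order bit).
    PRINT = 1 << 2   # bit 3
    MODIFY = 1 << 3  # bit 4
    COPY = 1 << 4    # bit 5


def compliant_can_copy(flags: Permission) -> bool:
    """A reader that honors the flags allows copying only if COPY is set."""
    return bool(flags & Permission.COPY)


def noncompliant_can_copy(flags: Permission) -> bool:
    """A reader that ignores the flags allows copying regardless."""
    return True


# Default settings described above: neither copying nor printing allowed.
restricted = Permission(0)
```

With the restricted flags, the compliant reader refuses to copy while the non-compliant reader does not, which is the weakness described above.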
The other big issue with applying such security is determining whether the file can be readily decrypted. The brief answer to this is "yes", unless the encryption password/key is long and complex and the encryption level is at least 128-bit AES or 256-bit AES. For more details on decryption of secured PDF files see the website of the Russian software house, Elcomsoft. The 256-bit encryption level for Adobe PDFs is not available for standard PDFs (it requires Adobe Reader 9.0 or later), so it should only be used if ALL the target PDF readers are Adobe Reader.
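The effect of password length on an exhaustive search can be estimated with simple arithmetic. The Python sketch below is a rough illustration: the function names are hypothetical, and the guess rate is a parameter you must supply yourself, since real attack speeds depend on the encryption level and the hardware used.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length


def worst_case_seconds(alphabet_size: int, length: int,
                       guesses_per_second: float) -> float:
    """Time to try every password of this length at the given guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second


# Example: compare 6- and 8-character alphanumeric passwords
# (26 lower-case + 26 upper-case + 10 digits = 62 symbols).
# The rate of one million guesses per second is an assumption.
ASSUMED_RATE = 1_000_000.0
short_time = worst_case_seconds(62, 6, ASSUMED_RATE)
long_time = worst_case_seconds(62, 8, ASSUMED_RATE)
```

Each additional character multiplies the search space by the alphabet size, which is why longer passwords matter more than the choice of attack tool.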
Content protection, including print, copy, forward, save and date-based protection, is possible using non-Adobe solutions. In most cases these facilities require alternative PDF readers; these are discussed below in the section covering Digital Rights Management.
Many PDF documents include a copyright statement at the start of the document and/or in running footers. In addition Static and Dynamic Watermarks may be added to provide a level of document security. With Adobe-style "static" watermarks many tools exist for watermark removal, so this kind of watermark only provides additional security if included in a DRM protected encrypted PDF (see further below) where the document cannot be decrypted. Dynamic watermarks are displayed when the document is viewed and use variables to identify things like the username, their device and the date and time. In this case the security provided is strong because the watermark cannot be removed.
Static watermarks are created by using a tool such as Adobe Acrobat or A-PDF to create an additional foreground or background item on some or all pages of a PDF. The screenshot below illustrates this using Adobe Acrobat XI, and as can be seen, the text entered appears on the selected page, here as a simple text string in the color, size and orientation selected. Adobe Acrobat and A-PDF are very flexible in providing options for statically added watermarks of this type. Once added, the PDF must be saved so that the additional content forms a permanent part of the PDF file itself. If this file is then protected using Adobe’s standard security facilities, preferably with a longish password (8+ alphanumeric characters), then the watermarking will have a reasonable level of protection against removal. Note that the watermark can contain any statically defined information you wish, so can be generic, e.g. “(c) My Company, 2019”, or personalized, e.g. “!This file has been issued to Mr A B Johnson of XYZ Inc – no copying of this file is permitted”. Our DrumlinPublisher software supports personalized static watermarks of this kind within a fully encrypted DRM environment (see further, below).
The second approach is the use of dynamic watermarking. As with static watermarking a dynamic watermark can contain static text, such as “(c) My Company 2019”, but that misses the real value of such facilities. The main feature of a dynamic watermark is that it includes information generated at the moment of display or printing that includes (additional) end user or other information that makes the file “unique” and identifiable.
In the example below there are both static and dynamic watermarks included. The static watermark has been added to the source document using Adobe Acrobat; in this case it has been placed at a diagonal angle across the text in such a way as to extend across the page but with minimal interference with the text. The dynamic watermark, created using DrumlinPublisher, is shown at the foot of the page, and includes information about the file displayed (the filename itself), plus information that identifies the user (via the partially displayed code), the device on which the document is displayed, the date and other information. This information is dynamically generated when the file is displayed and is overlaid onto the viewable screen window rather than embedded in the document. This means that when the page is zoomed in or out, the dynamic watermark is always displayed in front of the viewable area. This provides an added level of protection for the document against screen capture, which is now a standard feature of many operating systems and hardware devices (e.g. a built-in feature of Android and iOS mobile devices). There is also a range of server-driven tools and PDF security products that will automatically stamp or watermark PDF files that are downloaded.
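The idea of filling a watermark with variables at the moment of display can be sketched in a few lines of Python. This is a minimal illustration and not how DrumlinPublisher is implemented: the template text, the field names and the render_watermark function are all hypothetical, and the masking of the user code simply mimics the "partially displayed code" described above.

```python
from datetime import datetime
from string import Template

# Hypothetical template: fixed layout, variable values filled in at display time.
WATERMARK_TEMPLATE = Template(
    "$filename | user $user_code | device $device | $timestamp"
)


def render_watermark(filename: str, user_code: str, device: str,
                     now: datetime) -> str:
    """Build the watermark text for one display event."""
    # Show only the first four characters of the user code; mask the rest.
    masked = user_code[:4] + "*" * max(len(user_code) - 4, 0)
    return WATERMARK_TEMPLATE.substitute(
        filename=filename,
        user_code=masked,
        device=device,
        timestamp=now.strftime("%Y-%m-%d %H:%M"),
    )
```

Because the text is generated for each viewing session, two users opening the same file would see different watermarks, which is what makes a captured screen image traceable.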
Digital Rights Management
Digital Rights Management (or DRM for short) is the term applied to the protection of digital assets using a centralized rights management service. It uses the combination of several distinct elements to provide the strongest possible protection of PDF documents, i.e. protection against copyright theft, amendment, forwarding files and much more. DRM services not only protect documents at the point of viewing, but also provide facilities to track access and in some instances, withdraw access permission from the end user.
DRM systems fall broadly into two main classes: (i) Hardware-based solutions, which rely on identification of pre-registered hardware, typically an eBook reader (e.g. a Kindle) or controlled generic device (e.g. an Apple iPad), in order to verify the end user, their access and permissions; and (ii) Software-based solutions, which apply across technology platforms, i.e. that are not based on proprietary hardware but rely on the exchange of information between the user, the user's device and the central DRM service to uniquely identify the target recipient of the secured PDF file. Examples include our own "Offline" PDF security service. Note that for PDF files, unlike ePUB or similar files, the hardware-based eBook vendors like Amazon do not offer a DRM service, so third-party software and service solutions are required for PDF DRM. These include Adobe of course (at a high cost) and a number of other providers, including ourselves. A number of these providers offer cross-platform solutions, whilst others offer solutions that are both cross-platform and cross-document-type. In general a software solution involves using a special PDF reader, e.g. Digital Editions from Adobe or Javelin from Drumlin Security, to open a specially encrypted PDF file.
Software DRM solutions can also be separated into two distinct types: those that are based on use of some form of code string for authorization of access to a specific document, and those based on license files and/or online access by pre-registered users. In the former case there is no requirement for users to be registered, so no user management system is imposed on the implementation and management of the service. In the latter case all users must be registered with the central DRM service before they can be enabled to view specific documents. This latter requirement has the advantage that it provides a high level of access control, with the option to disable a document and/or user under specific circumstances. However, it has the disadvantage that the entire system has to be managed, which can impose a substantial overhead on organizations. For this reason it is best applied in cases where PDF documents are distributed intra-corporately, although it can be applied for extra-corporate PDF distribution when applied carefully (e.g. for well-defined closed user groups). It is not really suitable for eBook sales and similar low overhead/low margin applications, nor for smaller organizations where the cost of managing such a service, and possibly providing DRM service integration, can be high.
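The difference between the two types can be illustrated with a short Python sketch. Everything here is hypothetical and is not a description of any vendor's service, including our own: the code-based functions assume an authorization code derived from a secret and a document identifier using an HMAC, and the LicenseRegistry class assumes a central record of which registered user may view which document.

```python
import hashlib
import hmac


# Code-based: no user registration, just a code tied to a document.
def issue_code(secret: bytes, document_id: str) -> str:
    """Derive a short authorization code for one document (assumed scheme)."""
    digest = hmac.new(secret, document_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def check_code(secret: bytes, document_id: str, code: str) -> bool:
    """Check a supplied code using a constant-time comparison."""
    expected = issue_code(secret, document_id)
    return hmac.compare_digest(expected.encode("utf-8"), code.encode("utf-8"))


# License-based: users are registered centrally and access can be withdrawn.
class LicenseRegistry:
    def __init__(self) -> None:
        self._grants = set()  # set of (user, document_id) pairs

    def grant(self, user: str, document_id: str) -> None:
        self._grants.add((user, document_id))

    def revoke(self, user: str, document_id: str) -> None:
        self._grants.discard((user, document_id))

    def is_allowed(self, user: str, document_id: str) -> bool:
        return (user, document_id) in self._grants
```

The code-based functions need no stored state beyond the secret, which reflects the low management overhead described above, whereas the registry must be maintained but allows access for a specific user to be revoked.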
Some PDF DRM providers, including ourselves, offer both options, i.e. code-based and license based. The standard service is code based, this being inexpensive and very quick and simple to implement, ideally suited to small-medium sized organizations and for ecommerce applications. For larger organizations, with more complex requirements, the license-based approach may be more suitable.
Before ending this section, it is worth noting that a rather different approach to PDF security is possible: delivery of content via a web browser, with or without user access controls (i.e. user login with tracking of this activity). With this model users access the PDF via a standard web browser (typically an HTML5 compliant browser, which most are nowadays) and view the pages online rather than offline. The advantages of this approach are that (i) no special software download and install is required; (ii) no document distribution is required; and (iii) the service is highly scalable with minimal management support required. The disadvantages are that the document must be read online, which means continuous online access is needed; the quality of display, speed of display, and overall functionality are typically not as good as with offline usage; and security is much lower because the software being used to view the files is just a web browser over which the service provider has no control. Solutions of this type can be based on PDF display, Flash-based display (not recommended as Flash is not supported on many devices), HTML5 display (can be fast and very good quality if statically generated, slower if dynamically generated), or pure HTML display (effective but poorer quality). For general information about our Online (HTML5-based) PDF security services, please see our Online page.