Learn more about DjVu
DjVu (pronounced déjà vu) is a computer file format designed primarily to store scanned images, especially those containing text and line drawings. It features advanced technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy compression for bitonal images. This allows for high quality, readable images to be stored in a minimum of space, so that they can be made available on the web.
Progressive loading makes the format ideal for images served over the Internet. DjVu has been promoted as an alternative to PDF, actually outperforming PDF on most scanned documents. The DjVu developers report that color magazine pages compress to 40–70KB, black and white technical papers compress to 15–40KB, and ancient manuscripts compress to around 100KB; all of these are significantly smaller than the typical 500KB required for a satisfactory JPEG image. This has led to its widespread use in distributing math books on file sharing networks. Like PDF, DjVu can contain an OCRed text layer, making it easy to perform cut and paste operations.
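To put the reported sizes in perspective, the short sketch below converts them into size-reduction factors relative to the 500KB JPEG baseline. The figures are the developers' reported numbers quoted above, not measurements; the names `reported_kb` and `ratio_range` are illustrative only.

```python
# Size-reduction factors implied by the figures reported by the DjVu developers.
# All numbers come from the paragraph above; nothing here is measured.

JPEG_KB = 500  # typical size quoted for a satisfactory JPEG image

# (smallest reported size, largest reported size) in KB
reported_kb = {
    "color magazine page": (40, 70),
    "black and white technical paper": (15, 40),
    "ancient manuscript": (100, 100),
}


def ratio_range(low_kb, high_kb, baseline_kb=JPEG_KB):
    """Return (worst-case, best-case) reduction factor versus the baseline."""
    return baseline_kb / high_kb, baseline_kb / low_kb


for kind, (low, high) in reported_kb.items():
    worst, best = ratio_range(low, high)
    print(f"{kind}: {worst:.1f}x to {best:.1f}x smaller than JPEG")
```

Under these figures, a color magazine page is roughly 7 to 12 times smaller than the JPEG baseline, and a black and white technical paper roughly 12 to 33 times smaller.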
The DjVu technology was originally developed by Yann LeCun, Léon Bottou, Patrick Haffner, and Paul G. Howard at AT&T Laboratories in 1996. DjVu is a free file format: both the file format specification and the source code for the reference library are published. The ownership rights to the commercial development of the encoding software have been transferred to different companies over the years, including AT&T and LizardTech. The original authors maintain a GPLed implementation named "DjVuLibre".
DjVu divides a single image into many different images, then compresses them separately. To create a DjVu file, the initial image is first separated into three images: a background image, a foreground image, and a mask image. The background and foreground images are typically lower-resolution color images (e.g., 100dpi); the mask image is a high-resolution bilevel image (e.g., 300dpi) and is typically where the text is stored. The background and foreground images are then compressed using a wavelet-based compression algorithm named IW44. The mask image is compressed using a method called JB2. The JB2 encoding method identifies nearly-identical shapes on the page, such as multiple occurrences of a particular character in a given font, style, and size. It compresses the bitmap of each unique shape separately, and then encodes the locations where each shape appears on the page. Thus, instead of compressing a letter "e" in a given font multiple times, it compresses the letter "e" once (as a compressed bit image) and then records every place on the page it occurs.
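The shape-dictionary idea behind JB2 can be illustrated with a toy sketch. This is not the JB2 algorithm itself: real JB2 matches nearly identical shapes and compresses the bitmaps and positions with arithmetic coding, whereas the sketch below assumes exact matches only and stores everything uncompressed. The names `encode_shapes` and `decode_shapes` and the tiny 3x3 glyphs are hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Tuple

Bitmap = Tuple[str, ...]          # rows of '0'/'1' characters
Placement = Tuple[int, int, int]  # (index into the dictionary, x, y)


def encode_shapes(glyphs: List[Tuple[Bitmap, int, int]]):
    """Store each distinct bitmap once and record where it occurs.

    Assumption: shapes must match exactly; real JB2 also groups
    nearly identical shapes.
    """
    dictionary: List[Bitmap] = []
    index_of: Dict[Bitmap, int] = {}
    placements: List[Placement] = []
    for bitmap, x, y in glyphs:
        if bitmap not in index_of:
            index_of[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        placements.append((index_of[bitmap], x, y))
    return dictionary, placements


def decode_shapes(dictionary, placements, width, height) -> List[str]:
    """Rebuild the bilevel mask by stamping each shape at its positions."""
    page = [["0"] * width for _ in range(height)]
    for idx, x, y in placements:
        for dy, row in enumerate(dictionary[idx]):
            for dx, bit in enumerate(row):
                if bit == "1":
                    page[y + dy][x + dx] = "1"
    return ["".join(row) for row in page]


# Two occurrences of the same "e"-like shape and one "t"-like shape.
E = ("111", "110", "111")
T = ("111", "010", "010")
glyphs = [(E, 0, 0), (T, 4, 0), (E, 8, 0)]

dictionary, placements = encode_shapes(glyphs)
page = decode_shapes(dictionary, placements, width=11, height=3)
```

Here three glyphs on the page produce only two dictionary entries; the repeated shape costs one extra `(index, x, y)` record rather than a second bitmap, which is the saving the paragraph above describes.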
The DjVu format will be used by the One Laptop per Child project in order to easily supply existing paper books in an eBook format. The advantage of DjVu is that it is highly compressed and it does not require any font support.
Comparison of the DjVu and PDF file formats
- The maximum resolution of a DjVu file must be specified at creation time. On the other hand, a vector image represented by a PDF file can usually be magnified to an arbitrary resolution without loss of quality.
- DjVu files render characters as images, without using fonts. PDF files usually render characters using fonts. Many PDF files do not embed the full representation of the necessary fonts, but simply specify their names and properties. The PDF viewer uses the exact same font if it is available. Otherwise it transforms an available font to compute an approximation of the desired font.
The PDF format defines various means to store and render raster images. This capability is often used for representing scanned documents. Such PDF files suffer from the same fundamental limitations as raster formats. The size of these files depends heavily on the underlying compression scheme. Some PDF compression schemes approach the performance of DjVu. In principle, DjVu compression could be adapted to represent raster images in PDF files. However, there is no momentum for creating such a combination because it pleases neither the proponents of DjVu nor those of PDF.
When to select what format (DjVu or PDF)
With PDF documents one can zoom in on vector-based content to an arbitrary depth or print them at an arbitrarily high resolution without introducing quality loss or jaggedness inherent to raster formats. On the other hand, if a PDF is simply used as a container for non-vector images (such as scans) those images will not gain anything. Another thing to keep in mind is that one can always convert a vector format into a raster format, usually with irrevocable data loss, but the other direction is very difficult.
PDF is most useful when the original source is an electronic document such as a Microsoft Word doc or TeX file. Such documents benefit most from the vector graphics technology that underlies PDF. DjVu files can be marginally smaller but only deliver a high resolution image, possibly enriched with the associated text.
DjVu is very good for image files, and has especially been optimized for scanned text and images. If one has a set of scanned pages from a book or article, DjVu is superior to PDF. However, PDF could be better if the scanned raster images can be transformed into high quality vector graphics, for instance by applying Optical Character Recognition to the scanned image, identifying the fonts, and carefully proofreading the resulting file. This procedure is often undesirable or prohibitive in time or cost. Suitable fonts might not be available. One may want to preserve the original document exactly, including signatures, marginal comments, or other markings. Or perhaps one wants to reproduce the original handwriting or properties of the paper, for example when scanning old scriptures that contain handwritten text. In such cases, DjVu is the better choice.