ZIP format

Introduction

ZIP is a file format for data compression and archiving. It is closely tied to the Deflate compression algorithm and was invented by Phil Katz, who published the format's specification in January 1989. ZIP files typically use the suffix ".zip", and the MIME type is application/zip. ZIP is currently one of several mainstream compression formats; its competitors include RAR and the open-source 7z format. In terms of performance, RAR and 7z achieve higher compression ratios than ZIP, and 7z has gradually spread to more fields because a free compression tool, 7-Zip, is available. Microsoft has included built-in support for the zip format since Windows ME, so zip files can be opened and created even when no compression software is installed on the user's computer; OS X and popular Linux distributions provide similar support. As a result, zip is often the most common choice for distributing files over a network.

History

Predecessor

In 1985, a small company called SEA (System Enhancement Associates) developed a commercial compression program for MS-DOS known as ARC. Software was released somewhat differently then than it is now: a buyer received not only the executable but also the C source code. Katz, like many ordinary computer users of the time, lacked the money to buy much commercial software. He downloaded the C source code of ARC, rewrote it in assembly language, and compiled it. Katz called this program PKARC (Phillip Katz's ARC). Because PKARC was rewritten in assembly language with the ARC source code as its reference, it was fully compatible with ARC and outperformed it. Katz then uploaded the new program for others to download. This clearly infringed SEA's rights. SEA first contacted Katz hoping to make PKARC an SEA product, but Katz refused. The two sides eventually went to court; Katz lost, was ordered to pay SEA's claims, and had to stop distributing PKARC. Katz was later forced to rewrite all the code when developing PKARC's successor. PKARC was in fact the predecessor of PKZIP, described below. Phil Katz did not earn a penny from PKARC and remained destitute; owing to alcoholism and many other causes, he died in a motel in 2000.

Birth

A few weeks after the lawsuit, Katz produced a new compression program, PKZIP (Phillip Katz's ZIP), which offered a much higher compression ratio and better performance than ARC and included more features. Katz then also made all of ZIP's technical specifications public. The name ZIP (meaning "speed") was suggested by Katz's friend Robert Mahoney; they wanted to imply that their product was faster than ARC. The name is often written in capital letters because DOS systems, which ran on the FAT file system, usually used capital letters for file extensions.

The advent of WinZip

Before the launch of Windows 3.0, two formats were as popular as ZIP: LHA (LHArc) and ARJ (Archiver Robert Jung). Until 1995, these three compression formats were the mainstream on PCs. After Microsoft released Windows 95 in 1995, users moving from DOS to Windows were eager for good graphical software. WinZip attracted their attention with its strong performance and respectable graphical user interface, and it captured a large share of the market. At the time, WinZip was in fact just a GUI shell that called DOS programs, but the experience it had accumulated since Windows 3 made its GUI look and perform better than other popular software of the day. WinZip soon became very popular. Its early popularity also helped spread the ZIP format, so much so that many users later came to believe that WinZip had created ZIP. This is a misunderstanding. For more detailed information, see WinZip.

Development

Because the format is open and free, more and more software has built-in support for opening Zip files. As a result, compressed Zip files increasingly behave like transparent folders.

  • Since Windows Me, Windows has had built-in support for opening and creating Zip files.

  • Some download tools support downloading part of a Zip file and then restoring it.

  • Almost all compression software supports opening and creating Zip files.

Crisis

The development of the Zip format has essentially been driven by PKWARE and WinZip. However, the two companies distrust each other on certain issues, which has slowed development. The improvement people most want in the Zip format is stronger encryption. For now, Zip encryption is pitifully weak, amounting to mere password protection that cannot meet security requirements. Katz published the format during his lifetime and designed it with room for future upgrades. WinZip, however, is only a user of the format and cannot publish a new standard; the power to set the standard remains with PKWARE. In 2002, PKWARE released PKZIP 5.0 with support for 256-bit AES encryption, but WinZip 9, released in 2003, proved to be incompatible with it. Each side accused the other of violating the free and open spirit of Zip. This was the first and most serious challenge the Zip format had faced since its creation.

Header

Opening a Zip file in any text editor shows that its first two letters are PK, the initials of Phil Katz.
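The same check can be done programmatically. The following is a minimal sketch, not a full validator: the helper name starts_with_pk is hypothetical, and the archive is built in memory with Python's standard zipfile module purely for illustration.

```python
import io
import zipfile

def starts_with_pk(data: bytes) -> bool:
    """Return True if the data begins with the two letters 'PK'."""
    return data[:2] == b"PK"

# Build a tiny archive in memory so the example needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("hello.txt", "hello")

raw = buffer.getvalue()
print(starts_with_pk(raw))  # True
print(raw[:4])              # b'PK\x03\x04', the signature of the first entry
```

An empty archive starts with a different four-byte signature, although it still begins with PK, so a two-letter check is only a quick heuristic.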

Technical details

ZIP is a fairly simple archive format that compresses each file separately. Compressing files separately allows an individual file to be retrieved without reading through the other data; in theory, it also allows different algorithms to be used for different files. Whatever the method, one caveat is that an archive containing many small files is significantly larger than one in which the files are compressed together as a single stream (the classic example on Unix-like systems is the ordinary tar.gz archive, a TAR archive compressed with gzip).
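The size penalty for many small files can be observed with a small experiment. This is a sketch under stated assumptions: the 200 sample files and the helper names zip_size and tar_gz_size are invented for illustration, and the exact byte counts depend on the data and on the zlib version.

```python
import io
import tarfile
import zipfile

# Hypothetical sample data: 200 small, similar text files.
files = {
    f"note_{i:03d}.txt": f"entry {i}: the quick brown fox\n".encode()
    for i in range(200)
}

def zip_size(files):
    """Size of a ZIP archive in which every file is deflated on its own."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return len(buf.getvalue())

def tar_gz_size(files):
    """Size of a TAR archive compressed as one gzip stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())

zip_bytes = zip_size(files)
tgz_bytes = tar_gz_size(files)
print("zip:", zip_bytes, "bytes; tar.gz:", tgz_bytes, "bytes")
```

With data like this, the tar.gz result comes out smaller, because gzip can exploit redundancy across files while ZIP pays per-entry header overhead and compresses each file in isolation.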

The ZIP specification states that files may be stored uncompressed or compressed using different algorithms. In practice, however, ZIP almost always uses Katz's DEFLATE algorithm.
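As an illustration of per-file methods, the sketch below uses Python's standard zipfile module to write one stored entry and one deflated entry into the same archive and then read a single member back. The file names and contents are made up for the example.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # Each entry carries its own compression method.
    zf.writestr("stored.txt", "a" * 1000, compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.txt", "a" * 1000, compress_type=zipfile.ZIP_DEFLATED)

with zipfile.ZipFile(buf) as zf:
    methods = {info.filename: info.compress_type for info in zf.infolist()}
    sizes = {info.filename: info.compress_size for info in zf.infolist()}
    # Retrieve one member without decompressing the other.
    one_member = zf.read("deflated.txt")

print(methods)  # {'stored.txt': 0, 'deflated.txt': 8}
print(sizes["stored.txt"], ">", sizes["deflated.txt"])
```

The values 0 and 8 are the method numbers for stored and deflated entries, which zipfile exposes as ZIP_STORED and ZIP_DEFLATED.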

ZIP supports a simple password-based symmetric encryption system, which is now known to have serious flaws and to be vulnerable to known-plaintext attacks, dictionary attacks, and brute-force attacks. ZIP also supports splitting an archive into multiple volumes.

More recently, ZIP has gained new features, including new compression and encryption methods, but these are not supported by many tools and are not widely used.

Compression

The sizes given for comparison were obtained by compressing the content of [1] at the maximum compression ratio.

  • Shrinking (Method 1)

  • Shrinking is a slightly adjusted variant of LZW and was therefore affected by the LZW patent issue. It was never clear whether the patent covered unshrinking (decompression), but some open-source projects, such as Info-ZIP, decided to proceed with caution and did not include unshrinking support in their default builds.

  • Reducing (Method 2-5)

  • Reducing combines compression of repeated byte sequences with a probability-based coding of the result.

  • Imploding (Method 6)

  • Imploding compresses repeated byte sequences using a sliding window, then compresses the result using multiple Shannon-Fano trees.

  • Tokenizing (Method 7)

  • Method number 7 is reserved for tokenizing. The PKWARE specification does not define an algorithm for it.

  • Deflate and Enhanced Deflate (Method 8 and 9)

  • These methods use the well-known Deflate algorithm. Deflate allows a window of up to 32K; Enhanced Deflate allows a window of up to 64K. The enhanced version performs slightly better on some tasks but is not widely supported.

  • Deflate comparison size: 52.1 MiB (tested with PKZIP for Windows, version 8.00.0038)

  • Enhanced Deflate comparison size: 52.8 MiB (tested with PKZIP for Windows, version 8.00.0038)

  • PKWARE Data Compression Library Imploding (Method 10)

  • The official ZIP format specification gives no further details about this method.

  • Comparison size: 61.6 MiB (tested with PKZIP for Windows, version 8.00.0038, with binary mode selected)

  • Method 11

  • This method number is reserved by PKWARE.

  • Bzip2 (Method 12)

  • This method uses the well-known bzip2 algorithm. It compresses more efficiently than Deflate but is not widely supported by tools, particularly on the Windows platform.

  • Comparison size: 50.6 MiB (tested with PKZIP for Windows, version 8.00.0038)
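The method number of an entry can be read directly from its header. The sketch below assumes the 30-byte local file header layout described in PKWARE's published specification (signature, version, flags, method, time, date, CRC-32, sizes, name and extra lengths). The names METHOD_NAMES, make_archive and first_entry_method are hypothetical, and method 0 (stored, no compression) is added to the table although it is not in the list above.

```python
import io
import struct
import zipfile

# Method numbers from the list above, plus 0 for uncompressed storage.
METHOD_NAMES = {
    0: "Stored",
    1: "Shrinking",
    2: "Reducing", 3: "Reducing", 4: "Reducing", 5: "Reducing",
    6: "Imploding",
    7: "Tokenizing (reserved)",
    8: "Deflate",
    9: "Enhanced Deflate",
    10: "PKWARE Data Compression Library Imploding",
    11: "Reserved by PKWARE",
    12: "Bzip2",
}

# Little-endian local file header: 4-byte signature, five 16-bit fields,
# three 32-bit fields, two 16-bit lengths (30 bytes in total).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")

def first_entry_method(data: bytes) -> str:
    """Name the compression method of the first entry in a ZIP archive."""
    fields = LOCAL_HEADER.unpack_from(data, 0)
    signature, method = fields[0], fields[3]
    if signature != b"PK\x03\x04":
        raise ValueError("data does not start with a local file header")
    return METHOD_NAMES.get(method, f"Unknown ({method})")

def make_archive(compress_type: int) -> bytes:
    """Build a one-entry archive in memory with the given method."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("sample.txt", "sample " * 50, compress_type=compress_type)
    return buf.getvalue()

print(first_entry_method(make_archive(zipfile.ZIP_DEFLATED)))  # Deflate
print(first_entry_method(make_archive(zipfile.ZIP_STORED)))    # Stored
```

Reading only the first header keeps the sketch short; a real tool would normally walk the central directory at the end of the archive instead.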

Disadvantages

Because the Zip format came to market so early, it has several shortcomings, compared with other compression formats, that cannot be ignored.

It does not natively support Unicode file names, which easily makes sharing files difficult, especially in exchanges involving East Asian languages. Its compression ratio cannot match that of 7z, and it lacks a repair mechanism like WinRAR's Recovery Record. These are among the reasons for its decline.

See also

  • WinZip

  • LZW

  • LZ77

  • RAR

  • List of compression software

  • Comparison of compression software
