UnZip
          Locale Issues
        
        
          
          
            Note
          
          
            Using UnZip while installing the JDK, Mozilla, DocBook or any
            other BLFS package is not a problem, because the BLFS
            instructions never use UnZip to extract a file whose name
            contains non-ASCII characters.
          
         
        
          The UnZip package assumes that
          filenames stored in ZIP archives created on non-Unix systems
          are encoded in CP850, and that they should be converted to
          ISO-8859-1 when files are written to the filesystem. These
          assumptions are not always valid. In fact, inside the ZIP archive,
          filenames are encoded in the DOS codepage that is in use in the
          relevant country, and the filenames on disk should be in the
          locale encoding. In MS Windows, the OemToChar() C function (from
          User32.DLL) does the correct conversion (which is indeed the
          conversion from CP850 to a superset of ISO-8859-1 if MS Windows
          is set up to use the US English language), but there is no
          equivalent in Linux.
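
          The effect of a wrong codepage assumption can be reproduced
          outside UnZip. The following is a minimal sketch, assuming an
          iconv program that knows the CP866 and CP850 charsets (as the
          glibc one does) and a UTF-8 terminal for display. The function
          name stored_name and the sample word are made up for
          illustration.

```shell
# Emit the six bytes that a CP866 archive would store for the Russian
# word "Привет" (hypothetical sample name; octal escapes are portable).
stored_name() {
    printf '\217\340\250\242\245\342'
}

# Decode with the codepage that was actually used: readable text.
right=$(stored_name | iconv -f CP866 -t UTF-8)

# Decode with the codepage that unzip assumes: unrelated Latin characters.
wrong=$(stored_name | iconv -f CP850 -t UTF-8)

printf 'CP866 reading: %s\nCP850 reading: %s\n' "$right" "$wrong"
```

          The same bytes yield two different strings, and only the CP866
          reading is the name the archive's creator intended.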
        
        
          When unzip unpacks a ZIP archive containing non-ASCII filenames,
          the filenames are damaged whenever either of its encoding
          assumptions is wrong, because it then applies the wrong
          conversion. For example, in
          the ru_RU.KOI8-R locale, filenames need to be converted from
          CP866 to KOI8-R, but unzip converts them from CP850 to ISO-8859-1
          instead, which produces filenames consisting of undecipherable
          characters instead of words (the closest equivalent understandable
          example for English-only users is rot13). There are several ways
          around this limitation:
        
        
          1) To unpack ZIP archives whose filenames contain non-ASCII
          characters, use WinZip running under Wine.
        
        
          2) After running unzip, repair the damaged
          filenames using the convmv tool (http://j3e.de/linux/convmv/). The
          following is an example for the ru_RU.KOI8-R locale:
        
        
          
            
              Step 1. Undo the conversion done by unzip:
            
            
convmv -f iso-8859-1 -t cp850 -r --nosmart --notest \
    </path/to/unzipped/files>
            
              Step 2. Do the correct conversion instead:
            
            
convmv -f cp866 -t koi8-r -r --nosmart --notest \
    </path/to/unzipped/files>
          
         
        
          3) Apply the optional unzip-5.50-alt-iconv-v1.1.patch to
          UnZip. It applies with some
          offsets.
        
        
          The patch lets you specify the assumed filename encoding inside
          the ZIP archive with the -O charset_name option, and the on-disk
          filename encoding with the -I charset_name option. By default,
          the on-disk filename encoding is the locale encoding, and the
          encoding inside the ZIP archive is guessed from a built-in table
          based on the locale encoding. For US English users, this still
          means that unzip converts from CP850 to ISO-8859-1 by default.
        
        
          Caveat: this method works only with 8-bit locale encodings, not
          with UTF-8. Attempting to use a patched unzip in UTF-8 locales may result
          in a segmentation fault and is probably a security risk.
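
          The logic of the two convmv steps in workaround 2 can be checked
          on a single name with iconv before touching real files. This is
          a sketch under assumptions: iconv supports the four charsets
          involved, the sample word and function names are invented, and
          the damaged name contains only characters that exist in both
          CP850 and ISO-8859-1.

```shell
# Hypothetical sample: "Привет" as stored in a CP866 archive.
stored_name() {
    printf '\217\340\250\242\245\342'
}

# What unpatched unzip leaves on disk: bytes converted CP850 -> ISO-8859-1.
damaged_name() {
    stored_name | iconv -f CP850 -t ISO-8859-1
}

# Step 1 undoes that conversion; step 2 applies the intended one.
repaired_name() {
    damaged_name | iconv -f ISO-8859-1 -t CP850 | iconv -f CP866 -t KOI8-R
}

# Show the repaired KOI8-R bytes as UTF-8 for inspection.
repaired_utf8=$(repaired_name | iconv -f KOI8-R -t UTF-8)
printf '%s\n' "$repaired_utf8"
```

          If the printed word is readable, the chosen pair of codepages is
          right for the archive. convmv itself only previews its renames
          unless --notest is given, so the real commands can be rehearsed
          in the same way.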
        
       
      
        
          Installation of UnZip
        
        
          Note that if you applied the patch described above for locale
          issues, the first required security patch will apply with some
          offsets. Now install UnZip by running the
          following commands:
        
        
patch -Np1 -i ../unzip-5.52-security_fix-1.patch &&
patch -Np1 -i ../unzip-5.52-security_fix-2.patch &&
make -f unix/Makefile LOCAL_UNZIP=-D_FILE_OFFSET_BITS=64 linux
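
          Whether a patch applies cleanly or with offsets can be previewed
          with the --dry-run option of GNU patch, which reports on each
          hunk without modifying any file. The sketch below demonstrates
          this on throwaway files rather than on the UnZip sources. Every
          file and directory name in it is made up, and it assumes that
          GNU patch, diff, seq and mktemp are available.

```shell
# Scratch area with hypothetical names; nothing here touches UnZip.
work=$(mktemp -d)
mkdir "$work/old" "$work/new" "$work/tree"

# Build a one-hunk patch that rewrites line 8 of a 20-line file.
seq 1 20 > "$work/old/f.txt"
sed '8s/.*/eight/' "$work/old/f.txt" > "$work/new/f.txt"
# diff exits with status 1 when the files differ, which is expected here.
(cd "$work" && diff -u old/f.txt new/f.txt > fix.patch) || true

# The tree to be patched has three extra leading lines, as if an earlier
# patch had already shifted the text that the hunk refers to.
{ printf 'x\ny\nz\n'; seq 1 20; } > "$work/tree/f.txt"

# Preview only: patch searches for the hunk at its shifted position,
# reports the result, and leaves f.txt untouched.
dry_run_output=$(cd "$work/tree" && patch -Np1 --dry-run -i ../fix.patch)
dry_run_status=$?
printf '%s\n' "$dry_run_output"
```

          A zero exit status together with an offset message means the
          patch will apply; dropping --dry-run then performs the change.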
        
          To test the results, issue: make -f unix/Makefile
          check.
        
        
          Now, as the root user:
        
        
make -f unix/Makefile prefix=/usr install
       
      
        
          Command Explanations
        
        
          linux: This target in the
          Makefile makes assumptions that are
          useful for a Linux system when compiling the executables. To list
          the alternatives to this target, use make -f unix/Makefile
          list.
        
        
          LOCAL_UNZIP=...: This sets
          the compilation flags to allow UnZip to handle files up to 4 GB.
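
          The 4 GB figure follows from the 32-bit size fields of the
          classic ZIP format, whereas a default 32-bit off_t on 32-bit
          Linux would already stop at 2 GB. A quick arithmetic sketch,
          assuming a shell whose arithmetic is wider than 32 bits (as in
          bash and dash on current systems); the variable names are
          illustrative only.

```shell
# Largest offset a signed 32-bit off_t can hold (no large-file support).
max_off32=$(( (1 << 31) - 1 ))

# Largest size an unsigned 32-bit ZIP size field can record.
max_zip32=$(( (1 << 32) - 1 ))

echo "32-bit off_t limit:   $max_off32 bytes"
echo "ZIP size field limit: $max_zip32 bytes"
```

          Defining _FILE_OFFSET_BITS=64 removes the first limit, which
          leaves the archive format itself as the remaining bound.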