Project Description

docx2txt is a tool that attempts to generate equivalent text files from Microsoft .docx documents, including corrupted ones. It preserves some formatting and document information that Microsoft's own text conversion drops, and applies appropriate character conversions for a good plain (ASCII) text experience.

It is a platform-independent solution consisting of a core Perl script, wrapper Unix/Windows shell scripts, and a configuration file that controls the appearance of the output text to a fair extent. It depends on a command-line unzipping program (such as unzip, 7z, pkzipc, or wzunzip) that can silently extract single files from zip archives to the console, standard output, or a pipe.
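
The single-file extraction step described above can be sketched in a few lines. This is a minimal illustration in Python, not the tool's actual Perl implementation. It uses Python's built-in zipfile module in place of an external unzipper, and it assumes the main document body is stored in the archive member word/document.xml, which is the usual location in .docx files. The function name docx_to_text and the very simple tag handling are hypothetical simplifications.

```python
import re
import zipfile
from xml.sax.saxutils import unescape


def docx_to_text(path, member="word/document.xml"):
    """Return a rough plain-text rendering of a .docx file (sketch only)."""
    # A .docx file is a zip archive. Reading one member into memory is the
    # in-process analogue of an unzipper writing a single file to stdout.
    with zipfile.ZipFile(path) as archive:
        xml = archive.read(member).decode("utf-8")

    # Paragraph ends and plain line breaks become newlines; tab elements
    # become tab characters.
    xml = re.sub(r"</w:p>", "\n", xml)
    xml = re.sub(r"<w:br\s*/>", "\n", xml)
    xml = re.sub(r"<w:tab\s*/>", "\t", xml)

    # Drop every remaining tag, then decode the basic XML entities.
    text = re.sub(r"<[^>]+>", "", xml)
    return unescape(text)
```

Calling docx_to_text("report.docx") (a hypothetical file name) would return the document text as one string. Unlike this sketch, the real tool also handles formatting, character conversions, and configuration options.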

It can conveniently be used to build a web-based docx document conversion service. Makefiles and Windows batch files are provided for easy installation of the scripts. With unzippers such as CakeCmd that can handle corrupt zip archives, the tool can in many cases extract text from corrupt docx documents that Microsoft Word fails to even open.

System Requirements

System requirements are not defined.
Information regarding Project Releases and Project Resources. Note that the information here is quoted from the Freecode.com page, and the downloads themselves may not be hosted on OSDN.

2011-12-13 07:28
1.1

Tags: Minor feature enhancements and bug fixes
Minor non-extraction feature enhancements and bug fixes, based on feedback and input received from users. A check for the existence of the unzip command has been added.
The configuration file is now also looked for in $HOME. Configuration variables now begin with config_. Bugs #3003903, #3082018, and #3082035 have been fixed. The null device for Cygwin has been fixed. Superscripted cross-references are now placed within [...].
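
Two of the changes above, the check for an unzip command and the additional configuration lookup in $HOME, can be illustrated with a short Python sketch. This is not the tool's Perl code. The helper names find_unzipper and find_config, the candidate list, the search order, and the file name used in the usage line are all assumptions made for illustration.

```python
import os
import shutil


def find_unzipper(candidates=("unzip", "7z")):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def find_config(filename, search_dirs):
    """Return the first existing `filename` in `search_dirs`, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A caller might try find_config("docx2txt.config", [os.getcwd(), os.path.expanduser("~")]) and stop with a clear error message when find_unzipper() returns None.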

Project Resources