
Project Description

docx2txt is a tool that attempts to generate equivalent text files from Microsoft .docx documents, including corrupted ones. It preserves some formatting and document information (which the MS text conversion drops) and applies appropriate character conversions for a good ASCII text experience.

It is a platform-independent solution consisting of a Perl core, Unix/Windows wrapper shell scripts, and a configuration file that controls the appearance of the output text to a fair extent. It depends on a command-line unzipping program (such as unzip, 7z, pkzipc, or wzunzip) that can silently extract single files from zip archives to the console, standard output, or a pipe.
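The extraction step described above can be sketched in a few lines. This is only an illustration of the general idea, written in Python rather than the tool's own Perl, and it is not docx2txt's actual code: it assumes the main body of a .docx archive lives in the `word/document.xml` member (as in standard WordprocessingML files), and the function names `docx_to_text` and `make_sample_docx` are hypothetical. It uses Python's built-in `zipfile` module in place of an external unzipper and ignores formatting, hyperlinks, and nested content such as tables.

```python
import io
import zipfile
from xml.etree import ElementTree

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the plain text of a .docx given a path or a file-like object.

    Hypothetical helper: reads a single archive member, much like piping
    one file out of an unzipper, then joins the text runs of each paragraph.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ElementTree.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def make_sample_docx():
    """Build a tiny in-memory archive that mimics a .docx (demo only)."""
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello</w:t></w:r>"
        '<w:r><w:t xml:space="preserve"> world</w:t></w:r></w:p>'
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", body)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(docx_to_text(make_sample_docx()))
```

Reading one named member instead of unpacking the whole archive mirrors the requirement that the unzipper be able to extract single files to standard output.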

It can conveniently be used to build a web-based docx document conversion service. Some Makefiles and Windows batch files are provided for easy installation of the scripts. With unzippers such as CakeCmd that can deal with corrupt Zip archives, this tool can in many cases extract text from corrupt docx documents that the MS word processor fails to even open.

System Requirements

System requirements are not defined.
Information regarding Project Releases and Project Resources. Note that the information here is quoted from the Freecode.com page, and the downloads themselves may not be hosted on OSDN.

2009-09-06 16:43
0.4

Display of hyperlinks is configurable. TOC related cleanup was done. Many new character conversions were implemented. Character conversion tables were added. Currency characters are converted to full currency names. Code tweaks were done to speed up the conversion process.
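As a rough illustration of what a character conversion table with currency names might look like, here is a minimal Python sketch. The mappings, the table names `PUNCTUATION` and `CURRENCY_NAMES`, and the function `to_ascii` are all assumptions made for this example; they are not taken from docx2txt's own tables, which may use different characters and different replacement names.

```python
# Hypothetical conversion tables (not docx2txt's actual data).
PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2026": "...",  # horizontal ellipsis
}

CURRENCY_NAMES = {
    "\u20ac": "Euro",   # euro sign
    "\u00a3": "Pound",  # pound sign
    "\u00a5": "Yen",    # yen sign
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TABLE = str.maketrans({**PUNCTUATION, **CURRENCY_NAMES})


def to_ascii(text):
    """Replace each mapped character with its ASCII stand-in."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(to_ascii("\u201cPrice\u201d: \u20ac5"))
```

Keeping the mappings in plain dictionaries means a new conversion is a one-line change to the table.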

Project Resources