wv is a library that provides access to Microsoft Word files. It can load and parse the Word 2000, 97, 95 and 6 file formats, which are known internally as Word 9, 8, 7 and 6. wv compiles and works under most operating systems, in particular Linux, Solaris, AIX and OSF1. It is (to my knowledge) very portable: I have reports that it can be compiled under Windows using cygwin32, and that it compiles under AmigaOS, VMS and OS/2 with varying levels of success.
wv gives other programs access to Word documents so that they can convert them to other formats; it is currently used by AbiWord as its Word importer. I have written a sample application that uses the library, named wvHtml. wvHtml converts Word documents to HTML 4.0 for viewing in a web browser. It makes some use of stylesheets to give a close conversion of the original layout while retaining a logical HTML structure. There are always some Word capabilities that cannot be converted cleanly, for example whitespace, so keep this in mind when reviewing the HTML output from a Word document. I'm particularly interested in features that Microsoft Word's own HTML converter handles better than wvHtml.
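For anyone who wants to script a conversion, here is a minimal sketch in Python. It assumes that wvHtml is installed and on the PATH, and that it takes the Word file followed by the output HTML file as positional arguments; check your installed version before relying on that. The helper names `build_wvhtml_command` and `convert` are made up for this example and are not part of wv.

```python
import shutil
import subprocess


def build_wvhtml_command(doc_path, html_path, executable="wvHtml"):
    """Return the argument list for one conversion run.

    Assumption: the converter is invoked as `wvHtml input.doc output.html`.
    """
    return [executable, doc_path, html_path]


def convert(doc_path, html_path, executable="wvHtml"):
    """Run the converter if it is installed; return True on success."""
    cmd = build_wvhtml_command(doc_path, html_path, executable)
    if shutil.which(cmd[0]) is None:
        # Converter not found on PATH, so there is nothing to run.
        return False
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    ok = convert("report.doc", "report.html")
    print("converted" if ok else "conversion failed or wvHtml not found")
```

A zero exit status only means the converter ran without crashing, so it is still worth opening the HTML and comparing it against the original document.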
The library is available under the GPL, and you can download it here. I have set up an online conversion gateway so that you can test it without downloading it. You can also submit Word documents that crash the library, or that you feel are converted incorrectly, here.
wvWare got me into the top 100 nominees for the 1999 Free Software Award, which was incredibly cool. Thanks to the demented individuals who voted me that far.