Showing posts with label free software. Show all posts

Sunday, January 23, 2011

DjVu: viable, Free alternative to PDF? convert .txt to .djvu

First, a bit of ranting about open standards and free file formats:
Okay, you know I'm always harping about using Open Document Formats.
So, on the LibreOffice user list today there was discussion of a viable Free/Open alternative to .pdf files. After all, PDF is, indeed, a proprietary format, owned by Adobe, and it is ubiquitous, and there really should (must, perhaps) be a free, open alternative. Someone on the list mentioned DjVu, which, frankly, I'd never looked at before (I had heard of it, but knew not what it was). It's a free/open file format that was initially created for scanned documents, from what I gather. It has been around since the late 90s, is still maintained by its original authors, and is now used for all kinds of groovy stuff.
I did a bit of research, googling, apt-cache searching, and poking around. Eventually, I aptitude installed djview4 and djvulibre and experimented a little. I have drawn the conclusion that, yes, in my opinion, DjVu would be an excellent candidate, in fact a better option for many reasons, for the purposes .pdf currently serves (a portable document format that preserves formatting, essentially). Works great.

But there IS a rather glaring drawback...
The one big drawback is, conversion tools are lacking.
One cannot, for instance, simply write a DjVu file in any kind of document editor, the way you can write a pdf with many different editors, web browsers, most office software, LaTeX editors, basic text editors such as tcltext, and, frankly, even from a command line interface.
But to create DjVu, you can only convert other files to DjVu.
Then, in general, and this is what most irritates me, it seems you have to convert from non-free formats. There are no tools, for instance, to convert directly from plain text, LaTeX (.tex), .odf (.odt), .png, or even html files to a .djvu file. What's worse is that all of your Free and/or open source browsers, document editors, etc., will export or print a file to .pdf, but not to .djvu. OpenOffice.org will write a .pdf. LibreOffice and Abiword will write a .pdf. LaTeX editors will write a .pdf... Everybody will write a .pdf, but nobody has written code to write a file directly to .djvu. In my opinion, that needs changing. We need to use open standards and free/open file formats (all kinds of reasons for that discussed in this entry to this blog).

That said, today I wrote a script to convert a plain text file to DjVu (but, yes, I had to round-trip it through .pdf, darn it).
This script was written on a Debian/Stable (lenny at the time of this writing) system, on AMD64 arch, using all tools available in the lenny repos.
It requires (obvious when you read the script) enscript, ps2pdf, and pdf2djvu (which relies on djvulibre).
The script first converts your text file to PostScript with enscript, then from PostScript to pdf with, surprise, ps2pdf, and then takes the final step of converting to .djvu.

The script looks like this:
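#!/bin/bash
# Convert one plain text file to DjVu, going through PostScript and PDF.

# Stop at the first failing step instead of carrying on with missing files.
set -e

# Require exactly one argument: the text file to convert.
if [[ $# -ne 1 ]]; then
    echo "try again, and include a file name, and ONLY 1 file name at a time. Thank you." >&2
    exit 1
fi

text="$1"
# Name the result directly (foo.txt -> foo.djvu), so no rename step is needed.
djvu="${text%.txt}.djvu"

echo "converting $text to $text.ps"
enscript -q -B -p "$text.ps" "$text"

echo "converting $text.ps to $text.pdf"
# Give the output name explicitly rather than relying on ps2pdf's default.
ps2pdf "$text.ps" "$text.pdf"

echo "converting $text.pdf to $djvu"
pdf2djvu -o "$djvu" "$text.pdf"

echo "cleaning up ..."
rm "$text.ps" "$text.pdf"

echo "done"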
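
Since the script depends on three external tools and handles only one file per run, a couple of small helpers could sit alongside it. This is only a sketch: the function names missing_tools and convert_each are made up for illustration, and ./txt2djvu in the usage note is a hypothetical name for wherever you saved the script above.

```shell
# Hypothetical helpers; the names are illustrative, not part of the script above.

# Print the name of every listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Run a one-file converter once per file.
# $1 is the converter command; the remaining arguments are the files.
convert_each() {
    converter=$1
    shift
    for f in "$@"; do
        "$converter" "$f" || echo "failed: $f" >&2
    done
}
```

For example, "missing_tools enscript ps2pdf pdf2djvu" prints nothing when everything is installed, and "convert_each ./txt2djvu *.txt" would feed each text file to the script in turn.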


I actually turned the script on itself, and created a DjVu file of this text, available here.

With this, I may very well add the capacity to export a .djvu file to tcltext. Why not? It's just a shame, imho, that such an export is not direct, and that one has to cross into proprietary territory via .pdf in order to accomplish it.

Also, as a gift to my fellow freedom fighters, foss hackers, and open standards supporters, I have created a DjVu of my poetry here which contains all the poems published in my recent book (but not the paintings and photographs).

And, this full article in djvu format here. This last was fun, because I ended up having to change the text encoding first. Apparently enscript doesn't like utf8. I had copy/pasted the article into tcltext, which generates utf8 here (system default). My first attempt produced a .djvu that had all these weird character substitutions (like /200a#blahblah for a quotation mark?). Here's how to handle the conversion.

iconv -f utf8 --to-code=ascii//TRANSLIT yourfile > newfile
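
If you'd rather not retype that each time, the same command can be wrapped in a small function and run just before the conversion script. A minimal sketch, assuming an iconv that supports //TRANSLIT (glibc's does); the name to_ascii is made up for illustration, and what a given non-ASCII character turns into may depend on your locale and iconv implementation.

```shell
# Hypothetical helper: write an ASCII-only copy of a UTF-8 text file.
# Usage: to_ascii input.txt output.txt
to_ascii() {
    # //TRANSLIT asks iconv to substitute a look-alike (e.g. a plain quote)
    # for characters that have no ASCII equivalent.
    iconv -f UTF-8 -t ASCII//TRANSLIT "$1" > "$2"
}
```

Then "to_ascii article.txt article-ascii.txt" gives you a file enscript should be happier with.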

Now, if you use firefox or some other mozilla derivative, there's actually a plugin for viewing such files in your browser, included in the djvulibre packages. Otherwise, you'll need a djvu viewer, such as djview or evince.

Anyway,
Enjoy.

./tony

Tuesday, February 23, 2010

Yes! You, too, can use Free Software and Succeed as a Freelance Translator

This past weekend new versions were released of two Free software programs very important for translators: OmegaT, a CAT (Computer Aided Translation) program, and Anaphraseus, another CAT program, both Free (as in speech) and free (as in beer).
OmegaT, developed in Java, is the CAT program most used by translators in the Free Software community, and has been used in the translation and localization of other important Free Software projects such as OpenOffice.org, the complete, Free, office suite. It is rather distinct from other CAT programs, broadly useful, with ample functions and the ability to deal with a wide variety of file formats, including all those most common to the translation industry, such as all MSOffice® file formats, various software localization formats, and, of course, all Open Document Format files. In addition, OmegaT works with the standard translation memory format, TMX (Translation Memory eXchange).
Anaphraseus works similarly to another, proprietary CAT program, Wordfast®, in its earlier incarnations, but as a macro in OpenOffice.org rather than in MSOffice®, as Wordfast does. Anaphraseus, developed in StarBasic, is important because it allows translators who use free software to provide their customers with "unclean" .doc or .rtf files, bilingual word processing files (containing both the source and target languages) that are widely used in the translation industry. With both these tools, translators using only free software are able to compete with those who work with the proprietary products that dominate the industry. Both programs are cross-platform, able to run on GNU/Linux, Mac or Windows.
I announced the release of these new versions over the past several days, but today I'm taking the time to elaborate again on these releases, because I believe these programs are extremely important. I've already discussed why I believe open document formats are important at some length, but it is a topic I am likely to revisit, and my original article touching on the matter is, as I see it, a work in progress. I'm certain I will continue to revise and update that article and repost it from time to time. Why freedom of information and open standards are important in my industry, translation, should, as I see it, require little explanation.
Now, my industry, translation, like so many others, is dominated by the use of proprietary software tools, such as Trados® and Wordfast®, and inundated with the widespread use of MSOffice®. That's no surprise and no secret. Many translators, in fact, believe that you simply can't work successfully in our industry without MSOffice® and Trados® or Wordfast®, and I'm living proof that the notion is completely erroneous. I've been working as a freelance translator now for half a decade, and using only Free Software on my computers for a full decade, and my family eats three square meals a day. My three most used programs are the above-mentioned OmegaT, Anaphraseus, and OpenOffice.org (the fourth being a web browser, for research and to communicate with clients, providers, etc., and the fifth being mocp, to listen to music while I work. Seriously. But that's a matter for another article). I work for private clients, government agencies, school systems, and large translation warehouse agencies, the vast majority of whom use the popular proprietary products mentioned above. I've never had any difficulty due to lack of compatibility, and have always been able to deliver the product that my clients have demanded of me. Furthermore, it is my belief that I can do so more efficiently using the Free Software I use, especially since I use it with a GNU/Linux operating system. My system is secure, stable, and efficient. It uses fewer resources than popular proprietary operating systems, doesn't fall prey to the hordes of viruses and attacks that so easily and frequently hit those other systems, has never crashed on me (seriously, not once), and is far more customizable and configurable, allowing me to set it up in the way that is most "ergonomic" for me, so I can work as efficiently as possible. I save time, not having to deal with AV software updates, fixing crashes, removing intrusions, etc. Heck, I never even have to reboot the darned thing.
Another factor, and, in my opinion, this is probably the least important, but often the most touted in some circles, is that none of my software has cost me a penny. Seriously. I have powerful CAT tools and office tools for my translation work, all the web communication tools needed (e-mail, chat, voip), tools for managing the financial back end (some day I should write an article on gnucash), powerful image manipulation software (sometimes I edit images for clients), essentially, everything I need for my work. (I also have all the toys, games, multimedia software, etc., I could possibly ever not need to distract me when I should be working...).
A common proprietary operating system, CAT program, and office suite, alone, would cost me in the neighborhood of US$1500.00. Proprietary image manipulation software would easily tack on another $700, and, let's not forget that I'd have to pay for security tools to protect all my data, with regular AV updates, etc. I could easily spend US$3000.00 or more for the software I would need to do the work that I do, were I to use proprietary software tools. So, I'm not only more efficient in terms of the time and energy spent maintaining my machine (able to focus more on work than maintenance...except when I'm blogging or facebooking), I'm also more efficient in terms of expenditure of financial resources, which enables me to pass the savings on to my clients, making me, in fact, more competitive than my colleagues who use proprietary software tools.
Now, do I use Free (as in speech) Software just because it's free (as in beer)?
No. For me, the issues of freedom of information and open file format standards, and the freedom to control my own computer (not be licensed to use a product over which I have little control, and in a fashion that gives its creators rights over the software on MY machine) are FAR more important to me than price. In addition, the added efficiency and configurability I have with the Free Software I use are convenient and agree with me immensely. Nonetheless, I do feel that it's worth mentioning the added financial advantage these tools bring.
With that, I will get back to work translating these Brazilian articles, and bid you good day.
./tony

Friday, February 12, 2010

Spread the LOVE! (and the source code)

The Free Software Foundation has started a campaign to spread some love to Free Software developers (like me!) this year for St. Valentine's Day.