Free Software Foundation!

Semantic search

This entry published by the Free Software Foundation.


Previous     Results 1– 20    Next        (20 | 50 | 100 | 250 | 500)


BHL BHL is an Emacs mode that lets you convert plain TXT files into HTML, LaTeX, and SGML (Linuxdoc) files. The BHL mode handles common font-styles, three levels of sections, footnotes, and any kind of lists, tables, URLs and horizontal rules. It also handles a table of contents: you can browse the toc, insert the toc where you want, and update the sections' numbers with one keystroke.

Beautiful Soup Beautiful Soup is a Python HTML/XML parser designed for quick turnaround projects like screen-scraping. Three features make it powerful:

  • 1. Beautiful Soup won't choke if you give it bad markup. It yields a parse tree that makes approximately as much sense as your original document. This is usually good enough to collect the data you need and run away.
  • 2. Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. You don't have to create a custom parser for each application.
  • 3. Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don't have to think about encodings, unless the document doesn't specify an encoding and Beautiful Soup can't autodetect one. Then you just have to specify the original encoding.

Beautiful Soup parses anything you give it and does the tree traversal for you. You can tell it "Find all the links", or "Find all the links of class externalLink", or "Find all the links whose URLs match foo.com", or "Find the table heading that's got bold text, then give me that text." Valuable data that was once locked up in poorly designed websites is now within your reach. Projects that would have taken hours take only minutes with Beautiful Soup.
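
The kinds of queries listed above can be illustrated without Beautiful Soup itself. The following is a minimal sketch that uses only Python's standard-library html.parser as a stand-in; it is not Beautiful Soup's API, and the LinkCollector class and the sample markup are hypothetical names chosen for illustration.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, class list) for every <a> tag, even in sloppy markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        classes = (attr_map.get("class") or "").split()
        self.links.append((href, classes))


# Deliberately bad markup: unclosed <p>, <b>, and <a>, plus an unquoted attribute.
SAMPLE = """
<p>See <a href="http://foo.com/a" class="externalLink">foo
<p><b>Also <a href=local.html>local</a>
<a class="externalLink nav" href="http://example.org/">example</a>
"""

collector = LinkCollector()
collector.feed(SAMPLE)
collector.close()

# "Find all the links"
all_links = [href for href, _ in collector.links]
# "Find all the links of class externalLink"
external = [href for href, classes in collector.links if "externalLink" in classes]
# "Find all the links whose URLs match foo.com"
foo_links = [href for href in all_links if re.search(r"foo\.com", href)]
```

A dedicated library hides this bookkeeping behind search methods, which is the convenience the description above refers to.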

Bib2html 'Bib2html' converts the data in a BibTeX database to HTML files. Please note that there is another package by the name 'bib2html' (http://directory.fsf.org/bib2html.html) written by Kiri Wagstaff.

Docvert Docvert takes word processor files (typically .doc) and converts them to OpenDocument and clean HTML. The resulting OpenDocument is then optionally converted to HTML or any XML. This is done with XML Pipelines, an approach that supports XSLT, breaking up content over headings or sections, and saving those results to multiple files (e.g., chapter1.html, chapter2.html…). The result is returned in a .zip file.

Genshi Genshi is a Python library that provides an integrated set of components for parsing, generating, and processing HTML, XML or other textual content for output generation on the web. The main feature is a template language that is smart about markup: unlike conventional template languages that only deal with bytes and (if you're lucky) characters, Genshi knows the difference between tags, attributes, and actual text nodes, and uses that knowledge to your advantage.

Grutatext Grutatxt is a plain text to HTML converter. It converts subtle text markup for lists, bold, italics, tables, and headings into the corresponding HTML tags, so you do not have to write unreadable source text files.

HTML Code Convert HTML Code Convert helps speed up the conversion of HTML code into different formats, including JavaScript, JavaServer Pages, PHP, Perl, and the UNIX shell. It is particularly useful in CGI scripting.

HTML Merge HTML::Merge is an embedded HTML/Perl/SQL tool used to create dynamic Web content. It uses TAG-based embedded Perl and SQL integration in templates that are used to automatically generate Perl code, which is run in the deployment mode.

Html2fo Converts files from HTML to the xml:fo (XSL-FO) format. The HTML code can be written with StarOffice or other WYSIWYG editors and need not be 100% valid; you will get some sort of output even with badly formatted code. The program supports tables and internal and external links.

Html2pdf HTML_ToPDF takes the hassle out of generating a PDF file from a Web page. It will convert any HTML document into a format that will look the same on any platform and printer. It includes support for converting images, using the stylesheets to customize the look of the PDF file, and error handling.

Html2xhtml Html to Xhtml Convertor converts HTML pages into XHTML pages. It can process batches of files, convert line breaks, and deal with attribute minimization, quoting of attribute values, and more.

Hypermail Hypermail 2 is a much enhanced version of the popular tool that converts email messages into correctly formatted HTML pages. Version 2 has many new features, including MIME support. It is well suited to archiving mailing lists and similar collections.

LibreOffice LibreOffice is the power-packed personal productivity suite for GNU/Linux (as well as Windows & Macintosh) that gives you six feature-rich applications for all your document production and data processing needs: Writer, Calc, Impress, Draw, Math and Base.

There are also a good and growing number of free software extensions and templates available.

Markdown 'Markdown' is a text-to-HTML conversion tool that lets you write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML). Thus, 'markdown' is both a plain text formatting syntax and a software tool that converts the plain text formatting to HTML. The overriding design goal for Markdown's formatting syntax is to make it as readable as possible. Ideally, a Markdown-formatted document should be publishable as-is, as plain text, without looking like it has been marked up with tags or formatting instructions. The single biggest source of inspiration for Markdown's syntax is the format of plain text email.

Mll2html Reformats an ASCII file containing a description of mailing lists into a more convenient HTML file. For each ftp, http and news address, a hyperlink is generated pointing to that address. A hyperlink is also generated for each email address. The 'mll2html' program parses a template file and copies it to the output file, detecting predefined tags along the way and replacing each one with the corresponding part of the mailing lists. A template file can contain one or more of these tags, and a tag can be used more than once. This project has been decommissioned and is no longer developed.
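
The two mechanisms in this description, address linking and template tag replacement, can be sketched as follows. This is an illustration only and not the mll2html implementation: the @@NAME@@ tag syntax, the function names, and the sample text are all assumptions, since the entry does not say what the predefined tags look like.

```python
import re
from html import escape

# Matches ftp/http/news addresses or email addresses (a simplified pattern).
LINK_RE = re.compile(
    r"(?P<url>\b(?:ftp|http)://[^\s<>\"]+|\bnews:[^\s<>\"]+)"
    r"|(?P<mail>\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
)

# Hypothetical tag syntax: @@NAME@@ (the real tool's tags are not documented here).
TAG_RE = re.compile(r"@@(\w+)@@")


def linkify(text):
    """Escape text for HTML, then wrap each address in a hyperlink."""
    def repl(match):
        if match.group("url"):
            target = match.group("url")
        else:
            target = "mailto:" + match.group("mail")
        return '<a href="{}">{}</a>'.format(target, match.group(0))

    return LINK_RE.sub(repl, escape(text, quote=False))


def fill_template(template, parts):
    """Copy the template, replacing each known tag; unknown tags are kept."""
    return TAG_RE.sub(lambda m: parts.get(m.group(1), m.group(0)), template)


description = "Announcements: info-gnu@gnu.org, archive at http://lists.gnu.org/archive/"
template = "<h1>Mailing lists</h1>\n<p>@@LISTS@@</p>\n<p>Repeated: @@LISTS@@</p>\n"
page = fill_template(template, {"LISTS": linkify(description)})
```

Because the substitution runs over the whole template, a tag that appears several times is replaced at every occurrence, which matches the behaviour described above.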

Otl otl is intended to convert a text file to an HTML or XHTML file. It differs from many other text-to-HTML programs in that the input format (by default a simple, highly readable plain text format) can be customized by the user, and the output format (by default XHTML) can be user-defined. It can process complex structures such as ordered and unordered lists (nested or not), and add custom "headers" and "footers" to documents. The conversion uses Perl regex, adding quite a bit of flexibility and power to the conversion process. Since both the syntax of the source file and of the output can be readily customized, otl in theory can be used for many types of conversions. The package also includes tag-remove, a script for stripping HTML/XHTML-ish tags from documents.

Outl 'outl' is a glorified search and replace engine. It can be used to generate markup such as HTML or XML from a text file. Users specify which elements to look for using Perl regex, as well as how to deal with those elements when replacing them. 'outl' is designed to process outlines (e.g., lecture notes) and generate nice HTML, but it could be used in other contexts as well by defining a new set of rules for the substitution process. It can deal with relatively complex structures involving multiple lines and nested lists. It does not handle table construction, but it does not mind if the text file being processed already contains XML-like elements.
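
The rule-driven search and replace idea can be sketched in a few lines. This is a minimal illustration, not outl's actual rule format: it uses Python's re module as a stand-in for Perl regex, and the outline syntax ("* " for headings, indented "- " for items) and the RULES table are hypothetical.

```python
import re

# Each rule is (pattern, replacement); rules are applied in order.
RULES = [
    # "* Title" at the start of a line becomes a heading.
    (re.compile(r"^\* (.+)$", re.MULTILINE), r"<h1>\1</h1>"),
    # An indented "- item" becomes a list item.
    (re.compile(r"^[ \t]+- (.+)$", re.MULTILINE), r"<li>\1</li>"),
    # Wrap each run of consecutive <li> lines in a single <ul>.
    (
        re.compile(r"(?:<li>.*</li>\n?)+"),
        lambda m: "<ul>\n" + m.group(0).rstrip("\n") + "\n</ul>\n",
    ),
]


def apply_rules(text, rules=RULES):
    """Run every substitution rule over the text, in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


outline = "* Lecture 1\n  - Parsing\n  - Regex\n"
html_out = apply_rules(outline)
```

Swapping in a different rule table changes both the accepted input syntax and the generated markup, which is the kind of flexibility the description attributes to the tool.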

PHPMyEdit PHPMyEdit generates PHP code for displaying/editing MySQL tables in HTML. It includes sorting, filtering, table lookups, and more. It can be used with any data that fits in a single MySQL table.

Pf2x 'pf2x' is a PHP script that will take the output of your pflog and convert it into various different output formats. These output formats include plain text, XML, HTML, PDF, and MySQL INSERT statements for import into a MySQL database.

Ppm2html 'ppm2html' converts an RGB PPM image to an HTML page. The page displays an approximation of the image using only ASCII characters and color definitions. The tool makes full use of 24-bit color in HTML. If a 'source string' is defined, it is used repeatedly as the character source for rendering the ASCII output. If omitted, random characters between 'a' and 'z' are used. The 'red', 'green', and 'blue' values define the background color of the HTML page. If no command line arguments are given, the page is rendered black. Users get the best results by converting segmented objects (where the background of the image is replaced by a single color) and keeping the level of detail low. A small blur on the image results in a nicer visual appearance, especially when displaying text. Note that output varies among browsers; in particular, horizontal scaling might differ a lot. Using a separate rendering for each browser yields the best results.
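
The rendering step described above can be sketched as follows. This is an illustration under stated assumptions, not ppm2html's code: PPM parsing is left out and the image is given as rows of (r, g, b) tuples, the function name pixels_to_html is hypothetical, and the use of span elements with inline styles is a guess at suitable markup rather than the tool's real output.

```python
import random
from html import escape
from itertools import cycle


def pixels_to_html(rows, source=None, background=(0, 0, 0), rng=None):
    """Render rows of (r, g, b) pixels as colored characters in an HTML page."""
    rng = rng or random.Random()
    if source:
        # Reuse the source string repeatedly as the character supply.
        chars = cycle(source)
    else:
        # No source string: draw random characters between 'a' and 'z'.
        chars = iter(lambda: rng.choice("abcdefghijklmnopqrstuvwxyz"), None)

    lines = []
    for row in rows:
        cells = [
            '<span style="color:#{:02x}{:02x}{:02x}">{}</span>'.format(
                r, g, b, escape(next(chars))
            )
            for r, g, b in row
        ]
        lines.append("".join(cells))

    body = "\n".join(lines)
    # The background defaults to black, as when no colors are supplied.
    return (
        '<html><body style="background:#{:02x}{:02x}{:02x}">\n'
        "<pre>\n{}\n</pre>\n</body></html>\n"
    ).format(*background, body)


image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
page = pixels_to_html(image, source="GNU")
```

Each pixel becomes one character whose foreground color is the pixel's 24-bit RGB value, so the level of detail in the page is bounded by the number of characters per row.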

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page only apply to the text on this page. Any software described in this text has its own copyright notice and license, which can usually be found in the distribution itself.


Copyright © 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
