Boilerpipe


From Free Software Directory

Revision as of 11:59, 16 April 2018 by Bendikker


http://code.google.com/p/boilerpipe
Boilerplate removal and fulltext extraction from HTML pages

The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page.

The library already provides specific strategies for common tasks (for example: news article extraction) and may also be easily extended for individual problem settings.

Extraction is fast (on the order of milliseconds), needs only the input document (no global or site-level information), and is usually quite accurate.
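To make the idea of boilerplate removal concrete, here is a minimal sketch in Python. It is not boilerpipe's algorithm or API: it is a toy heuristic that assumes main content sits in long text blocks with few linked words, and every name in it (`BlockCollector`, `extract_main_text`, `min_words`, `max_link_density`) is hypothetical. Like the approach described above, it looks only at the single input document.

```python
from html.parser import HTMLParser

# Tags treated as block boundaries (an assumption made for this sketch).
BLOCK_TAGS = {"p", "div", "li", "td", "tr", "table", "ul", "ol", "h1", "h2",
              "h3", "body", "article", "section", "nav", "header", "footer"}
# Tags whose text is never page content.
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Split a page into text blocks and count linked vs. total words."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # finished blocks: (text, word_count, linked_words)
        self._words = []      # words of the block currently being built
        self._linked = 0      # how many of those words sit inside <a>...</a>
        self._in_link = 0     # nesting depth of <a> tags
        self._skip = 0        # nesting depth of <script>/<style> tags

    def _flush(self):
        # Close the current block, if it has any text, and start a new one.
        if self._words:
            self.blocks.append(
                (" ".join(self._words), len(self._words), self._linked))
        self._words, self._linked = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag == "a":
            self._in_link += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag == "a":
            self._in_link = max(0, self._in_link - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._skip:
            return
        words = data.split()
        self._words.extend(words)
        if self._in_link:
            self._linked += len(words)

    def close(self):
        super().close()
        self._flush()


def extract_main_text(html, min_words=10, max_link_density=0.3):
    """Keep blocks that are long enough and not dominated by link text."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    kept = [text for text, words, linked in parser.blocks
            if words >= min_words and linked / words <= max_link_density]
    return "\n".join(kept)
```

Calling `extract_main_text(page_html)` on a page with a link-only navigation bar, one long paragraph, and a short footer would return only the paragraph. The thresholds are arbitrary illustrative values; a real extractor would tune or learn them.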

Download

http://ftp.debian.org/debian/pool/main/b/boilerpipe/boilerpipe_1.2.0.orig.tar.gz

Licensing

License	Verified by	Verified on	Notes
Other	Debian: Emmanuel Bourg <ebourg@apache.org>	20 June 2013	License: apache-2.0

Leaders and contributors

Contact(s)	Role
Christian Kohlschütter	contact

Resources and communication

Audience	Resource type	URI
Python (Ref)		https://pypi.org/project/boilerpipe
	Download	http://code.google.com/p/boilerpipe/
Ruby (Ref)		https://rubygems.org/gems/boilerpipe
Debian (Ref)		https://tracker.debian.org/pkg/boilerpipe

Software prerequisites

Date	2015-07-17
Source	Debian
Source link	http://packages.debian.org/sid/boilerpipe

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the page “GNU Free Documentation License”.

The copyright and license notices on this page only apply to the text on this page. Any software or copyright-licenses or other similar notices described in this text has its own copyright notice and license, which can usually be found in the distribution or license text itself.

Retrieved from "https://directory.fsf.org/wiki?title=Boilerpipe&oldid=67907"
