Paste text or link here

Links must point to a .txt file






Pre-weighted words (optional; comma-separated):


How does the program work?

— The algorithm cleans up the input text so that it can be analyzed.


— Then, it finds the frequency of each word in the cleaned up text.


— Each word is assigned a score based on a simple TF-IDF analysis.


— Based on the scores of the words within them, a score is calculated for each sentence.


— So that longer sentences aren't favored and shorter sentences aren't punished, the sentence scores are then normalized by length.


— The sentences are sorted by their scores.


— Finally, depending on what type of output is asked for, the program spits out the results.
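
The steps above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: it uses only the Python standard library instead of NLTK, treats each sentence as a "document" when computing the IDF part (an assumption, since the page doesn't say what corpus is used), and the names STOP_WORDS, split_sentences, clean, and summarize are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "on", "that"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def clean(sentence):
    """Lowercase the sentence, keep word-like tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, top_n=1):
    """Return the top_n highest-scoring sentences, best first."""
    sentences = split_sentences(text)
    tokenized = [clean(s) for s in sentences]
    n = len(sentences)

    # Frequency of each word across the whole cleaned-up text.
    tf = Counter(w for words in tokenized for w in words)
    # Number of sentences each word appears in (assumed stand-in for a corpus).
    df = Counter(w for words in tokenized for w in set(words))

    def word_score(word):
        idf = math.log((1 + n) / (1 + df[word])) + 1  # smoothed IDF
        return tf[word] * idf

    scored = []
    for index, (sentence, words) in enumerate(zip(sentences, tokenized)):
        if not words:
            continue  # nothing left to score after cleaning
        # Dividing by the word count normalizes the score by sentence length.
        score = sum(word_score(w) for w in words) / len(words)
        scored.append((score, index, sentence))

    # Highest score first; ties keep their original order in the text.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored[:top_n]]
```

Calling summarize(text, top_n=3) would return the three highest-scoring sentences. Averaging the word scores is only one way to normalize by length; the real program may do it differently.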

What does this do?

The program tries to pick out the sentences of an input text that are most representative of the text as a whole. In other words, it tries to find the essence of a text.


Where can I get texts?

Project Gutenberg is an excellent resource for full books in the public domain.


Try copying any of these links into the input box above:


"I Have A Dream"

Pride and Prejudice

The Jungle Book

An Inquiry into the Nature and Causes of the Wealth of Nations

The Iliad

Robinson Crusoe

2011 State of the Union Address

Who made this?

Peter Downs

— peter.l.downs (at) gmail

@peterldowns


With what?

Python, web.py, NLTK, jQuery, 1140.css, sexybuttons, vim, and Adobe Photoshop.

Why?

I'm interested in computational linguistics. It's interesting to consider what exactly makes a sentence important, and whether it's even possible to find an objective measure of 'meaningfulness'.

