Implementing meaningful phrase finding with Spark

Phrase finding is an interesting problem: given a large body of text, what are the most “interesting” phrases in it? We have many gigabytes of text and want to extract meaningful phrases from it. There is also no labeled data to learn from, so the approach has to be unsupervised.
