Text is the new data
Structured data is so 2006.
Or at least it should be. I look at 2005 as the year when the online database revolution first started to pick up steam. It hadn't quite reached critical mass, but the case that data is valuable in its own right was finally finding some ears.
Now here we are, a few years later, and newspapers have embraced the database. You can argue that they haven't made the best use of data, or that data ghettos and salacious Caspio databases aren't the way to go, but at least reporters and editors ...
-
Text mining: more than just a neat new toy
Making the case for applying emerging technologies. Read | 1 comment
-
SEASR, text mining and journalism
Starting a conversation on text mining in journalism. Read | 88 comments
-
Applying Benford's Law to CAR
In which I explain the advantages of using a popular fraud-detection tool in reporting. Read | 13 comments
Links
- From the Height of This Place Google's Jonathan Rosenberg opines about the future of the Internet. (1 comment)
- Automate Metadata Extraction for Corporate Search and Mashups These guys say what I've been trying to say, but smarter. (1 comment)
- Rules of database app aging Great points. The question is: How do we build around these issues? (5 comments)
- If You Like This, You're Sure to Love That A great story that explains some of the logic that goes into applied machine learning. (3 comments)
- The Future of Data Analysis A case for the Bayesian paradigm of data analysis, which is useful reading if you're interested in AI. (0 comments)
- The Commoditization of Massive Data Analysis Two great articles on the future of data analysis today. This one deals with scalability issues and tools like Hadoop. (0 comments)
- OCR documents for free using Google Google now automatically indexes the text of scanned PDFs. (0 comments)
