RTextTools: a machine learning library for text classification
  • Blog
  • About the Project
  • Install
  • How to Cite
  • Documentation

Preparing RTextTools Beta Release for Catania 2011

5/23/2011

0 Comments

 
Right now our development team is busy preparing a conference release of RTextTools for The 4th Annual Conference of the Comparative Policy Agendas Project at the University of Catania in Sicily. One of the key issues we've had thus far is memory consumption with very large datasets.

In the past week we've pushed out a slew of updates that allow the support vector machine and maximum entropy algorithms to run with low memory requirements, even on very large datasets. Unfortunately, not all the algorithms used in RTextTools support the changes we've made, so this leaves us with a two algorithm ensemble for low-memory classification. However, SVM and maxent tend to be the most accurate algorithms in our tests, meaning that a large ensemble isn't necessary to get high consensus accuracy.
0 Comments

Your comment will be posted after it is approved.


Leave a Reply.

    Developer Blog

    Updates on RTextTools progress, tips, and examples.

    ​Note: RTextTools is no longer actively maintained -- the software may contain bugs that will not be fixed with newer versions of R.

    By Author

    All
    Loren Collingwood
    Timothy P. Jurka

    By Date

    February 2012
    January 2012
    December 2011
    September 2011
    August 2011
    July 2011
    June 2011
    May 2011
    April 2011
    March 2011

    RSS Feed

Powered by Create your own unique website with customizable templates.