some data exploration tools

scroll down for list

I started coming across these tools when searching for new ways to explore data sets: to quickly review and visualize descriptive statistics and the interrelationships between many variables. It was a huge added bonus that features were built into the tools to help with the entire data exploration process, from descriptive statistics to visualizations to predictive analytics and modeling.

I first posted a list of 4 open source tools I had come across and downloaded (and played with to varying degrees) after someone at a Semantic Web NYC Meetup asked what tools were available. I also started a forum discussion on Data-mi.ning.com, which generated some great feedback from a diverse group of professionals. You can follow the conversation here.

These free tools are amazing, and very easy to misuse. I imagine a lot of people are going to get themselves in trouble using these tools without a good understanding of the nuances: the strengths and weaknesses of the underlying machine learning algorithms and statistical methods. Yet the power of these tools, for both professional and educational purposes, should not be ignored.

I should also state that I have not put these tools through any kind of QA. If you see any errors or have any additions to suggest, please let me know.

KNIME | Konstanz Information Miner

Based on the Eclipse platform; open source, with a GUI. From the Chair of Bioinformatics and Information Mining at the University of Konstanz, Germany: http://www.inf.uni-konstanz.de/bioml/

Includes many plugins: Weka library, R scripting, Chemistry CDK and many more.

KNIME just started selling service contracts, which I don’t think any of the others here do…

www.knime.org

Weka

This was the first tool I came across and downloaded. There is also a book on data mining, now in its second edition, written by the creators of the tool and, naturally, using it for the examples.
An HTML version of Graham Williams's book on data mining, The Desktop Survival Guide, is available here. You can support their efforts by purchasing the book in PDF format here.

————

And I think for their efforts I will list an interesting development from Microsoft. Microsoft Excel 2007 is obviously not a free tool, but it is Excel. They have developed a free add-in, Table Analysis Tools, which brings some MS SQL Server data mining services to your desktop by leveraging cloud technology. I believe this is still not officially shipping Microsoft technology. There is an interesting story about how it came about on the creator's blog here. To download, click here.