


Text Mining Project Blog
Summary

Status report 2006.12.28.

I told my teacher that in the long run I want to implement all 4 scenarios, but that what I do in the short term depends on what he expects. He told me to implement 2 of my choice.

For the first one, I chose spam classification, because it seemed trivial... Well... Let's see what happened.

Status report 2006.12.30.

I gave up on the SVMlight spam/not-spam classification for now. (Maybe the bag-of-words vector doesn't correlate with usefulness? Or I did something wrong? Or the training data is too noisy?) I thought it would be easier. Of course, I made a few attempts before giving up:

But seeing that the parser works, I went on to clustering with Text Garden. It has two clustering programs: one does hierarchical binary clustering, the other flat clustering. Both can produce XML output, which made me very happy, and both work and are fast. So I wrote XSLT stylesheets to make use of that output.
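
To make the difference between the two kinds of clustering concrete, here is a minimal Python sketch. It is not how Text Garden works internally; it only illustrates the idea on bag-of-words vectors. The function names (bag_of_words, flat_clusters, binary_tree), the farthest-first seeding, and the sample documents are all assumptions made up for this illustration.

```python
from collections import Counter
import math

def bag_of_words(text):
    """Turn a text into a sparse word-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def flat_clusters(vectors, k, rounds=10):
    """Flat clustering: a small k-means variant using cosine similarity.

    Seeds are picked farthest-first (an assumption of this sketch) so the
    result is deterministic. Returns one cluster label per vector.
    """
    seeds = [0]
    while len(seeds) < k:
        # Pick the vector that is least similar to the seeds chosen so far.
        seeds.append(min(range(len(vectors)),
                         key=lambda i: max(cosine(vectors[i], vectors[s]) for s in seeds)))
    centers = [Counter(vectors[s]) for s in seeds]
    labels = []
    for _ in range(rounds):
        labels = [max(range(k), key=lambda c: cosine(v, centers[c])) for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                # Summed counts are enough: cosine ignores vector length.
                total = Counter()
                for m in members:
                    total.update(m)
                centers[c] = total
    return labels

def binary_tree(vectors, indices=None, min_size=1):
    """Hierarchical binary clustering: split in two, then recurse.

    Returns nested tuples whose leaves are lists of document indices.
    """
    if indices is None:
        indices = list(range(len(vectors)))
    if len(indices) <= min_size:
        return indices
    labels = flat_clusters([vectors[i] for i in indices], 2)
    left = [i for i, l in zip(indices, labels) if l == 0]
    right = [i for i, l in zip(indices, labels) if l == 1]
    if not left or not right:
        return indices  # could not split any further
    return (binary_tree(vectors, left, min_size),
            binary_tree(vectors, right, min_size))

docs = [
    "cheap pills buy now",
    "buy cheap pills online",
    "meeting notes for the project",
    "project meeting schedule notes",
]
vectors = [bag_of_words(d) for d in docs]
print(flat_clusters(vectors, 2))
print(binary_tree(vectors, min_size=2))
```

The flat version returns one label per document, while the hierarchical version returns a tree, which is the kind of nested structure that maps naturally onto XML.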

Today was a very long, tiring, and not too useful day... At least I refreshed my Eclipse and XSLT knowledge.
