True semantics tagcloud
This site is the product of approximately 100 hours of work. I began from scratch: my first lines of PHP code ever were the code for creating a menu. I have used online resources only and managed to create a simple CMS so I can publish or edit posts. The design of this site is a bit of a mess. I'm still working on it, implementing the CSS approach.
Basically, after configuring the main publishing engine, a category system and a comment feature, I decided to publish the site and give it a go.
The fanciest feature to date is automatic keyword generation. The system uses a special class which picks words based on their frequency in a given article. The resulting array is then printed to the "keywords" meta tag of each article. The goal ahead is building a sort of tag cloud, not in the traditional way of manually attaching tags to articles, but fully automatically.
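As a rough illustration of the frequency-based keyword generation described above, here is a minimal sketch. It is written in Python rather than the site's PHP, and it is not the site's actual class: the function names, the tiny stop-word list, and the `limit` and `min_length` parameters are all assumptions made for the example.

```python
import re
from collections import Counter
from html import escape

# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "for", "that", "this", "with", "but", "not", "are", "was"}


def extract_keywords(text, limit=5, min_length=3):
    """Return the `limit` most frequent words in `text`, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOP_WORDS
    )
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


def keywords_meta_tag(keywords):
    """Render a list of keywords as a "keywords" meta tag."""
    content = escape(", ".join(keywords), quote=True)
    return f'<meta name="keywords" content="{content}">'


if __name__ == "__main__":
    article = "The cloud of tags: tags, tags and a cloud."
    print(keywords_meta_tag(extract_keywords(article, limit=2)))
```

Raw frequency alone tends to favour common filler words, which is why the sketch drops short words and a few stop words before counting.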
I believe that automating the tag-cloud feature is the right way to do it, so I plan to write a class which will do exactly that. It is my strong belief that folksonomy is wrong. It is good for attaching more tags to online content, so more content can be "indexed" by tags and grouped accordingly. But the problem is relevancy.
To get a good approximation of a truly semantic categorisation, grouping, segmentation or indexing, I intuitively put more trust in an automated, statistically supported system, which first extracts the most frequently used words in single articles and later joins different articles through the same, statistically relevant keywords.
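The second step, joining different articles through the keywords they share, could look roughly like the sketch below. It is in Python, not the site's PHP, and assumes each article already has a list of extracted keywords; the article ids, function names and the `min_articles` threshold are hypothetical.

```python
from collections import defaultdict


def build_tag_index(article_keywords):
    """Map each keyword to the set of article ids that contain it."""
    index = defaultdict(set)
    for article_id, keywords in article_keywords.items():
        for keyword in keywords:
            index[keyword].add(article_id)
    return dict(index)


def shared_tags(index, min_articles=2):
    """Keep only keywords that join at least `min_articles` articles."""
    return {
        keyword: sorted(ids)
        for keyword, ids in index.items()
        if len(ids) >= min_articles
    }


def related_articles(index, article_id):
    """Rank other articles by how many keywords they share with `article_id`."""
    related = defaultdict(int)
    for ids in index.values():
        if article_id in ids:
            for other in ids:
                if other != article_id:
                    related[other] += 1
    return sorted(related.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    keywords = {
        "a": ["tags", "cloud"],
        "b": ["cloud", "php"],
        "c": ["php", "cms"],
    }
    index = build_tag_index(keywords)
    print(shared_tags(index))
    print(related_articles(index, "b"))
```

In a tag cloud built this way, the number of articles behind each shared keyword could serve as the tag's weight (its font size, for example).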
I will indeed model the approach mathematically, as I would like to put it on a firmer basis. But first things first: let's go ahead and build the automated tag cloud. Later I shall fine-tune it, make it a measurable system and try to find a way to compare the results with a "folksonomy"-type tag cloud.
The major expected difference is this: a keyword (tag) is not assigned by associative (human) reasoning but is derived from statistically significant (relevant) factors. To say it directly: I assume that there is more information in how one uses words (which words, in what frequency or order) than in what one chooses to describe with them.
By manually adding tags to articles or pictures, one only chooses what one thinks is relevant enough to categorise the content. Automatic tagging, on the other hand, can provide the true meaning of the content, by providing information on how the content is structured and not on what it is.
Does that make sense?
September 8, 2008, 21:00 | admin said:
i promise i will start on this one soon, in fact ASAP :)
November 22, 2008, 12:41 | admin said:
funny footer word list stuff... is the making of the semantic cloud
December 3, 2008, 21:21 | andrej said:
before any funky new features: database restructure, cms remake,.. ;) much funky stuff
March 3, 2009, 23:09 | andrej said:
i really feel like a complete goof by now :) but before i can make a fancy semantic cloud feature, this site's cms needs to be rewritten in its entirety. Btw i'm a super SEO/SEM magician now.
May 31, 2009, 09:45 | Andrej said:
yeeeey. the restructuring is complete, ill go on with the semantic tagcloud now
September 27, 2009, 18:51 | Andrej said:
"I intuitively put more trust to a automated statistically supported system"
the statistics behind this is Bayes probability





