Week 11 moved into recommender systems, perhaps one of the most popular and commonly used forms of AI. Sites such as Google and Amazon built their success on the effectiveness of their recommender systems (now I guess their brands can carry them for a while). The first topic of the lecture was association mining: given a large dataset, how do we find useful associations between attributes?
Support and confidence were proposed as useful metrics to drive this process. Unfortunately we found some conflicting definitions among the Data Mining, Weka and R&N texts. When in doubt, check Wikipedia:
The support supp(X) of an itemset X is defined as the proportion of transactions in the data set which contain the itemset.
supp(X) = P(X)
The confidence of a rule is defined as:
conf(X -> Y) = P(Y | X) = P(X and Y) / P(X)
The lift of a rule is defined as:
lift(X -> Y) = P(X and Y) / (P(X)P(Y))
The leverage of a rule is defined as:
leverage(X -> Y) = P(X and Y) - P(X)P(Y)
The source listed in wikipedia for these definitions is: http://michael.hahsler.net/research/association_rules/measures.html
In our lecture notes we had the support of a rule A -> B as supp(A ∪ B), which confused me, as I still think of support as the intersection of A and B. I think the resolution is that the union is over the itemsets (a transaction must contain all the items of both A and B), and that event over the transactions is exactly the intersection, P(A and B).
The measures described above are quite intuitive when worked through in an example. Lift feels like an extension of confidence that takes independence into account: a lift of 1 implies independence, and a lift above 1 implies (positive) dependence. Likewise, a leverage of zero implies independence between attributes, and vice versa.
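To make the measures concrete, here is a minimal Python sketch over a made-up transaction list (the items and counts are entirely hypothetical). Note how the support of a rule is computed by taking the union of the two itemsets, which as an event over the transactions is the intersection, the very point that tripped me up above.

```python
# Toy transaction list (all items and counts made up) to illustrate the measures.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
n = len(transactions)

def supp(itemset):
    # supp(X): proportion of transactions containing every item in X
    return sum(1 for t in transactions if itemset <= t) / n

def conf(x, y):
    # conf(X -> Y) = P(Y | X) = P(X and Y) / P(X)
    return supp(x | y) / supp(x)

def lift(x, y):
    # lift(X -> Y) = P(X and Y) / (P(X)P(Y)); a value of 1 means independence
    return supp(x | y) / (supp(x) * supp(y))

def leverage(x, y):
    # leverage(X -> Y) = P(X and Y) - P(X)P(Y); a value of 0 means independence
    return supp(x | y) - supp(x) * supp(y)

x, y = {"bread"}, {"milk"}
print(conf(x, y), lift(x, y), leverage(x, y))
```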
This topic was closed with the conclusion that relying on these measures alone is in fact bad practice, as variance and standard deviation are completely ignored.
A quick review of collaborative and content-based filtering came next. Content-Based Filtering [CBF] (haha) can be implemented using an array of machine learning techniques already covered: Naive Bayes, neural networks and decision trees are all classification methods that can be applied to CBF. The pre-processing involved with CBF seems to be the most limiting factor. Term frequency and inverse document frequency (TF-IDF) can be compiled into tables allowing for effective searching. Considering the vast size of the datasets these systems would be applied to, this can seem a bit daunting.
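As a rough sketch of how that could look in practice, here is CBF with scikit-learn's TF-IDF vectoriser feeding a Naive Bayes classifier. The item descriptions and the single user's like/dislike labels are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical item descriptions and one user's like/dislike history.
docs = [
    "space opera with epic battles",
    "romantic comedy set in paris",
    "gritty space thriller on a mining ship",
    "feel good romance and comedy",
]
liked = [1, 0, 1, 0]  # 1 = user liked the item, 0 = disliked

# TF-IDF turns each description into a weighted term-frequency vector.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)

# Naive Bayes then treats "will this user like it?" as a classification problem.
model = MultinomialNB().fit(X, liked)

# Predict the user's preference for an unseen item.
new_doc = ["space battles and a thrilling plot"]
print(model.predict(vectorizer.transform(new_doc)))
```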
Collaborative filtering [CF] seems a bit easier to implement, but it does then rely on user participation. The introduction in the lecture felt very similar to the basics of Self Organising Maps: vectors are created to represent instances (in this case users), Euclidean distance (or some variant of it) is used to measure instance 'likeness', and missing values in an instance's vector can then be predicted from the instances considered 'like' it. There was quite a bit of mathematical methodology described on the lecture slides which would be required when implementing a CF system.
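A minimal sketch of memory-based CF along those lines, assuming a small made-up user-item rating matrix where 0 marks a missing rating, and using an inverse-Euclidean 'likeness' score over co-rated items:

```python
import numpy as np

# Hypothetical user-item rating matrix; 0 marks a missing rating.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def similarity(u, v):
    # Inverse Euclidean distance over items both users rated, as a 'likeness' score.
    mask = (u > 0) & (v > 0)
    if not mask.any():
        return 0.0
    return 1.0 / (1.0 + np.linalg.norm(u[mask] - v[mask]))

def predict(user_idx, item_idx):
    # Predict a missing rating as a similarity-weighted average of other users' ratings.
    num = den = 0.0
    for other in range(len(ratings)):
        r = ratings[other, item_idx]
        if other == user_idx or r == 0:
            continue
        w = similarity(ratings[user_idx], ratings[other])
        num += w * r
        den += w
    return num / den if den else 0.0

print(predict(0, 2))  # estimate user 0's missing rating for item 2
```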