
SCHOOL OF DATA MINING

CENATAV, Havana, Cuba

Frans Coenen

8-18 January 2007


The first Cuba-UK (Liverpool) School of Data Mining took place at CENATAV in Havana, Cuba, on January 8-18, 2007, as a joint effort between the University of Liverpool and the Advanced Technologies Application Centre (CENATAV). The event was directed at all those interested in data mining techniques in general and multimedia data mining in particular, with the intention of establishing a dialogue to foster ongoing academic collaboration between data mining researchers in Cuba and the United Kingdom.

Location: CENATAV, 7a # 21812 e/ 218 y 222, Rpto. Siboney, Playa, Ciudad de la Habana, Cuba, Phone: (+537) 272-1670, Fax: (+537) 273-0045




SCHEDULE

Hours: 9:00-13:00 (Coffee break: 11:00 - 11:30).

| DAY | SEMINAR | SEMINAR |
|---|---|---|
| Monday 8 | Introduction: Aims and objectives, some assumptions about the target audience, data mining in the UK and the University of Liverpool, and overview of "what's to come". | Data: Data sources, UCI data repository, IBM QUEST data generator, LUCS-KDD ARM data generator (demonstration of software), normalisation and/or discretisation for (a) ARM, (b) CARM (demonstration of LUCS-KDD-DN software). |
| Tuesday 9 | Association Rule Mining 1: Association Rule Mining (ARM) --- What is it? Brute force algorithm (software demonstration), the Apriori algorithm, the T-tree and the Apriori-T algorithm (demonstration of Apriori-T). | Association Rule Mining 2: Lattices and the negative border approach, Dynamic Itemset Counting (DIC), software demonstration of negative border and DIC algorithms, organising the input data, FP-trees and the FP growth algorithm, P-trees and the TFP algorithm, demonstration of FP growth and TFP algorithms. |
| Wednesday 10 | Association Rule Mining --- The Wider Picture: Vertical v. horizontal data, Maximal Frequent Itemsets (Max-Miner), Frequent Closed Patterns (CloSet), understanding your association rules (ordering, clustering, visualisation), emerging (jumping) patterns, more ARM, and future directions. | Association Rule Mining For Very Large Data Sets: Overview of distributed and parallel ARM, partitioning and segmentation (especially vertical partitioning), experiments with the Distributed Apriori-T Algorithm (DATA), and mining VLDB using partitioning and segmentation. |
| Thursday 11 | Classification Association Rule Mining 1: The classification problem, Classification Association Rules (CARs), some popular Classification Association Rule Mining (CARM) algorithms, demonstration of a number of CARM algorithms (FOIL, PRM, CPAR, CMAR and CBA), evaluation strategies. | Classification Association Rule Mining 2: Fast and effective Classification Association Rule Mining --- the Total From Partial Classification (TFPC) approach, software demonstration. |
| Monday 12 | Multi-Agent Data Mining (MADM) 1: What is an agent? (features, categorisation, advantages and disadvantages), Multi-Agent System (MAS) technologies, a Multi-Agent Data Mining (MADM) vision, some research issues, thoughts on ARM in a MADM setting, MADM operation, some more thoughts (argumentation and the Semantic Web). | Multi-Agent Data Mining (MADM) 2 --- Incremental Data Mining: The challenge of incremental ARM (I-ARM), some I-ARM algorithms (FUP, AFPIM, EFPIM, NFUP, CATS trees, CAN trees), incremental TFP algorithm, more thoughts. |
| Tuesday 13 | Text Mining 1: Text mining techniques, text representation, and benchmark data sets. | Text Mining 2: Text mining research at Liverpool, (a) identification of significant words and phrases, (b) experiments using different phrase-based mining approaches, (c) software demonstration. |
| Wednesday 14 | Multi-Media Data Mining (MMDM): Current research directions at Liverpool, Multimedia Data Mining (MDM), image mining, the LUCS-KDD random image generator, image pre-processing for data mining (image representation) --- the tesseral and quad-tree representations. | Image Mining 1: Issues with image representations, (a) image pre-processing software demonstration (tesseral and quad-tree representations), (b) image mining demonstration (tesseral and quad-tree representations), (c) graph mining and time series analysis, (d) Dynamic Time Warping (software demonstration), (e) "Blobs", (f) concept graphs and (g) segmentation. |
| Thursday 15 | Image Mining 2: Research work at Liverpool, image primitives (what are they?), comparing primitives, similarity matrices and lattices, similarity weighting, comparator grid, software demonstration. | Summary and Conclusions: Summary of seminar series highlighting main ideas, review of possible future directions relating to individual topics discussed, main findings. |



Official Photo

Delegates at the School of Data Mining, 15 January 2007