2012 Search Computing Course

Course Description

Topics

This course deals with the new technologies and applications that characterize the Web, seen as a large information system; it can be regarded as one of the many possible continuations of Database 2. The core of the course is information retrieval, a subject which is not covered in the basic data management courses, and Web information retrieval, the key technology of search engines. This core will cover the classical aspects: text processing, index structures, classic data retrieval methods, retrieval evaluation, search engine technology (crawling and indexing), the PageRank and HITS methods, and models of advertising.

This course evolves every year; the central topic of the 2012 edition is Human Computation, i.e. the involvement of humans in computational processes that machines typically perform unsuccessfully or slowly. The course will provide a thorough introduction to this emerging field, covering human computation, games with a purpose, and crowd-searching.

Given that we expect projects to design and partially prototype Web applications, the first three lectures of the course will be dedicated to a recapitulation of technologies for building Web-based applications, on both the server side and the client side.

Format

The format of the course is atypical and experimental. Students will be asked to participate in small projects (for 3 credits), carried out in teams of 2-3 members. Although students will be free to select their own project, we encourage some of the projects to focus on Expo 2015, a forthcoming event in Milano, and to use human computation, games with a purpose and crowd-searching; the requirements and expectations/visions for Expo 2015 will be presented by members of the Expo 2015 Team, who will also provide coaching. Students will be asked to define their project rather early, and the professor and tutors will then monitor them through follow-up and feedback, using the format which has proven successful at Alta Scuola Politecnica. The expectation is that a few of the ideas prototyped within projects could be continued beyond the end of the course.

In addition, students will be asked to read papers from an extensive reading list, and then to present (mostly orally, in some cases in written form) their personal interpretation of the reading; presentations or papers will contribute to the evaluation for the remaining 2 credits. The reading list will also be provided at the beginning of class.

A maximum of 30 students will be evaluated with this method, selected on a first-come, first-served (FIFO) basis. Should some students be excluded by this constraint, they will be graded through a conventional written exam followed by an oral discussion covering all the lessons of the course and some of the students' presentations. By using extra time on Monday (the class ends at 6 but we will extend some lectures until 7), the course will be completed by December. By giving short presentations at three points during the course, students will be driven to complete their project work in due time, minimizing the risk of failure; this worked last year with 28 students.

Schedule

Lesson | Date | Lecturer | Material
1. Course Intro / Human Computation and CrowdSearching | 01/10/2012 | Ceri | Slides (~7Mb); Draft Book Chapter (~2.5Mb)
2. Human Computation: Market and Human Factors | 02/10/2012 | Brambilla | Slides (~3.5Mb)
3. Applications of Human Computation and Games With a Purpose | 08/10/2012 | Bozzon / Della Valle | Slides (~30Mb); Urbanopoly Presentation; Urban Match Presentation
4. Applications of Human Computation and Games With a Purpose | 09/10/2012 | Bozzon |
5. Requirements and Visions for Expo | 15/10/2012 | Expo2015 | Slides (~6Mb)
6. Client-side scripting: JavaScript, Rich Internet Applications, HTML5 | 16/10/2012 | Bozzon | Slides (~3Mb); Overview of HTML5 APIs
7. Crowdsearch Framework | 22/10/2012 | Bozzon | Slides (Google Doc); CrowdSearcher Web Site; Contacts and Bug Report
8. Project Brief: 5-minute description of each group | 23/10/2012 | Students / Ceri / Bozzon |
9. JavaScript | 29/10/2012 | Bozzon | Slides (~1.3Mb)
10. Foundations of Information Retrieval | 30/10/2012 | Ceri | Slides (~2.8Mb); Draft Book Chapters (~1.5Mb)
11. Project Plan: 10-minute description of each group | 05/11/2012 | Students / Ceri / Bozzon |
12. Foundations of Information Retrieval | 06/11/2012 | Ceri |
13. Web Information Retrieval | 12/11/2012 | Ceri | Slides (~2.6Mb)
14. Web Information Retrieval | 13/11/2012 | Ceri |
15. Data publishing | 26/11/2012 | Della Valle | Slides (~9Mb)
16. Project Review: 10-minute description of each group - Motivation | 27/11/2012 | Ceri / Bozzon / Students |
17. Semantic Search | 03/12/2012 | Della Valle | Slides (~11.1Mb)
18. Data streams and reasoning with orderings | 04/12/2012 | Della Valle |
19. Project Midterm Review - possibly requiring extra time | 10/12/2012 | Students / Ceri / Bozzon |
20. Data integration for search | 11/12/2012 | Ceri | Slides (~13Mb)
21. Economic drivers of search | 17/12/2012 | Brambilla | Slides (~5Mb)
22. Multimedia Information Retrieval | 18/12/2012 | Bozzon | Slides (~5Mb); Draft Book Chapter
23. Students' lectures 1 | 07/01/2013 | |
24. Students' lectures 2 | 08/01/2013 | |
25. Students' lectures 3 | 13/01/2013 | |

Where and When

Classroom: D2.3

  • Monday, 16.15 - 18.15
  • Tuesday, 12.15 - 14.15

Readings

  • "Will MOOCs Destroy Academia?", Moshe Y. Vardi. (Link)
  • "Experiments in Social Computation", Michael Kearns. (Link)
  • Less than 4% of students in an MIT online course passed the final. Why investors in education are throwing their money away (Link) (Original Article)