SPA Conference - Session


SPA Conference session: Building Fast Search Application In a Day
One-line description:	A hands-on introduction to building a search application in a day

Session format:	Long tutorial (330 mins)

Abstract:	In this workshop you will build a search application, integrate a number of exciting features into it (e.g. auto-complete, faceting, "did you mean?") and see how you can optimise search to suit your needs. We will be using Solr/Lucene. Solr joined the Apache incubator in 2006 and has since grown rapidly in popularity and capability. Solr is a highly scalable, fast, open-source enterprise search platform from the Apache Lucene project. Participants will learn how to: set up and configure Solr; import data from a SQL DB; and build a search solution. We will look at some of the more advanced features of Solr and how these can be used to enrich the search experience for your users. We will also look at performance tuning of queries and how you can use the different forms of caching in Solr to improve search speed. The session will be run using Java in Eclipse, but participants are free to work in any language they like. If you wish to work in a language other than Java, it is best to choose one for which a client library already exists; libraries are available for Java, C#, Ruby, Perl, PHP, Python, Scala and JavaScript. We have used Solr with Java and C#; in other languages we will not be able to offer as much assistance.
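
As a rough illustration of the kind of request the search application will send, the sketch below builds a Solr select URL that asks for faceting and spelling suggestions alongside the results. The parameter names (q, wt, facet, facet.field, spellcheck) are standard Solr request parameters; the base URL, the "genre" field and the example query are assumptions made for illustration, and spellcheck=true only has an effect if a spellcheck component is configured on the request handler.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class SolrQueryUrl {
    // Assumption: Solr deployed under Tomcat on the local machine.
    // Adjust host, port and path to match your own installation.
    static final String BASE = "http://localhost:8080/solr/select";

    // Builds a select URL requesting JSON results, facet counts for one
    // field, and spelling suggestions for the user's query.
    static String build(String userQuery, String facetField) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", userQuery);
        params.put("wt", "json");
        params.put("facet", "true");
        params.put("facet.field", facetField); // hypothetical field name supplied by caller
        params.put("spellcheck", "true");

        StringJoiner queryString = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            queryString.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return BASE + "?" + queryString;
    }

    // URL-encodes a single key or value (Java 10+ overload).
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(build("pink floyd", "genre"));
    }
}
```

The resulting URL can be fetched with any HTTP client, which is why the workshop can be followed in languages without a dedicated Solr client library, at the cost of parsing the response by hand.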

Audience background:	Some experience developing web applications. If you don't want to run off a bootable flash drive, please bring a laptop with Eclipse, MySQL and Tomcat pre-installed.

Benefits of participating:
* Build a search solution from scratch
* Gain an awareness of search problems and how to solve them
* Understand the pros and cons of document DBs
* See how Solr fits into existing architectures
* Explore the extended features of Solr and Lucene

Materials provided:
- Dataset
- Bootable flash drive running Ubuntu with Tomcat, Solr and MySQL

Process:	Introduction:
- Document DBs vs RDBMS
- Solr search capabilities
- Solving search problems
Time for questions.
Then, working in pairs or individually:
- Check machines are set up
- Import RDBMS data into Solr
- Configure Solr
- Import data into Solr
- Write an API/app/web page to search the dataset
- Tune Solr search performance
- Implement search suggestions/spelling corrections
- Look at faceting, caching and features such as EdgeNGram filtering
Final half hour to tidy up search products, ask questions and get feedback.
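
The process above mentions EdgeNGram filtering, which is commonly used for auto-complete: each indexed term is expanded into its leading prefixes, so a partially typed word matches the full term. The plain-Java sketch below only illustrates the idea; it is not Solr's implementation, and the class and method names are made up for this example. In Solr the equivalent behaviour is configured on a field type in the schema rather than written by hand.

```java
import java.util.ArrayList;
import java.util.List;

class EdgeNGrams {
    // Returns the prefixes of `term` whose lengths run from minGram up to
    // maxGram (capped at the term's length), shortest first.
    static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, term.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [s, so, sol, solr]
        System.out.println(edgeNGrams("solr", 1, 10));
    }
}
```

With these prefixes indexed, a user who has typed "so" already matches the document containing "solr". The trade-off is a larger index, which is why the minimum and maximum gram sizes are worth tuning.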

Detailed timetable:
00:00 - 00:15 Set up machines & data
00:15 - 00:30 Introduction to Solr
00:30 - 00:45 Solr in infrastructure/architecture
00:45 - 01:00 Q&A
01:00 - 01:15 Import data into Solr
01:15 - 02:00 Build basic search product
02:00 - 02:10 Spelling correction explained
02:10 - 02:30 Add in spelling correction
02:30 - 02:40 Faceting explained
02:40 - 03:20 Add in faceting
03:20 - 03:30 Break
03:30 - 04:15 Add in search suggest/auto-complete
04:15 - 04:30 Tuning caches
04:30 - 05:00 Tuning Solr queries
05:00 - 05:30 Further Q&A and feedback

Outputs:	Search application with spelling corrections, faceting and auto-complete

History:

Presenters
1. James Atherton, 7digital
2. Greg Sochanik
