When I finished the blog features on this site, the next big thing on my list was developing a way to index and search the contents. I didn't want to let Google do it for me, because that's like cheating. I had heard good things about Lucene (actually great things, I know a guy who uses it for everything from searching for products in a web store to indexing biological information), so I started looking into that.
I was happy when I discovered Hibernate Search, a library that simplifies mapping a Hibernate domain model to Lucene's searching capabilities. I was even happier when I discovered that Hibernate Search has excellent support for JPA, which is the API I used for this site. In the rest of this post I'll outline the steps I had to take to add Hibernate Search to this site. It ended up being much easier and more intuitive than I had expected...
-
I downloaded the Hibernate Search JARs and documentation from http://search.hibernate.org, and added them to my project. The documentation PDF is an excellent source of information on using the Hibernate Search library with either JPA or the default Hibernate interfaces.
-
I added the following Search properties to the Hibernate configuration. Since I'm using JPA, I added these properties to the persistence-unit element in persistence.xml (if you're using classic Hibernate, these would get added to hibernate.properties or hibernate.cfg.xml).
<property name="hibernate.search.default.directory_provider" value="org.hibernate.search.store.FSDirectoryProvider" />
<property name="hibernate.search.indexing_strategy" value="manual" />
<property name="hibernate.search.default.indexBase" value="/var/searchindex" />
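The value of "hibernate.search.default.indexBase" is the filesystem location where the generated indexes will be stored. Setting "hibernate.search.indexing_strategy" to "manual" turns off automatic indexing when entities change, so the indexes have to be rebuilt explicitly (I'll get to that in a later step).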
-
I added Hibernate Search annotations to my JPA entities so that the Search library can properly index and search them. Here's an example of my Song class, before and after adding the annotations.
Before:
@Entity
@Table(name="song")
public class Song implements Serializable {

    private static final long serialVersionUID = 1L; // yeah, lazy, I know

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "id")
    private Long id;

    @Column(name = "name", length = 100, nullable = false)
    private String name;

    @Column(name = "lyrics", length=5000, nullable = true)
    private String lyrics;

    // other attributes and methods
}
After (new annotations added at the top):
@Indexed
@Entity
@Table(name="song")
public class Song implements Serializable {

    private static final long serialVersionUID = 1L; // yeah, lazy, I know

    @DocumentId // the equivalent of @Id for the Hibernate Search indexes
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "id")
    private Long id;

    @Field(index = Index.TOKENIZED, store = Store.NO)
    @Boost(value = 2.0f) // boosts the importance of the song name when searching
    @Column(name = "name", length = 100, nullable = false)
    private String name;

    @Field(index = Index.TOKENIZED, store = Store.NO)
    @Column(name = "lyrics", length=5000, nullable = true)
    private String lyrics;

    // other attributes and methods
}
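To make the effect of a tokenized field and a boost a bit more concrete, here is a toy sketch in plain Java. It is not how Lucene analyzes or scores text (the real thing is far more sophisticated); it only assumes that a tokenized field is split into lowercase words and that a boost multiplies that field's contribution to the score. The class and method names (ToyFieldScoring, tokenize, score) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy model of tokenized, boosted fields. Illustration only, not Lucene. */
final class ToyFieldScoring {

    private ToyFieldScoring() {}

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        if (text == null) {
            return tokens;
        }
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /**
     * Scores one document for a single search term: for each field, count how
     * often the term occurs among the field's tokens and multiply that count
     * by the field's boost (1.0 when no boost is given for the field).
     */
    static float score(String term, Map<String, String> fields, Map<String, Float> boosts) {
        String needle = term.toLowerCase(Locale.ROOT);
        float total = 0f;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            int hits = 0;
            for (String token : tokenize(field.getValue())) {
                if (token.equals(needle)) {
                    hits++;
                }
            }
            total += hits * boosts.getOrDefault(field.getKey(), 1.0f);
        }
        return total;
    }
}
```

With a boost of 2.0 on "name", a song called "Blue Moon" gets a higher toy score for the term "blue" than a song that only mentions "blue" once in its lyrics, which is the kind of ranking the @Boost annotation on the song name is meant to encourage.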
-
I added an "updateIndexes" method to generate and optimize the indexes in every DAO that represents a searchable object. Calling this method makes any new content searchable. Since Hibernate Search has JPA interfaces, the indexing work is done using a FullTextEntityManager which just builds off of a normal EntityManager.
@Transactional
public class SongDAO extends JpaDaoSupport implements SongService {

    @SuppressWarnings("unchecked")
    public void updateIndexes() {
        final List<Song> songs = getJpaTemplate().find("select s from Song s");
        getJpaTemplate().execute(new JpaCallback() {
            @Override
            public Object doInJpa(EntityManager em) throws PersistenceException {
                FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
                for (Song song : songs) {
                    fullTextEntityManager.index(song);
                }
                fullTextEntityManager.getSearchFactory().optimize(Song.class);
                return null;
            }
        });
    }

    // other methods
}
Whenever I need to update the indexes, the UpdateIndexesController simply calls this method:
this.songService.updateIndexes();
and every song is indexed.
-
Finally, to actually perform the search, I added a "search" method to the DAOs to return a list of objects that meet the search criteria. What's fantastic is that the objects returned are attached JPA objects just like you would get from a normal EntityManager query.
@Transactional
public class SongDAO extends JpaDaoSupport implements SongService {

    @SuppressWarnings("unchecked")
    public List<Song> search(final String searchString) {
        final String[] fields = new String[]{"name", "lyrics"}; // search on these fields
        return (List<Song>) getJpaTemplate().execute(new JpaCallback() {
            @Override
            public Object doInJpa(EntityManager em) throws PersistenceException {
                List<Song> results = new ArrayList<Song>();
                try {
                    FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
                    MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, new StandardAnalyzer());
                    // overrides the default OR_OPERATOR, so that all words in the search are required
                    parser.setDefaultOperator(QueryParser.AND_OPERATOR);
                    org.apache.lucene.search.Query query = parser.parse(searchString);
                    Query fullTextQuery = fullTextEntityManager.createFullTextQuery(query, Song.class);
                    results = fullTextQuery.getResultList();
                } catch (Exception e) {
                    throw new PersistenceException(e);
                }
                return results;
            }
        });
    }
}
All the search Controller has to do is pass the search string to the search method, and it gets a list of results back:
List<Song> songs = this.songService.search(performSearchCommand.getSearchString());
This list contains all the Songs that match the criteria, ordered with the best matches first.
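One detail worth spelling out is what AND_OPERATOR means when searching more than one field: each word in the search string has to show up in at least one of the fields, but the words don't all have to be in the same field. The following is a rough sketch of that matching rule in plain Java, with no Lucene involved and no scoring or analysis beyond lowercasing and splitting on non-word characters. The names (ToyMultiFieldMatch, matchesAll, matchesAny) are invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model of multi-field AND/OR matching. Illustration only, not Lucene. */
final class ToyMultiFieldMatch {

    private ToyMultiFieldMatch() {}

    /** Collects the lowercase words of all given field values into one set. */
    static Set<String> terms(List<String> fieldValues) {
        Set<String> terms = new HashSet<>();
        for (String value : fieldValues) {
            for (String word : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!word.isEmpty()) {
                    terms.add(word);
                }
            }
        }
        return terms;
    }

    /** AND semantics: every search word must occur in at least one field. */
    static boolean matchesAll(String search, List<String> fieldValues) {
        return terms(fieldValues).containsAll(terms(Arrays.asList(search)));
    }

    /** OR semantics: a single search word occurring in any field is enough. */
    static boolean matchesAny(String search, List<String> fieldValues) {
        Set<String> wanted = terms(Arrays.asList(search));
        wanted.retainAll(terms(fieldValues));
        return !wanted.isEmpty();
    }
}
```

For a song named "Blue Moon" with the lyrics "shining over the river", the search "blue river" matches under both rules (one word comes from the name, the other from the lyrics), while "blue ocean" only matches under the OR rule.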
So there you have it — the major steps for using Hibernate Search in an application that uses JPA.