Lucene is an extremely rich and powerful full-text search library. You can use Lucene to provide full-text indexing across both database objects and documents in various formats (Microsoft Office documents, PDF, HTML, text, and so on). In this tutorial, we'll go through the basics of using Lucene to add full-text search functionality to a fairly typical J2EE application: an online accommodation database. The main business object is the Hotel class. In this tutorial, a Hotel has a unique identifier and, among other attributes, a description.

Roughly, supporting full-text search using Lucene requires two steps: (1) creating a Lucene index on the documents and/or database objects and (2) parsing the user query and looking up the prebuilt index to answer the query. In the first part of this tutorial, we learn how to create a Lucene index. In the second part, we learn how to use the prebuilt index to answer user queries.

For your convenience, all of the code for this article's Lucene demo is provided. In that code:

src/lucene/demo/search/Indexer.java is responsible for creating the index.
src/lucene/demo/search/SearchEngine.java is responsible for searching the index to answer user queries.
src/lucene/demo/Main.java has test code that builds a Lucene index using a small dataset (the actual data is provided by the Hotel class stored in src/lucene/demo/business/HotelDatabase.java) and performs a simple keyword query on the data using the index.

Briefly go over the two Java source files, Indexer.java and SearchEngine.java, to get yourself familiar with the overall structure of the code.

The first step in implementing full-text searching with Lucene is to build an index. Here is, in outline, how the Lucene classes fit together when you create one.

At the heart of Lucene is an Index. You put your data into the Index, then do searches on the Index to get results. Lucene stores Document objects in the Index, and it is your job to "convert" your data into Document objects and store them to the Index. That is, you read in each data file (or Web document, database tuple or whatever), instantiate a Document for it, break down the data into chunks and store the chunks in the Document as Field objects (a name/value pair). When you're done building a Document, you write it to the Index using the IndexWriter. Now let us get into details on how this is done.

To create an index, the first thing that you need to do is to create an IndexWriter object. The IndexWriter is used to create the index and to add new index entries to it. You can create an IndexWriter as follows:

    Directory indexDir = FSDirectory.open(new File("index-directory"));
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_4_10_2, new StandardAnalyzer());
    IndexWriter indexWriter = new IndexWriter(indexDir, config);

Note that IndexWriter takes two parameters, indexDir and config, which are Directory and IndexWriterConfig objects, respectively. The first parameter specifies the directory in which the index will be created, which is index-directory in this case. The second parameter specifies the "configuration" of our index, which consists of the version of our Lucene library (4.10.2) and the "document analyzer" to be used when Lucene indexes your data. Here, we are using the StandardAnalyzer for this purpose. More details on Lucene analyzers follow shortly.

1.2 Analyzer Class: Parsing the Documents

Most likely, the data that you want to index with Lucene is plain English text. The job of the Analyzer is to "parse" each field of your data into indexable "tokens" or keywords. Several types of analyzers are provided out of the box:

Analyzer: Description
StandardAnalyzer: A sophisticated general-purpose analyzer.
WhitespaceAnalyzer: A very simple analyzer that just separates tokens using white space.
StopAnalyzer: Removes common English words that are not usually useful for indexing.
SnowballAnalyzer: An interesting experimental analyzer that works on word roots (a search on rain should also return entries containing words such as raining or rained).

There are even a number of language-specific analyzers, including analyzers for German, Russian, French, Dutch, and others. It isn't difficult to implement your own analyzer, though the standard ones often do the job well enough.

When you create an IndexWriter, you have to specify which Analyzer you will use for the index, as we did before. In the earlier example, we used the StandardAnalyzer as the document analyzer.

Now you need to index your documents or business objects. To do so, you use the Lucene Document class, to which you add the fields that you want indexed. A Lucene Document is basically a container for a set of indexed fields. This is best illustrated by an example:

    Document doc = new Document();
    // The third constructor argument was lost in the source text.
    // Field.Store.YES is an assumption; use Field.Store.NO if the value need not be stored.
    doc.add(new StringField("id", "Hotel-1345", Field.Store.YES));
    doc.add(new TextField("description", "A beautiful hotel", Field.Store.YES));
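To make the roles of the analyzer and the index more concrete, here is a toy sketch in plain Java with no Lucene dependency. It is not Lucene code and does not reflect how Lucene is implemented. The class name ToyIndex, its stop-word list, and the second hotel entry are hypothetical, invented only for illustration. The sketch shows the two ideas this tutorial relies on: an "analyzer" step that turns field text into tokens, and an index that maps each token back to the documents containing it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only: NOT Lucene, and not how Lucene works internally.
class ToyIndex {
    // Hypothetical stop-word list chosen for this example.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "in", "of", "and"));

    // Maps each token to the ids of the documents that contain it.
    private final Map<String, Set<String>> postings = new TreeMap<>();

    // "Analyzer" step: lowercase, split on non-alphanumerics, drop stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // "Indexing" step: record the document id under every token of the field.
    void addDocument(String id, String description) {
        for (String token : analyze(description)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
        }
    }

    // Single-keyword lookup; the keyword is analyzed the same way as the data.
    Set<String> search(String keyword) {
        List<String> tokens = analyze(keyword);
        if (tokens.isEmpty()) {
            return new TreeSet<>();
        }
        return new TreeSet<>(postings.getOrDefault(tokens.get(0), new TreeSet<>()));
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("Hotel-1345", "A beautiful hotel");
        index.addDocument("Hotel-2001", "A quiet hotel in the hills"); // hypothetical entry
        System.out.println(analyze("A beautiful hotel")); // [beautiful, hotel]
        System.out.println(index.search("hotel"));        // [Hotel-1345, Hotel-2001]
        System.out.println(index.search("beautiful"));    // [Hotel-1345]
    }
}
```

The query keyword goes through the same analyze step as the indexed text, so a search for "Hotel" and a search for "hotel" hit the same entry. This is why the choice of analyzer matters both when building an index and when querying it.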