Improving Modulus Search Algorithm

[GCI-27] ixWDE4eR

  1. The first customization I'd like to propose is essentially Richard Szczerba's, but rather than turning off description indexing completely, I've just given name more weight.
  2. I tried sorting solely by date updated, which, as expected, did not work very well. Having it sort results with an equal score by some other parameter seems like a good idea, but I was unable to figure out how to do it.
  3. Sorting by relevance works fine, but I believe (1) is better.
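
The tie-breaking idea in (2) can be sketched outside of Modulus. This is a minimal illustration in plain Python, not the Searchable plugin's configuration: the hit tuples, scores and dates are made up, and `rank` is a hypothetical helper.

```python
from datetime import date

# Hypothetical search hits: (module name, relevance score, date updated).
hits = [
    ("Alpha", 1.5, date(2013, 5, 1)),
    ("Beta", 2.0, date(2012, 1, 10)),
    ("Gamma", 1.5, date(2014, 2, 20)),
]


def rank(hits):
    """Order by score (highest first); break ties by most recent update."""
    # Tuples compare element by element, so the date is only consulted
    # when two hits have exactly the same score.
    return sorted(hits, key=lambda h: (h[1], h[2]), reverse=True)


ranked_names = [name for name, _, _ in rank(hits)]
print(ranked_names)
```

Here Alpha and Gamma share a score, so the more recently updated Gamma is listed before Alpha, while Beta stays first on score alone.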

 

[GCI-27] cshah

As seen elsewhere on this page, Modulus initially orders results alphabetically. To customize this, I first prioritized description over name. This made the search more comprehensive: rather than showing only a few results, it returned many more relevant ones. The wildcard '?' also helped when searching for a group of related items. The second customization was removing both name and desc from the array. This gave both fields equal priority, and the results were listed alphabetically. Finally, I tested sorting by date updated and by date created; these results were less useful than those from the name/description settings. The search with description prioritized yielded the most results in an organized order. The screenshots are shown below.

[GCI-27] Parker Erway

Initially, Modulus seems to order results alphabetically:

First, I tried sorting by relevance:

Second, I tried placing the boost on the description (leaving the relevance sort):

Third, I tried ordering by date updated:

 

In my opinion, the first option gives the best result. 

[GCI-27] SquidDev

Editing the searchable property

Grails models have a searchable property, which allows you to change which fields are searchable. The Module class currently has 'name' and 'description' set as searchable, with 'name' having a boost of 2.0. Changing this to a negative number, or setting a larger boost for description, means a module is ranked higher if its description contains a string than if its name contains the same string.
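
To make the effect of a boost concrete, here is a toy model in Python. It is not Lucene's scoring formula and not the Searchable plugin's API: `score` and `rank` are hypothetical helpers, and the module descriptions are invented for the example.

```python
# Toy model of field boosts; NOT Lucene's real scoring formula.
modules = [
    # Hypothetical data: the term "report" appears only in the name here...
    {"name": "Reporting", "description": "Define and run data exports."},
    # ...and only in the description here.
    {"name": "XForms", "description": "Form entry with reporting hooks."},
]


def score(module, term, boosts):
    """Add up the boost of every field whose text contains the term."""
    term = term.lower()
    return sum(
        boost
        for field, boost in boosts.items()
        if term in module[field].lower()
    )


def rank(modules, term, boosts):
    """Return module names ordered by descending toy score."""
    ordered = sorted(modules, key=lambda m: score(m, term, boosts), reverse=True)
    return [m["name"] for m in ordered]


name_boosted = rank(modules, "report", {"name": 2.0, "description": 1.0})
description_boosted = rank(modules, "report", {"name": 1.0, "description": 2.0})
print(name_boosted, description_boosted)
```

With the larger boost on name, the module matched by name comes first; swapping the boosts swaps the order.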

Declaring dateCreated format: "yyyyMMdd" in the searchable property allows you to filter by date range (for example, dateCreated:[201401* TO 201406*] would return all releases between January and June 2014).
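
The reason a "yyyyMMdd" format lends itself to range filtering is that the strings are fixed-width with the most significant part first, so string order and chronological order agree. The sketch below assumes the range query compares indexed values as plain strings; `date_key` and `in_range` are hypothetical helpers, and the bounds are written as full eight-digit dates rather than in the wildcard form used above.

```python
from datetime import date


def date_key(d):
    """Format a date the way a "yyyyMMdd" pattern would."""
    return d.strftime("%Y%m%d")


def in_range(d, lower, upper):
    """Inclusive range check using plain string comparison."""
    return lower <= date_key(d) <= upper


# Hypothetical release dates.
march_release = date(2014, 3, 15)
july_release = date(2014, 7, 1)

print(in_range(march_release, "20140101", "20140630"))
print(in_range(july_release, "20140101", "20140630"))
```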

Searching

Some Lucene query features do not work as expected because Modulus rewrites some search strings to include wildcards. Forcing complex: true in Modulus-UI enables you to use these features.

Fuzzy searching: Lucene's fuzzy searching allows you to match strings similar to the specified one. forms~ matches forms and form.
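
Fuzzy matching of this kind is based on edit distance. As a rough illustration only (Lucene's own implementation and its similarity threshold are not reproduced here), a plain Levenshtein distance shows why form counts as close to forms; `edit_distance` is a hypothetical helper.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits turning a into b."""
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(edit_distance("forms", "form"))
print(edit_distance("forms", "report"))
```

The first pair differs by a single deletion, whereas the second needs several edits, so only the first would be a plausible fuzzy match.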

Specifying fields and wildcards: name:s*tion shows any module whose name starts with s and ends with tion (serialization, synchronization).

 

[GCI-27] Richard Szczerba

Initial: When searching for "report", we receive the Reporting module as the last result:

1. First, I sorted by the number of downloads each module had:

2. The Reporting module still appeared at the end because the default sorting order is "asc" (smallest -> largest), so I changed it to "desc" (largest -> smallest):

The module bubbled right up to the top. However, when using wildcards such as ?, XForms was number one:

3. This was because we were searching the description as well. By disabling search on the description, we got:
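
As a rough illustration of steps 1 to 3, the sketch below sorts matches by download count in either direction and then restricts matching to the name field. It does not use Modulus or Lucene; `search` is a hypothetical helper, and the download counts and descriptions are invented.

```python
# Hypothetical module records; download counts are made up.
modules = [
    {"name": "Reporting", "description": "Define and run data exports.", "downloads": 9000},
    {"name": "XForms", "description": "Form entry with reporting hooks.", "downloads": 5000},
    {"name": "Reporting Compatibility", "description": "Legacy support.", "downloads": 300},
]


def search(modules, term, fields, order="asc"):
    """Match term in any of the given fields, then sort by downloads."""
    term = term.lower()
    hits = [m for m in modules if any(term in m[f].lower() for f in fields)]
    hits.sort(key=lambda m: m["downloads"], reverse=(order == "desc"))
    return [m["name"] for m in hits]


ascending = search(modules, "report", ["name", "description"])
descending = search(modules, "report", ["name", "description"], order="desc")
name_only = search(modules, "report", ["name"], order="desc")
print(ascending, descending, name_only)
```

In ascending order the most downloaded module lands last, in descending order it comes first, and searching names only drops the module that matched through its description.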