
Indexing with Jahia 6.1

by  almerysportail »  2013/08/16 14:18

Hello there,

I cannot figure out how to set the indexing policy in Jahia 6.1, even after setting several analyzer types in the CND and altering applicationcontext-compass.xml, applicationcontext-indexationpolicy.xml or jahiaresource.cpm.xml. The Jahia, Compass and Lucene documentation helps a bit in understanding the process, but I still feel like a sorcerer's apprentice, so experienced advice would be more than welcome.

Actually, I have two separate issues:

  • Numbers in content are not indexed, whereas numbers in files (PDF ...) are. I would like numbers to be indexed as part of the content too.
  • I sometimes get unwanted page IDs in the search results, even for pages with a key or vanity URL, and even after I set index="no" for the jahia.page_id property in jahiaresource.cpm.xml ...

Does anybody here know about these issues?



    Re: Indexing with Jahia 6.1

    by  pap@commaro.com »  2013/08/20 15:37

    Hello,

    I tested with Jahia 6.1.2.4 and the ACME demo. Searching for 1950 returns the About us page, which has 1950 in its content, so indexing and searching numbers works. You may need to upgrade to the latest hotfix and re-index. If you need more information about the index configuration and have a support contract, please use the support platform.

    Regards,
    Benjamin
     

    Benjamin Papez (pap@commaro.com)


    Re: Re: Indexing with Jahia 6.1

    by  almerysportail »  2013/08/23 14:06

    Hi, 

    Jahia 6.1.2.4 does indeed fix the number issue (and a couple of other issues ...).

    Thank you very much, Benjamin!

    I still get unwanted page IDs in the search results and am going to contact the support platform.



    Re: Re: Re: Indexing with Jahia 6.1

    by  almerysportail »  2013/10/22 13:38

    "I still get unwanted page IDs in the search results and am going to contact the support platform."

    Support investigated the problem. It is not easy to solve in the Jahia backend, because it originates in org.jahia.services.search.compass.LuceneResourceForHighLighting, where the summary string is built from the individual container fields in the Lucene document. For Page/Link fields, the ID, the title and the URL keys are stored in the Lucene document. The ID has to be stored so that queries based on page ID do not break. On the other hand, LuceneResourceForHighLighting does not know that a given field is of page type, so it cannot leave the ID out of the result.
     
    Therefore they suggest that we make a modification in the template displaying the search results. Instead of using
     
    ${hit.summary}
     
    we can use something like
     
    <%=((org.jahia.engines.search.Hit)pageContext.findAttribute("hit")).getSummary().replaceAll("(^|\\.)\\d*\\s{40}...", "")%>
     
    This removes that number field from the summary.
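
To see what that replacement does outside a JSP, here is a minimal stand-alone sketch. The regular expression is copied verbatim from the scriptlet above. Everything else is an assumption made for illustration: the class name SummaryCleaner, the method stripPageIds, and above all the sample summary, which guesses that a page ID appears as digits followed by 40 whitespace characters and a three-character separator. Real summaries produced by Jahia may be laid out differently, so check against your own search results.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jahia.
class SummaryCleaner {

    // Regex taken verbatim from the scriptlet: start of string or a literal dot,
    // optional digits, exactly 40 whitespace characters, then any three characters.
    private static final Pattern PAGE_ID_FIELD =
            Pattern.compile("(^|\\.)\\d*\\s{40}...");

    // Returns the summary with every match of the pattern removed.
    static String stripPageIds(String summary) {
        if (summary == null) {
            return "";
        }
        return PAGE_ID_FIELD.matcher(summary).replaceAll("");
    }

    public static void main(String[] args) {
        // Assumed layout of a summary that starts with a page ID field.
        String padding = String.format("%40s", ""); // 40 spaces
        String sample = "1234" + padding + "...About us page";
        System.out.println(stripPageIds(sample)); // prints "About us page"
    }
}
```

Compiling the pattern once and guarding against a null summary are additions of this sketch; the scriptlet calls replaceAll directly on the result of getSummary().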
