W3Cschool
恭喜您成為首批注冊(cè)用戶
獲得88經(jīng)驗(yàn)值獎(jiǎng)勵(lì)
您可以將 Apache 非結(jié)構(gòu)化信息管理體系結(jié)構(gòu)(UIMA)與 Solr 進(jìn)行集成。通過 UIMA,您可以定義分析引擎(Analysis Engines)的自定義管道,以增量方式將元數(shù)據(jù)作為注釋添加到文檔中。
SolrUIMA UpdateRequestProcessor 是一個(gè)自定義的更新請(qǐng)求處理器,它將文檔編入索引,將它們發(fā)送到 UIMA 管道,然后返回富集了指定元數(shù)據(jù)的文檔。要為 Solr配置 UIMA,請(qǐng)按照下列步驟操作:
<lib dir="../../contrib/uima/lib" />
<lib dir="../../dist/" regex="solr-uima-\d.*\.jar" />
<field name="language" type="string" indexed="true" stored="true" required="false"/>
<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false" />
<updateRequestProcessorChain name="uima">
<processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
<lst name="uimaConfig">
<lst name="runtimeParameters">
<str name="keyword_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="concept_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="lang_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="cat_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="entities_apikey">VALID_ALCHEMYAPI_KEY</str>
<str name="oc_licenseID">VALID_OPENCALAIS_KEY</str>
</lst>
<str name="analysisEngine">/org/apache/uima/desc/OverridingParamsExtServicesAE.xml</str>
<!-- Set to true if you want to continue indexing even if text processing fails.
Default is false. That is, Solr throws RuntimeException and
never indexed documents entirely in your session. -->
<bool name="ignoreErrors">true</bool>
<!-- This is optional. It is used for logging when text processing fails.
If logField is not specified, uniqueKey will be used as logField.
<str name="logField">id</str>
-->
<lst name="analyzeFields">
<bool name="merge">false</bool>
<arr name="fields">
<str>text</str>
</arr>
</lst>
<lst name="fieldMappings">
<lst name="type">
<str name="name">org.apache.uima.alchemy.ts.concept.ConceptFS</str>
<lst name="mapping">
<str name="feature">text</str>
<str name="field">concept</str>
</lst>
</lst>
<lst name="type">
<str name="name">org.apache.uima.alchemy.ts.language.LanguageFS</str>
<lst name="mapping">
<str name="feature">language</str>
<str name="field">language</str>
</lst>
</lst>
<lst name="type">
<str name="name">org.apache.uima.SentenceAnnotation</str>
<lst name="mapping">
<str name="feature">coveredText</str>
<str name="field">sentence</str>
</lst>
</lst>
</lst>
</lst>
</processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
您需要注意以下幾點(diǎn):<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">uima</str>
</lst>
</requestHandler>
一旦您完成了配置,您的文檔將在索引時(shí)自動(dòng)豐富指定的字段。
有關(guān) Solr UIMA 集成的更多信息,請(qǐng)參閱:https://wiki.apache.org/solr/SolrUIMA。
Copyright©2021 w3cschool編程獅|閩ICP備15016281號(hào)-3|閩公網(wǎng)安備35020302033924號(hào)
違法和不良信息舉報(bào)電話:173-0602-2364|舉報(bào)郵箱:jubao@eeedong.com
掃描二維碼
下載編程獅App
編程獅公眾號(hào)
聯(lián)系方式:
更多建議: