DBpedia Spotlight for KAF/NAF

If you are interested in extracting entities and linking them to DBpedia entries, take a look at the dbpedia_ner module (https://github.com/rubenIzquierdo/dbpedia_ner). It takes a KAF or NAF file containing just tokens and terms, calls the DBpedia Spotlight online web service, and automatically extracts the entities together with their DBpedia links.

Given this fragment of a NAF file:

    ...

    <wf id="w678" sent="29" para="1" offset="3745" length="4">lung</wf>
    <wf id="w679" sent="29" para="1" offset="3750" length="6">cancer</wf>

    ...

    <!--lung-->
    <term id="t678" type="open" lemma="lung" pos="N" morphofeat="NN">
      <span> <target id="w678"/> </span>
    </term>

    <!--cancer-->
    <term id="t679" type="open" lemma="cancer" pos="N" morphofeat="NN">
      <span> <target id="w679"/> </span>
    </term>
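
The `offset` and `length` attributes on each `wf` token are enough to rebuild the stretch of raw text that the tokens cover. The sketch below is only an illustration of that relationship, not a description of what dbpedia_ner does internally: the `<text>` wrapper element, the `TOKENS_SAMPLE` string and the `rebuild_text` helper are assumptions introduced for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper around the two tokens from the fragment above.
TOKENS_SAMPLE = """<text>
  <wf id="w678" sent="29" para="1" offset="3745" length="4">lung</wf>
  <wf id="w679" sent="29" para="1" offset="3750" length="6">cancer</wf>
</text>"""


def rebuild_text(text_layer_xml):
    """Return (text, start_offset) for the span covered by the <wf> tokens.

    Gaps between consecutive tokens are filled with spaces, so character
    positions in the result line up with the original offsets.
    """
    root = ET.fromstring(text_layer_xml)
    tokens = sorted(root.iter("wf"), key=lambda wf: int(wf.get("offset")))
    if not tokens:
        return "", 0

    start = int(tokens[0].get("offset"))
    cursor = start
    pieces = []
    for wf in tokens:
        offset = int(wf.get("offset"))
        # Pad with spaces up to the token's offset (no padding if adjacent).
        pieces.append(" " * (offset - cursor))
        pieces.append(wf.text or "")
        cursor = offset + int(wf.get("length"))
    return "".join(pieces), start


if __name__ == "__main__":
    text, start = rebuild_text(TOKENS_SAMPLE)
    print(repr(text), start)
```

Keeping the start offset alongside the rebuilt string makes it possible to map a character position reported for the rebuilt text back to a token in the NAF file.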

This module would generate this entity (among others):

    <entity id="e62" type="DBpedia:Disease">
      <!--lung cancer-->
      <references>
        <span>
          <target id="t678"/>
          <target id="t679"/>
        </span>
      </references>
      <externalReferences>
        <externalRef resource="spotlight_cltl" reference="http://dbpedia.org/resource/Lung_cancer" confidence="0.9994825438215322"/>
      </externalReferences>
    </entity>
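
Once the entity layer is in the file, the DBpedia links can be read back with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The surrounding `<NAF>`, `<terms>` and `<entities>` wrappers in `NAF_SAMPLE` and the `extract_entity_links` helper are assumptions made for the example; only the `term`, `entity`, `references` and `externalRef` elements come from the fragments above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed NAF document assembled from the fragments above.
NAF_SAMPLE = """<NAF>
  <terms>
    <term id="t678" type="open" lemma="lung" pos="N" morphofeat="NN">
      <span><target id="w678"/></span>
    </term>
    <term id="t679" type="open" lemma="cancer" pos="N" morphofeat="NN">
      <span><target id="w679"/></span>
    </term>
  </terms>
  <entities>
    <entity id="e62" type="DBpedia:Disease">
      <references>
        <span><target id="t678"/><target id="t679"/></span>
      </references>
      <externalReferences>
        <externalRef resource="spotlight_cltl" reference="http://dbpedia.org/resource/Lung_cancer" confidence="0.9994825438215322"/>
      </externalReferences>
    </entity>
  </entities>
</NAF>"""


def extract_entity_links(naf_xml):
    """Return one dict per (entity, external reference) pair."""
    root = ET.fromstring(naf_xml)
    # Map term ids to lemmas so entity spans can be shown as readable text.
    lemma_by_term = {t.get("id"): t.get("lemma") for t in root.iter("term")}

    links = []
    for entity in root.iter("entity"):
        term_ids = [t.get("id")
                    for t in entity.findall("./references/span/target")]
        mention = " ".join(lemma_by_term.get(tid, tid) for tid in term_ids)
        for ref in entity.findall("./externalReferences/externalRef"):
            links.append({
                "entity_id": entity.get("id"),
                "type": entity.get("type"),
                "mention": mention,
                "reference": ref.get("reference"),
                "confidence": float(ref.get("confidence")),
            })
    return links


if __name__ == "__main__":
    for link in extract_entity_links(NAF_SAMPLE):
        print(link["mention"], link["type"], link["reference"])
```

The mention here is built from term lemmas, which is convenient for a quick check. To get the surface form instead, follow each term's `span` down to its `wf` tokens.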
