This version of the API is deprecated.

See text.relatedTexts for the current version of this method.

match.getSimilarArticles

Searches your corpus for articles that are semantically similar to your source article.

Request Parameters

Corpus ID: int (required)

The unique id for the corpus where your source article exists.

Article ID: int (required)

The id of your source article, that is, the article you want to find similar articles for.

wait: int

Maximum time, in seconds, to wait for a result. The maximum is 120. If you pass 0, the call only starts the process and disconnects; you can check the result later.

Max Number of Results: int (default: 10)

The maximum number of similar articles to return in the result set. The default is 10 and the maximum is 50.

Min Threshold: decimal (default: 0.0)

Lower bound on similarity. For example, pass 0.5 to return only articles that are at least 50% similar to your source article.

Max Threshold: decimal (default: 1.0)

Upper bound on similarity. For example, pass 0.9 to return only articles that are at most 90% similar to your source article.
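
The request parameters above can be assembled into a payload as in the following minimal Python sketch. The helper name build_similar_articles_request is hypothetical, the positional order of params is inferred from the JSON request example on this page, and the default of 0 for wait and the lower bound of 1 for the result count are assumptions made for the sketch, not documented behavior.

```python
import json


def build_similar_articles_request(corpus_id, article_id, wait=0, max_results=10,
                                   min_threshold=0.0, max_threshold=1.0,
                                   request_id=0):
    """Build a match.getSimilarArticles payload (hypothetical helper)."""
    # Documented limit: wait is at most 120 seconds; 0 starts the process only.
    if not 0 <= wait <= 120:
        raise ValueError("wait must be between 0 and 120 seconds")
    # Documented limit: at most 50 results. The lower bound of 1 is an assumption.
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")
    # Similarity values lie between 0 and 1, so the thresholds must too.
    if not 0.0 <= min_threshold <= max_threshold <= 1.0:
        raise ValueError("thresholds must satisfy 0.0 <= min <= max <= 1.0")
    return {
        "method": "match.getSimilarArticles",
        # Positional order assumed from the JSON request example.
        "params": [corpus_id, article_id, wait, max_results,
                   min_threshold, max_threshold],
        "id": request_id,
    }


payload = build_similar_articles_request(45, 1, wait=15, max_results=25,
                                         min_threshold=0.80, max_threshold=0.99)
print(json.dumps(payload))
```

The sketch only builds and validates the payload; how it is sent (endpoint, authentication) is not covered on this page and is left out.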

Response Parameters

matchId: int

An id for the match result that is unique in combination with corpusId and articleId.

resultCorpusId: int

The id of the corpus in which the similar articles were found.

resultArticleId: int

Article id for the matching article in the result corpus.

resultValue: decimal

How similar the result article is to the source article. The value is between 0 and 1, where 1 is an exact match; the higher the value, the more similar the articles.

Code Examples

JSON


//Request
{"method":"match.getSimilarArticles","params":[45, 1, 15, 25, 0.80, 0.99],"id":0}


//Response
{"id":0,"result":[{"resultCorpusId":45,"resultArticleId":1130,"matchId":1,"resultValue":0.80919},{"resultCorpusId":45,"resultArticleId":453,"matchId":2,"resultValue":0.80353}]}
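
As an illustration of reading the response, the following Python sketch parses the example response shown above and orders the matches by resultValue, most similar first. The SimilarArticle class and parse_similar_articles function are hypothetical names introduced for this sketch; only the JSON field names come from this page.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SimilarArticle:
    match_id: int
    corpus_id: int
    article_id: int
    score: float  # resultValue: between 0 and 1, where 1 is an exact match


def parse_similar_articles(raw):
    """Parse a response body and return matches sorted by descending score."""
    body = json.loads(raw)
    matches = [
        SimilarArticle(
            match_id=item["matchId"],
            corpus_id=item["resultCorpusId"],
            article_id=item["resultArticleId"],
            score=item["resultValue"],
        )
        # Treat a missing or null result as "no matches" (an assumption).
        for item in body.get("result") or []
    ]
    return sorted(matches, key=lambda m: m.score, reverse=True)


raw = ('{"id":0,"result":['
       '{"resultCorpusId":45,"resultArticleId":1130,"matchId":1,"resultValue":0.80919},'
       '{"resultCorpusId":45,"resultArticleId":453,"matchId":2,"resultValue":0.80353}]}')

matches = parse_similar_articles(raw)
for match in matches:
    print(match.corpus_id, match.article_id, match.score)
```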


Notes

  • After getting the results, check your local database for the result articles (if you have stored the ids and articles locally).
  • After getting the results, run corpus.getArticle to fetch a result article.
  • The first time you run match.getSimilarArticles, all articles in the corpus are indexed, and therefore it might take some time before you get results back.
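
The first two notes can be combined into a lookup that checks local storage before fetching remotely, as in the Python sketch below. The names resolve_articles, local_store, and fetch_remote are hypothetical. Because the parameters of corpus.getArticle are not described on this page, the remote fetch is passed in as a callable rather than implemented; here a stub stands in for it.

```python
def resolve_articles(results, local_store, fetch_remote):
    """Return {(corpus_id, article_id): article} for every result entry.

    local_store: dict keyed by (corpus_id, article_id), standing in for
                 a local database (an assumption of this sketch).
    fetch_remote: callable(corpus_id, article_id) that would wrap a
                  corpus.getArticle call; its signature is hypothetical.
    """
    articles = {}
    for item in results:
        key = (item["resultCorpusId"], item["resultArticleId"])
        if key not in local_store:
            # Not stored locally: fetch once and keep it for next time.
            local_store[key] = fetch_remote(*key)
        articles[key] = local_store[key]
    return articles


# Example with a stub in place of the real remote call.
fetched = []

def stub_fetch(corpus_id, article_id):
    fetched.append((corpus_id, article_id))
    return f"remote article {article_id}"

results = [
    {"resultCorpusId": 45, "resultArticleId": 1130, "matchId": 1, "resultValue": 0.80919},
    {"resultCorpusId": 45, "resultArticleId": 453, "matchId": 2, "resultValue": 0.80353},
]
store = {(45, 1130): "local article 1130"}
resolved = resolve_articles(results, store, stub_fetch)
```

In this example only article 453 triggers the stub, since article 1130 is already in the local store.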