Normalizer

Introduction

This guide describes the steps required to use the Normalizer, a service for normalizing Slovenian text. The Normalizer converts digits, dates, abbreviations, acronyms, and similar tokens into their full spoken forms. For instance, the abbreviation “dr.” is normalized as “doktor”, since that is how it is usually read. Many tokens that need normalization can, however, be read in different ways, depending on the context and/or user preferences.

Please contact us to obtain credentials (username & password) for testing purposes.

Normalization

The text must be provided in the request body as a JSON object:

{
  "text": "Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta 1954, je, da je temperatura trojne točke vode enaka 0,01 °C."
}

The response contains the normalized text and a numeric status code describing the outcome of the normalization:

{
  "normalizedText": "Sodobna definicija Celzijeve temperaturne lestvice, ki velja od leta tisoč devetsto štiriinpetdeset, je, da je temperatura trojne točke vode enaka nič celih nič ena stopinje Celzija.",
  "status": 1
}
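The request and response shapes above can be sketched in Python as follows. This is a minimal sketch, not the service's official client: the endpoint URL is a placeholder (this guide does not state the real one), and HTTP Basic authentication is an assumption based only on the fact that credentials consist of a username and a password. The helper names `build_request`, `parse_response`, and `normalize` are hypothetical.

```python
import base64
import json
import urllib.request

# Hypothetical placeholder: the real endpoint URL is not given in this guide.
NORMALIZER_URL = "https://example.invalid/normalize"


def build_request(text, username, password, url=NORMALIZER_URL):
    """Build a POST request whose body is the JSON object {"text": ...}."""
    # ensure_ascii=False keeps Slovenian characters (č, š, ž) readable in the body.
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    # Assumption: HTTP Basic authentication; the actual scheme may differ.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": f"Basic {token}",
        },
    )


def parse_response(raw):
    """Return (normalizedText, status) from the raw response bytes."""
    payload = json.loads(raw.decode("utf-8"))
    return payload["normalizedText"], payload["status"]


def normalize(text, username, password, url=NORMALIZER_URL):
    """Send the text to the service and return (normalizedText, status)."""
    request = build_request(text, username, password, url)
    with urllib.request.urlopen(request) as response:
        return parse_response(response.read())
```

Separating `build_request` and `parse_response` from the network call makes it possible to check the JSON handling without valid credentials.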

Statuses

  • -2: Problem with normalization of at least one sentence
  • -1: Problem with normalization of at least one token
  • 0: Normalization not needed
  • 1: Normalization successful
  • 2: A type of token occurred that needs normalization but is not covered yet
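A client can map these codes to named constants so that the handling logic stays readable. The sketch below is illustrative only: the enum member names and the `is_fully_normalized` helper are hypothetical, and treating only statuses 0 and 1 as safe to use without review is an assumption, not something the service prescribes.

```python
from enum import IntEnum


class NormalizationStatus(IntEnum):
    """Status codes listed above; the member names are illustrative."""
    SENTENCE_PROBLEM = -2   # problem with at least one sentence
    TOKEN_PROBLEM = -1      # problem with at least one token
    NOT_NEEDED = 0          # input needed no normalization
    SUCCESS = 1             # normalization successful
    UNSUPPORTED_TOKEN = 2   # a token type that is not covered yet


def is_fully_normalized(status):
    """Return True for statuses 0 and 1.

    Assumption: any other status means the text may still contain
    tokens that were not converted and should be reviewed.
    """
    return NormalizationStatus(status) in (
        NormalizationStatus.NOT_NEEDED,
        NormalizationStatus.SUCCESS,
    )
```

Passing a code outside the documented set to `NormalizationStatus(...)` raises `ValueError`, which surfaces unexpected responses early instead of silently accepting them.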