Using AI costs more for Hindi speakers than for English speakers, data reveals

Key points generated by AI, verified by newsroom

  • Talking to AI in Hindi costs more than in English.
  • Hindi prompts require more tokens than English prompts.
  • As a result, Hindi users have to spend more.
  • AI models are trained mostly on English data.

AI Cost: If you talk to AI in Hindi, it costs you more than it would in English. Although companies such as OpenAI, Anthropic and Google talk about equal access to their AI models, using AI in languages other than English, including Hindi and Arabic, is more expensive. Recent data shows that the same request costs more in these languages than it does in English.

What is the reason for this?

The reason lies in how AI models process text. Put simply, a Hindi prompt uses more tokens than the same prompt in English. A token is the unit an AI system uses to read and process text, and API usage is generally billed by the token. So saying something in English takes fewer tokens, while saying the same thing in Hindi takes more. Researchers and developers are calling this effect a ‘language tax’. It is also being seen as a hidden cost of processing different languages.

What is the difference in cost of using Hindi and English?

Several weeks ago, OpenAI researcher Aran Komatsuzaki ran an experiment comparing how OpenAI’s and Anthropic’s tokenizers handle text in different languages. The results showed that Hindi text needed 1.37 times as many tokens as English on OpenAI’s tokenizer and 3.24 times as many on Anthropic’s Claude. Similarly, Arabic needed 2.86 times as many tokens and Chinese 1.71 times as many. In other words, where an English-speaking user spends one token to convey a given amount of information, a Hindi user spends roughly 1.4 to 3.2 times as many. The same holds for other languages.
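To make the arithmetic concrete, here is a minimal Python sketch of what those multipliers mean for a bill. The two Hindi multipliers are the figures reported above. The function name, the prompt size and the price of 1.00 per million tokens are hypothetical values chosen for illustration, not any provider's actual pricing.

```python
# Token multipliers relative to English, as reported in the article.
REPORTED_MULTIPLIERS = {
    "Hindi on OpenAI": 1.37,
    "Hindi on Anthropic": 3.24,
}


def estimated_cost(english_tokens, multiplier, price_per_million_tokens):
    """Estimate the cost of a prompt that would need `english_tokens` in English.

    `price_per_million_tokens` is a hypothetical price, not a real tariff.
    """
    tokens_needed = english_tokens * multiplier
    return tokens_needed * price_per_million_tokens / 1_000_000


if __name__ == "__main__":
    english_tokens = 1_000_000  # hypothetical workload
    price = 1.00                # hypothetical price per million tokens
    baseline = estimated_cost(english_tokens, 1.0, price)
    print(f"English baseline: {baseline:.2f}")
    for label, multiplier in REPORTED_MULTIPLIERS.items():
        cost = estimated_cost(english_tokens, multiplier, price)
        print(f"{label}: {cost:.2f} ({cost / baseline:.2f}x the baseline)")
```

Because the cost scales linearly with the token count, the multiplier on tokens is also the multiplier on the bill, whatever the actual price is.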

…but why is this happening?

Before an AI model can process a prompt, it converts the text into small units called tokens. A component called the tokenizer does this. Because most models are trained largely on English data, their tokenizers represent English efficiently. Other languages, including Hindi and Arabic, use different scripts and structures, so their text gets broken into smaller pieces, which takes more tokens. Experts say that to avoid this, companies should train their models on more languages.
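The effect can be illustrated with a toy tokenizer in Python. This is a deliberately exaggerated sketch, not how OpenAI's or Anthropic's tokenizers actually work: it assumes a made-up vocabulary that contains only English pieces, matches the longest known piece at each position, and falls back to one token per UTF-8 byte for anything it does not know. The function name and the vocabulary are hypothetical.

```python
def toy_tokenize(text, vocab):
    """Greedy longest-match tokenizer with a UTF-8 byte fallback (toy example)."""
    tokens = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        # Try the longest vocabulary piece that fits at position i.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                tokens.append(piece)
                i += length
                break
        else:
            # Unknown character: emit one token per UTF-8 byte.
            tokens.extend(bytes([b]) for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical vocabulary learned only from English text.
ENGLISH_ONLY_VOCAB = {"hello", " world"}

if __name__ == "__main__":
    english = "hello world"
    hindi = "नमस्ते दुनिया"  # "hello world" in Hindi
    print(len(toy_tokenize(english, ENGLISH_ONLY_VOCAB)), "tokens for English")
    print(len(toy_tokenize(hindi, ENGLISH_ONLY_VOCAB)), "tokens for Hindi")
```

In this sketch the English phrase matches two whole vocabulary entries, while every Devanagari character falls through to the byte fallback and costs three tokens, because each one takes three bytes in UTF-8. Real tokenizers do include pieces for other scripts, which is why the measured gap is far smaller than in this toy, but the direction of the effect is the same.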
