Using AI costs more for Hindi speakers, new data reveals

Key points generated by AI, verified by newsroom

Talking to AI in Hindi costs more than talking to it in English.
Hindi prompts require more tokens than English prompts.
As a result, Hindi users have to spend more.
AI models are trained mostly on English data.

AI Cost: If you talk to AI in Hindi, it costs you more than it would in English. Although companies such as OpenAI, Anthropic and Google talk about equal access to their AI models, using AI in languages other than English, including Hindi and Arabic, is more expensive. Recent data indicates that using AI in any language other than English comes at a higher price.

What is the reason for this?

The reason lies in how AI models process text. Put simply, a Hindi prompt consumes more tokens than an English one. A token is the unit an AI system uses to read and understand text, and usage is typically billed by the token. Saying something in English takes fewer tokens, while saying the same thing in Hindi takes more. Researchers and developers call this effect a ‘language tax’. It is also seen as a hidden cost of processing different languages.

What is the difference in cost of using Hindi and English?

Several weeks ago, AI researcher Aran Komatsuzaki ran an experiment comparing how the tokenizers of OpenAI and Anthropic handle text in different languages. The results showed that Hindi text required 1.37 times as many tokens as English on OpenAI, and 3.24 times as many on Anthropic’s Claude. Similarly, Arabic required 2.86 times as many tokens and Chinese 1.71 times as many. In other words, for every token an English-speaking user spends, a Hindi user spends roughly 1.4 to 3.2 tokens to convey the same amount of information. The same happens with other languages.
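To make the arithmetic concrete, here is a minimal Python sketch that applies the two Hindi ratios reported above to an example prompt. The prompt length (1,000 English tokens) and the price (1 US dollar per million tokens, the same for both providers) are hypothetical values chosen only for illustration. They are not from the experiment and are not real price lists.

```python
# Hindi-to-English token ratios as reported in the article.
HINDI_TOKEN_RATIO = {"OpenAI": 1.37, "Anthropic": 3.24}

# Hypothetical values, for illustration only (not from the article).
ENGLISH_TOKENS = 1_000
PRICE_PER_MILLION_TOKENS_USD = 1.00


def hindi_cost(provider: str) -> tuple[int, float]:
    """Return (estimated Hindi tokens, estimated cost in USD) for the same text."""
    tokens = round(ENGLISH_TOKENS * HINDI_TOKEN_RATIO[provider])
    cost = tokens * PRICE_PER_MILLION_TOKENS_USD / 1_000_000
    return tokens, cost


for provider in HINDI_TOKEN_RATIO:
    tokens, cost = hindi_cost(provider)
    print(f"{provider}: {tokens} tokens, ${cost:.6f}")
```

Under these assumptions, the text that takes 1,000 tokens in English takes about 1,370 tokens in Hindi at the first ratio and about 3,240 at the second, and the bill grows in the same proportion.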

…but why is this happening?

Before an AI model can understand a prompt, it converts the text into small units called tokens. This is done by a component called a tokenizer. Since most models are trained mainly on English data, their tokenizers represent English efficiently. Other languages, including Hindi and Arabic, use different scripts and structures, so their text gets broken into more pieces, which means more tokens. Experts say that to avoid this, companies should train their models on data in more languages.
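One contributing factor can be seen without any AI model at all. The Python sketch below assumes a tokenizer that starts from raw UTF-8 bytes before merging them into larger tokens. That is an assumption for illustration, not a description of any specific company's tokenizer. It compares a short English greeting with a short Hindi one, and real token counts will differ because tokenizers merge frequent byte sequences.

```python
english = "Hello"
# The Hindi word "नमस्ते" (namaste), written as escapes so the
# code points are unambiguous.
hindi = "\u0928\u092e\u0938\u094d\u0924\u0947"


def utf8_units(text: str) -> int:
    """Number of UTF-8 bytes, the starting units for a byte-level tokenizer."""
    return len(text.encode("utf-8"))


for label, text in (("English", english), ("Hindi", hindi)):
    print(f"{label}: {len(text)} code points, {utf8_units(text)} bytes")
```

Each Devanagari code point takes three bytes in UTF-8, while each basic Latin letter takes one. A tokenizer that has seen little Hindi during training has learned fewer merges for those byte sequences, so more of them are left as separate tokens.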
