How to respond to online colloquial language 2. Answers to “Wow Dang Dang Hae ❤️” selected by ChatGPT
2024-07-04


Getting started

There are two ways for AI to respond to online colloquialisms such as “boo dang boo hae ❤️”. The first is to refine anomalous data in batches. The example above, for instance, would be replaced with a refined sentence like “I love my dog.”

The advantages of this method are clear. Most existing models are trained on refined corpora. If anomalous expressions can be converted into refined expressions, existing models and data can be reused. However, the disadvantages are just as clear as the advantages: the more anomalous an expression is, the harder it is to convert into a refined one.

As specific words or memes spread online, the number of derived words tends to increase and the degree of variation intensifies. Online, “dang dang” is combined with other words: there is a coinage for a cat and a dog together, and “golden-dang” means a golden retriever. “Dang Dang” itself is also written in several variant spellings. The key question is whether all of these variants can be converted to “puppy.” In the worst case, continuous online monitoring is required to catch new variants as they appear.
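The batch refinement described above can be sketched as a lookup table of known variants. This is only an illustration under stated assumptions: the `VARIANTS` table and the `normalize` function are hypothetical names, the two entries are taken from the examples in the text, and “daeng-daeng” is a made-up unseen variant used to show the weakness.

```python
import re

# Hypothetical variant table. The two entries come from the examples in the
# text; a real table would be far larger and would need constant upkeep.
VARIANTS = {
    "golden-dang": "golden retriever",
    "dang dang": "puppy",
}


def normalize(text: str, variants: dict[str, str] = VARIANTS) -> str:
    """Replace known slang variants with refined forms (case-insensitive)."""
    # Try longer variants first so a short key never clobbers a longer one.
    for slang in sorted(variants, key=len, reverse=True):
        pattern = re.compile(re.escape(slang), flags=re.IGNORECASE)
        text = pattern.sub(variants[slang], text)
    return text


print(normalize("I love my Dang Dang"))    # a known variant is replaced
print(normalize("I love my daeng-daeng"))  # an unseen variant passes through unchanged
```

The second call shows the limitation: any spelling missing from the table is left untouched, so the table has to be extended every time a new variant appears.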

The second way to respond to anomalous online colloquial language

Another option is to train the model on a huge amount of data. This is the method ChatGPT adopted. If a specific word, such as “dang,” appears in a large enough number of examples, the meaning of the word can be inferred from that data.

The advantage of this method is that it responds flexibly to new anomalies, and the model can notice that sentences containing an anomaly differ from normal sentences. In other words, if you present online colloquial language to ChatGPT and ask it to rewrite the text so that it is easy to read, it recognizes that the input is a heavily distorted sentence and recovers the original meaning. (Source: https://www.insight.co.kr/news/430720 )

The downside is that this method requires a huge amount of data and computing power. This is why responding to online colloquial language is difficult.

However, where there is a mountain, there are people who climb it. LETR set out to create a translator specialized in online colloquial language, applying two techniques to obtain a model that performs well even with little data. One of them is data augmentation. Data augmentation produces varied training data by transforming an existing dataset in various ways. It is commonly used in computer vision (image processing). Even if an image is zoomed in or rotated only slightly, the computer treats the modified image as different from the original. Many transformations are available, such as rotating, flipping, zooming, shifting, and changing the brightness or color of the image.
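As a rough illustration of image augmentation, the sketch below applies four of the transformations named above to a tiny grayscale image stored as a list of rows. The function names are hypothetical and plain Python lists stand in for a real image library, so this shows the idea rather than a production pipeline.

```python
# A tiny grayscale "image" as a list of rows; values are intensities 0-255.
Image = list[list[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def shift_right(img: Image, pixels: int, fill: int = 0) -> Image:
    """Shift content right, padding the vacated left edge with `fill`."""
    out = []
    for row in img:
        k = max(0, min(pixels, len(row)))  # clamp the shift to the row width
        out.append([fill] * k + row[:len(row) - k])
    return out


def adjust_brightness(img: Image, delta: int) -> Image:
    """Add `delta` to every pixel, clipping to the valid 0-255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]


def augment(img: Image) -> list[Image]:
    """Return one transformed copy per transformation."""
    return [
        flip_horizontal(img),
        rotate_90_clockwise(img),
        shift_right(img, 1),
        adjust_brightness(img, 40),
    ]


original = [[10, 20], [30, 40]]
for variant in augment(original):
    print(variant)
```

Each variant still depicts the same content, so one labeled image yields several training examples.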

By comparison, data augmentation for language is limited. A picture of a cat is still a cat when it is flipped, but a sentence read backwards is no longer the same sentence.

“Hello” vs “Yose Ha Ning An”

For language, four methods are typically used. Data is augmented by replacing specific words with synonyms (synonym replacement), randomly deleting or inserting words (random deletion / random insertion), swapping the positions of any two words (random swap), or back translation.

However, not all four of these fit the Korean language well. Back translation is a method that itself relies on translation, so it was set aside for this project. Experiments showed that RD (Random Deletion) and RS (Random Swap) are appropriate for an ordinary Korean corpus. Special care must be taken when using the remaining SR (Synonym Replacement) or RI (Random Insertion). (Source: https://github.com/catSirup/KorEDA/tree/master/)
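The two operations singled out above, RD and RS, can be sketched at the word level as follows. This is a minimal illustration, not the KorEDA implementation: the function names, the whitespace tokenization, and the English sample sentence are assumptions made for readability.

```python
import random


def random_deletion(words: list[str], p: float, rng: random.Random) -> list[str]:
    """RD: drop each word independently with probability p."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    # Never return an empty sentence: fall back to one randomly chosen word.
    return kept if kept else [rng.choice(words)]


def random_swap(words: list[str], n_swaps: int, rng: random.Random) -> list[str]:
    """RS: swap the words at two random positions, n_swaps times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


rng = random.Random(0)  # fixed seed so runs are repeatable
sentence = "I really love my dang dang".split()
print(random_deletion(sentence, p=0.2, rng=rng))
print(random_swap(sentence, n_swaps=1, rng=rng))
```

Neither operation needs a synonym dictionary, which is one plausible reason they carry over to Korean more easily than SR or RI.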

Here, LETR noticed that for online colloquial language, synonyms can be generated with relatively simple mechanical processing. Because online colloquial language writes words with the same meaning in many different forms, the data can actually be augmented more easily. Various types of noise were added to the original text to increase the size of the data. After training a customized translator on this augmented data, together with a special technique unique to LETR, we were able to confirm dramatic results. Both the Korean-Chinese translation model and the Korean-Japanese translation model outperformed all three other translation services.
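The text does not say which kinds of noise were added, so the following is only a sketch under assumptions. It uses three hypothetical character-level noise functions (repeat, drop, swap adjacent) and pairs each noisy source with the unchanged target, which is one common way to build training pairs for a noise-tolerant translator. English placeholders stand in for the Korean source and its translation.

```python
import random


def repeat_char(text: str, rng: random.Random) -> str:
    """Duplicate one randomly chosen character (elongation for emphasis)."""
    if not text:
        return text
    i = rng.randrange(len(text))
    return text[:i] + text[i] + text[i:]


def drop_char(text: str, rng: random.Random) -> str:
    """Remove one randomly chosen character."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text))
    return text[:i] + text[i + 1:]


def swap_adjacent(text: str, rng: random.Random) -> str:
    """Swap one randomly chosen pair of neighboring characters."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


# Hypothetical noise types; the source does not list the ones actually used.
NOISE_FUNCTIONS = [repeat_char, drop_char, swap_adjacent]


def make_noisy_pairs(source: str, target: str, n: int, seed: int = 0):
    """Return n (noisy_source, target) pairs; the target stays clean."""
    rng = random.Random(seed)
    return [(rng.choice(NOISE_FUNCTIONS)(source, rng), target) for _ in range(n)]


for noisy, clean in make_noisy_pairs("I love my dang dang", "I love my dog", 3):
    print(repr(noisy), "->", repr(clean))
```

Because every noisy variant maps to the same clean target, a small parallel corpus can be multiplied without any new translation work.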

‍

How exactly it was built will be covered in part 3...

Related reading

🔗 How to respond to online colloquial language 1. Refine anomalous corpus data

‍

‍

Editor l Regret Ko Won-hee
‍
wonhee.go@twigfarm.net

‍
