AI Technology Is Wonderful: Tsinghua Has Built A Tool To Cure Word Poverty


Alas, I am not very cultured, and one word (beep) gets me through everything. Friends, have you ever run into the same trouble out in the world? Well, here is a handy tool that is worth talking about. Say you want to express "Listen to me: thank you, because you warm the four seasons." Enter the meaning you want to express in the search box, then select "idiom" in the part-of-speech column. The AI immediately serves up dozens or even hundreds of options. The darker a word's background, the more strongly the system recommends it.

If you come across a word you don't know, click it to see its definition.

It is not limited to Chinese, either. For example, when you are about to blurt out "amazing" but wonder whether there is a more elegant Chinese expression, one click gets you there.

How about that? Convenient enough? It has a bit of a "Mom no longer has to worry about my poor vocabulary" feel to it (doge).

"Reverse dictionary" from Tsinghua University

This tool is called WantWords, a reverse dictionary.

The AI behind it has an impressive pedigree: it comes from the Natural Language Processing and Social Humanities Computing Laboratory at Tsinghua University. The project's advisors are Professor Sun Maosong and Associate Professor Liu Zhiyuan. "Reverse" is meant in contrast to a conventional dictionary: instead of looking up a meaning from a word, you give the dictionary a description and ask it to find the words for you.
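To make the "description in, words out" idea concrete, here is a minimal sketch. It is not how WantWords works: the three-entry dictionary and the word-overlap (Jaccard) scoring are assumptions chosen only to show the direction of the lookup.

```python
# Toy reverse dictionary: rank words by how well their definitions
# match a free-text description.
# ASSUMPTION: the mini-dictionary and the Jaccard word-overlap score are
# illustrative stand-ins, not the WantWords data or model.

TOY_DICTIONARY = {
    "river": "a large natural stream of water flowing to the sea",
    "bridge": "a structure built over a river or road so people can cross",
    "umbrella": "a folding cover held over the head for protection from rain",
}


def tokenize(text):
    """Lower-case the text and split it into a set of words."""
    return set(text.lower().split())


def reverse_lookup(description, dictionary=TOY_DICTIONARY, top_k=3):
    """Return up to top_k words whose definitions best overlap the description."""
    query = tokenize(description)
    scored = []
    for word, definition in dictionary.items():
        tokens = tokenize(definition)
        union = query | tokens
        score = len(query & tokens) / len(union) if union else 0.0
        scored.append((score, word))
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_k]]


print(reverse_lookup("protection from rain held over the head"))
```

A real system replaces the word-overlap score with a learned model, since a good description rarely reuses the exact words of the dictionary definition.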

On GitHub, the authors say they hope the reverse dictionary can serve three purposes:

Solve the tip-of-the-tongue phenomenon, when a word is on the tip of your tongue but you suddenly cannot recall it

Help new language learners

Help patients with word-selection anomia, who cannot retrieve the word they want

The core model behind this reverse dictionary is called the Multi-channel Reverse Dictionary Model, and the corresponding paper was accepted at AAAI 2020.

Specifically, the multi-channel reverse dictionary model uses a bidirectional LSTM (BiLSTM) with attention as its basic framework and adds four feature-specific predictors. The predictors identify different features of the target word from the input query. On the one hand, these features help surface target words whose embeddings are of poor quality. On the other hand, they help filter out words that are close to the correct target word but have contradictory features.

In other words, the AI can choose words more accurately.
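A rough sketch of why extra predictors help is given below. It assumes the channels are combined as a weighted sum of per-channel scores; the candidate words, the scores, and the weights are all made up for illustration and are not taken from the paper or the WantWords code.

```python
# Combining several "channels" into one ranking score.
# ASSUMPTION: every number below is invented. "word" stands for the
# embedding-similarity channel; "pos" and "sememe" stand for two of the
# feature predictors. The query is imagined to describe a verb.

CANDIDATES = {
    "run":    {"word": 0.70, "pos": 0.90, "sememe": 0.80},
    "runner": {"word": 0.75, "pos": 0.10, "sememe": 0.60},  # close, but wrong part of speech
    "sprint": {"word": 0.40, "pos": 0.85, "sememe": 0.90},  # poor embedding, right features
}
WEIGHTS = {"word": 1.0, "pos": 0.5, "sememe": 0.5}


def combined_score(channel_scores, weights=WEIGHTS):
    """Weighted sum over the channels named in `weights`."""
    return sum(weights[name] * channel_scores[name] for name in weights)


def rank(candidates=CANDIDATES, weights=WEIGHTS):
    """Candidate words sorted from best to worst combined score."""
    return sorted(
        candidates,
        key=lambda word: combined_score(candidates[word], weights),
        reverse=True,
    )


print(rank(weights={"word": 1.0}))  # embedding channel only
print(rank())                       # all channels
```

With the embedding channel alone, "runner" comes first. Once the feature channels are added, "run" leads, "sprint" moves ahead of "runner" despite its weaker embedding score, and "runner" drops because its part of speech contradicts the query.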

To make it easier for the AI to find the truly "correct" words, the authors consider not only the "internal features" of part of speech and morphemes but also the "external features" of hierarchy and sememes.

The hierarchy distinguishes, for example, whether a word denotes an entity or a concept. Under "entity" there are in turn many kinds of entities.

In linguistics, a sememe is the smallest indivisible unit of meaning. Some linguists hold that the sememe system applies to every language and is not tied to any particular one.

For example, the word "boy" can be expressed by the three sememes of "human", "male" and "child", while "girl" can be expressed by the combination of "human", "female" and "child".

△ source: HowNet
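The "boy" and "girl" example can be written down as sets of sememes. The sketch below uses only the sememes named in the text; the plain-set representation and the helper names are assumptions for illustration, not HowNet's actual data format or API.

```python
# Sememe sets for the two words from the example above.
# ASSUMPTION: plain Python sets and these helper names are illustrative;
# HowNet itself stores richer, structured annotations.

SEMEMES = {
    "boy": {"human", "male", "child"},
    "girl": {"human", "female", "child"},
}


def shared_sememes(a, b):
    """Sememes that both words have in common, sorted for stable output."""
    return sorted(SEMEMES[a] & SEMEMES[b])


def differing_sememes(a, b):
    """Sememes that belong to exactly one of the two words."""
    return sorted(SEMEMES[a] ^ SEMEMES[b])


print(shared_sememes("boy", "girl"))
print(differing_sememes("boy", "girl"))
```

Seen this way, two words can be very close in overall meaning while a single sememe still separates them, which is the kind of contradiction a sememe predictor can pick up.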

The new algorithm has been tested, and new related systems are under development

As mentioned earlier, the WantWords reverse dictionary was born in the Tsinghua NLP laboratory and was built mainly by Qi Fanchao and Zhang Lei in 2019.

Speaking with Guokr (Nutshell), Qi Fanchao said that they did not promote the project at first, although feedback from fellow students was good. Then, last November, the project suddenly became popular, traffic soared, and the servers were overwhelmed for a time. Since then, WantWords has drawn more attention and has received a lot of advice and technical support from volunteers.

Besides the web version, a WeChat mini program has officially launched, and an app version is under development.

△ WeChat mini program "WantWords"

According to the development team's latest announcement, testing of the new algorithm was also completed before New Year's Eve this year, and it performs significantly better than the original algorithm. Besides the reverse dictionary, the research team has also developed a "semantic retrieval and recommendation system for famous sayings and quotations" and a "Chinese word collocation query system".

At present, neither of these two systems is open to the public. Interested readers can keep an eye out for them while reading the papers (listed at the end of the article).

By the way, the development team also says that WantWords is an open-source project and that everyone is always welcome to join in its design and development, propose features, and report problems. If you are interested, see the announcement on the official website.

Related papers:

https://arxiv.org/abs/1912.08441

https://arxiv.org/abs/2202.13145

Reference links:

[1] Official website: https://wantwords.net/

[2] Guokr (Nutshell) article: https://mp.weixin.qq.com/s/er-JwST7dUQjMh6VzBE1bA

[3] https://deeplang.feishu.cn/docs/doccnoH9ncCZspo2Ubx79bpZ0Lh#ijyigh
