Deep learning has attracted a great deal of attention as a result of image recognition and speech recognition.
In the field of natural language processing, utilization of deep learning is delayed, but one of the causes is presumed to be the difference in quality of input data.
Input data for image recognition and speech recognition consists of continuous values, but the input data of natural language processing is largely different in that it consists of discrete values. In the case of Japanese, it is composed of letters such as "a" and "i", and the closeness between letters is not defined.
Therefore, the tasks are as follows.
· How can I input sentences of arbitrary length?
· How to express words, phrases, sentences?
The keywords to solve the above problem are as follows.
- Vector Space Model, VSM
- word2vec
- Recursive Autoencoder
- Recursive Neural Network