Modeling Verb-Adverb Distributions in a Multidimensional Space

Explore the construction of a multidimensional space embedding English verbs and adverbs, and the analysis of word proximity, compatibility, and individuality. Determine distance, sphericity, and distribution, highlighting key verb properties.


Presentation Transcript

  1. Distributions of verbs and adverbs on the multidimensional sphere. Vvedensky V.L., Kurchatov Institute

  2. Direct contact with technical devices through human speech implies that the computer is able to understand, in some way, the idea of the message. People feel the semantic similarity of words and can find a word with a close or directly opposite meaning. How can we implement this feeling of sense in the computer?

  3. Interpreters have to convey the meaning of words into another language. Their experience is accumulated in dictionaries, which can be used for the analysis of words. One can determine the proximity of words and construct a mathematical space embedding these words in accordance with their meaning.

  4. A region of the proximity space of English verbs: the space embedding a set of English verbs with close meanings. It is constructed using their translations into many languages.

  5. This technique is rather cumbersome and difficult to apply to all the verbs of a language. In order to proceed reasonably quickly, a handier approach should be used to construct the space embedding words. It would be more convenient if data from only one language were sufficient for this procedure.

  6. Each verb can be assigned a set of properties. The adverbs that can readily be used with the verb in many different contexts reflect these properties. One can смело прыгать (to jump bravely), ловко прыгать (to jump deftly), or возбужденно прыгать (to jump excitedly).

  7. The language allows only certain combinations of verbs with adverbs, though the logic behind matching them is sometimes quite peculiar. One can горячо спорить (argue hotly) but cannot горячо видеть (see hotly), and Russians do not say горячо кипеть (boil hotly), although in English the blood sometimes boils hotly.

  8. Russian has about 1900 words used as adverbs. For each verb, one can say whether it is compatible with any given adverb. Since most language objects are fuzzy, the compatibility measure should take values in the range [0, 1].

  9. Compatibility values for representative sets of 25 verbs and 25 adverbs of the Russian language.

  10. The set of compatible adverbs is specific to every verb and reflects the individuality of the word. Even for verbs with close meanings, there are always about 4% of adverbs that can be used with one verb but not with the other. There is thus a certain distance between different verbs, and each verb occupies a certain cell in the multidimensional space, which can be called the “mental space of verbs”.

  11. Using the table of compatibility, one can calculate the distance between any two verbs. Each verb finds a definite place in the “mental space” in accordance with these distances. There is no need to use all the adverbs for this procedure: just 100 adverbs from a properly selected “basic reference set” are sufficient to determine these distances accurately enough.
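
The slides do not state which distance formula is applied to the compatibility table. As a minimal sketch, one could treat each verb as a row of compatibility values in [0, 1] and take the root-mean-square difference between two rows. The verbs, adverbs, and numbers below are hypothetical and serve only as an illustration.

```python
import math

# Hypothetical compatibility table: rows are verbs, columns are adverbs,
# values lie in [0, 1]. The numbers are illustrative, not from the slides.
ADVERBS = ["hotly", "bravely", "deftly", "patiently"]
COMPATIBILITY = {
    "argue": [1.0, 0.5, 0.0, 1.0],
    "jump": [0.0, 1.0, 1.0, 0.0],
    "leap": [0.0, 1.0, 1.0, 0.5],
}


def verb_distance(row_a, row_b):
    """Root-mean-square difference of two compatibility rows (0 to 1)."""
    squared = sum((a - b) ** 2 for a, b in zip(row_a, row_b))
    return math.sqrt(squared / len(row_a))


def distance_table(table):
    """Pairwise distances between all verbs in the table."""
    return {
        (u, v): verb_distance(table[u], table[v])
        for u in table
        for v in table
    }


distances = distance_table(COMPATIBILITY)
print(round(distances[("jump", "leap")], 3))   # verbs with close meaning
print(round(distances[("jump", "argue")], 3))  # verbs with distant meaning
```

Dividing by the number of adverbs keeps the distance in [0, 1] whatever the size of the reference set, so distances computed from 100 adverbs and from all adverbs stay on the same scale.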

  12. 2D projections of 900 verbs in the first 9 dimensions of the multidimensional space. Each verb is presented as a dot. The numbers indicate the order of dimensions. Each 2-dimensional plane shows the distribution for the pair of dimensions whose numbers are indicated for its row and column. The uppermost square represents the projection on the plane of dimensions “one” and “two”, or X and Y. The scale is normalized to unity.
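
The slides do not say how the coordinates in the multidimensional space are obtained from the distances. One standard option, used here purely as an assumption, is classical multidimensional scaling, which returns dimensions ordered by how much of the distance structure each one explains. The sketch below uses only the Python standard library (power iteration with deflation); the function name and the toy distance matrix are hypothetical.

```python
import math


def classical_mds(dist, n_dims=2, n_iter=500):
    """Coordinates (one row per point) from a symmetric distance matrix."""
    n = len(dist)
    # Double-centre the squared distances to obtain the Gram matrix.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    gram = [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        # Power iteration: converges to the eigenvector whose eigenvalue
        # has the largest absolute value.
        v = [float(i + 1) for i in range(n)]
        for _ in range(n_iter):
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        gv = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(a * b for a, b in zip(v, gv))  # Rayleigh quotient
        # Negative eigenvalues (non-Euclidean input) contribute no coordinate.
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so that the next pass finds the next dimension.
        for i in range(n):
            for j in range(n):
                gram[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy distance matrix: the corners of a 3-4-5 right triangle.
toy_dist = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
toy_coords = classical_mds(toy_dist, n_dims=2)
```

Plotting column `k` of the result against column `m` would give one panel of a grid of 2D projections like the one described on the slide, assuming the same kind of embedding.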

  13. Sphericity of the space embedding verbs. The verbs in the multidimensional space are nearly equidistant from a point lying between two hypothetical verbs: a “universal” one, compatible with any adverb, and an “individual” one, which cannot be used with any adverb. Slight nonsphericity is observed. [Image: Chinese paper lantern.]
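
A rough way to check this equidistance can be sketched as follows, assuming that the reference point is the exact midpoint between the “universal” verb (all compatibilities equal to 1) and the “individual” verb (all equal to 0). The slides say only that the point lies between the two, so the midpoint, the function names, and the sample rows are assumptions.

```python
import math


def radius_from_midpoint(row):
    """Distance of a compatibility row from the point (0.5, ..., 0.5)."""
    return math.sqrt(sum((x - 0.5) ** 2 for x in row))


def sphericity_spread(rows):
    """Relative spread (max - min) / mean of the radii; 0 is a perfect sphere."""
    radii = [radius_from_midpoint(r) for r in rows]
    mean = sum(radii) / len(radii)
    return (max(radii) - min(radii)) / mean


# Hypothetical rows: two strictly 0/1 verbs and one with a fuzzy value.
rows = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.5],
]
print([round(radius_from_midpoint(r), 3) for r in rows])
print(round(sphericity_spread(rows), 3))
```

Under this midpoint assumption, every row consisting only of 0 and 1 lies at exactly the same radius, half the square root of the number of adverbs, and only fuzzy values move a verb inward. This is a property of the sketch, not a statement about the data behind the slide.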

  14. Representative set of 100 verbs. The regular distribution of verbs makes it feasible to select accurately a “representative set” of verbs that evenly fills the observed distribution. One example: предлагать, начинать, изменять, направлять, вынимать, возвращать, обвинять, просить, помогать, убирать, указывать, решать, соблюдать, разрешать, преодолевать, приносить, уменьшать, поворачивать, проникать, закрывать, работать, нападать, втягивать, приказывать, хранить, отталкивать, выставлять, покрывать, обещать, носить, попадать, определять, прикреплять, входить, вставать, скрывать, освобождать, приходить, беречь, очищать, выходить, судить, пересекать, кидать, перевозить, служить, лгать, уступать, возникать, привязывать, завязывать, спасать, избегать, побеждать, обнаруживать, беспокоить, запоминать, отвлекать, узнавать, расти, выручать, шутить, возить, бегать, терять, рассыпать, отсутствовать, жить, любить, мучить, катать, выдерживать, зависеть, летать, отдыхать, стучать, обманывать, плавать, блуждать, будить, грабить, бывать, праздновать, дарить, дружить, радовать, умирать, успевать, блестеть, воровать, восхищать, спать, баловать, уставать, болеть, сушить, уважать, плевать, перекашивать, бесить.
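
The slides do not describe how the representative set was chosen. One simple way to pick words that fill a distribution evenly, offered here only as an assumed stand-in, is greedy farthest-point sampling: start from one word and repeatedly add the word farthest from everything already chosen. The function name and the toy positions are hypothetical.

```python
def farthest_point_sample(dist, k, start=0):
    """Indices of k points spread evenly, chosen greedily from a distance matrix."""
    if k > len(dist):
        raise ValueError("k cannot exceed the number of points")
    chosen = [start]
    # min_dist[i] is the distance from point i to its nearest chosen point.
    min_dist = list(dist[start])
    while len(chosen) < k:
        nxt = max(range(len(dist)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        min_dist = [min(m, d) for m, d in zip(min_dist, dist[nxt])]
    return chosen


# Toy example: five points on a line, two of them near-duplicates of others.
positions = [0.0, 0.1, 5.0, 5.1, 10.0]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
picked = farthest_point_sample(toy_dist, 3)
```

In the toy example the selection skips the near-duplicate points and covers both ends and the middle of the line, which is the behavior one would want from a set that evenly fills a distribution.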

  15. Distributions of verbs and adverbs in the first two dimensions of the multidimensional space: 900 verbs and 600 adverbs of the Russian language. The distributions nearly mirror each other, so that the most compatible verbs and adverbs are close, while incompatible pairs are at the far ends. Sets of verbs compatible with three adverbs are shown as hazel points. Green points indicate fuzzy cases. [Figure labels: to cost, to fill over, to throw down, to continue; patiently, industriously, motionlessly.]

  16. Representative set of 100 adverbs. This is an example of a “representative set” of adverbs that evenly fills the corresponding distribution: неизменно, предусмотрительно, неожиданно, часто, редко, открыто, независимо, уверенно, легко, охотно, дружно, усердно, покорно, торопливо, много, незаметно, старательно, радостно, хитро, медленно, обоснованно, бодро, забавно, тайно, невозмутимо, небрежно, грустно, безнаказанно, твердо, весело, беззаботно, мужественно, поспешно, нерешительно, нетерпеливо, точно, загадочно, дополнительно, невольно, справедливо, резко, основательно, безошибочно, преданно, воинственно, безотказно, смешно, верно, четко, безупречно, чрезмерно, скрытно, неловко, тщательно, непреклонно, ненадежно, полезно, сложно, молчаливо, понятно, насильно, тяжело, устало, оживленно, кропотливо, придирчиво, неудобно, отчётливо, непримиримо, осуждающе, враждебно, безвольно, чудесно, неодобрительно, чисто, развязно, добротно, громко, кратко, задумчиво, подробно, ласково, неограниченно, болезненно, бесперебойно, доверчиво, приветливо, счастливо, глубоко, трезво, проникновенно, сердечно, смутно, сжато, густо, смертельно, дорого, звонко, незыблемо, пристально.

  17. The general layout of the distributions suggests that verbs and adverbs are represented in two adjacent portions of cortical tissue. These areas are rich in internal connections and are joined by multiple crossing links. The density of these links falls off quickly with distance: only a few connections link the far ends of these cortical patches. The presence or absence of such a link permits or prohibits the combined use of a verb with a certain adverb in fluent, intelligent speech.

  18. Small cortical patches with a certain function. Brain studies using special functional staining techniques reveal confined areas performing definite tasks. These portions of the monkey cortex extract visual features with a certain orientation. They are about 1 mm x 2 mm in size. Rich internal connections can endow such a cortical patch with properties reflected as additional dimensions. This is like blowing up a flat soap film into a bubble in a space with multiple dimensions. Source: D.H. Hubel, Eye, Brain, and Vision.

  19. Our results show that the verbs and adverbs of human language are closely and strictly mathematically interrelated. Our data on nouns and adjectives indicate that this holds for them as well. We believe that the rules controlling the compatibility of words, and the number of words in the language, are determined by the layout of the space embedding these objects of human speech. We see a way in which this abstract “mental space” can find a material substrate in the cortex of the human brain.

  20. To be continued…
