Boris Orekhov

I do scientific research and sometimes write about it for a general audience. I also write code, both useful and entertaining.

I am interested in patterns across large samples and in complete, systematic descriptions of phenomena, much like the grammar of a language. To that end, I apply natural language processing tools to text collections. I know and love literature and do not lose sight of its specificity. I understand how modern artificial intelligence is evolving.

Bio

I have a background in the humanities, and I am interested in taking a broad view of cultural history and texts. For that reason, I do not develop or improve digital methods for working with texts; I use existing tools and devise clever ways to apply them. However, I reserve the right to evaluate how well these tools perform.

In the 2000s, I worked on the analysis and interpretation of Russian-language poetry. Towards the end of the decade, I learned programming, starting with PHP and then Perl. In the early 2010s, I learned Python and began exploring NLP. Conversations with outstanding computational linguists helped me greatly. In those days I joined the RNC team. Later, my horizons expanded through theoretical sociology, which I began studying in the 2020s.

In the mid-2010s, I began experimenting with generating artistic (primarily poetic) texts, as one of the first humanities scholars to do so. This is one of my main interests in the field of AI. My problem is my belief that I am the cleverest, and that no one besides me understands how artistic texts are constructed. I am also interested in hidden patterns in chess, linguistic diversity, the Latin language, and the aesthetics of libraries and universities.

Digital Humanities

I have been involved in digital humanities research since the first half of the 2010s. I was doing it before then too, but I didn’t know it was called that. I have been building text corpora for linguistic and humanities research since 2007.

AI

In 2016 I learned to train recurrent neural networks, and I still find the results this architecture produces more interesting than what transformers produce. I have trained dozens of models, published them on Hugging Face, and written several papers in which I try to present my vision of the status of generated texts. I have been particularly interested in computer-generated poetry.

Verse studies

Verse studies are quantitative studies of formally described poetic texts. My own interest is Turkic syllabic verse. I wrote a book about Bashkir verse, in which I reviewed everything that has been written about all the Turkic traditions.

Software

I have created several Python packages; they are either NLP-related or chess-related.