Texterro

A text generation system based on GPT-2. It can generate thousands of fully unique texts for SEO purposes.

Location: NDA
Our role: development of a proof of concept, an MVP, and the working project
Project duration: 2019–2021
Type of work: R&D

Background

Our client runs a very large marketplace with millions of pages covering different categories of goods. The client had an SEO department and copywriters who wrote SEO texts for those pages. This manual approach is expensive and very time-consuming, and the department had its own limits: people could not write as many unique articles as the platform needed. The client was therefore looking for ways to optimize the process.

Tasks and challenges

When GPT-2 was publicly released in 2019, we started experimenting with it. We generated a first batch of SEO texts and published them, and we were surprised by the results: the texts were indexed by Google and we saw positive dynamics in search result positions. Despite those results, GPT-2 out of the box had many limitations that prevented us from using it the way we wanted. It lacked context memory: it would generate two or three sentences about computers and, by the fourth or fifth sentence, start describing the reign of an English queen. This was a known limitation of GPT-2. We also had to fine-tune the model for our own tasks. Nevertheless, we built a proof of concept.

We ran one of our first experiments on our mining farm.

Solution (what was done)

We started an R&D process. We experimented with other systems such as ERNIE, BERT and RoBERTa, and with the different GPT-2 model sizes (124M, 345M, 774M and 1558M parameters). After a series of experiments and fine-tuning, we found a way to generate long texts (3,000 characters, about one A4 page) without loss of context. We used a GAN approach to control the quality of the generated texts and to filter out unwanted content.
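The exact method is not described here, so the following is only a minimal sketch of one way such a pipeline could be organized: the text is grown chunk by chunk, the topic prompt is re-injected at every step, and a discriminator-style scorer rejects off-topic chunks. The names generate_chunk and quality_score, the 0.5 threshold and the 500-character context tail are all hypothetical placeholders, not details of the real system.

```python
from typing import Callable


def generate_long_text(
    generate_chunk: Callable[[str], str],   # hypothetical wrapper around a fine-tuned model
    quality_score: Callable[[str], float],  # hypothetical discriminator, returns 0.0..1.0
    topic_prompt: str,
    target_chars: int = 3000,
    tail_chars: int = 500,
    threshold: float = 0.5,
    max_attempts: int = 200,
) -> str:
    """Grow a text chunk by chunk, keeping only chunks the scorer accepts."""
    text = ""
    for _ in range(max_attempts):
        if len(text) >= target_chars:
            break
        # Re-inject the topic prompt on every step and add only the tail of
        # the text so far, so the context stays short but on-theme.
        context = topic_prompt + "\n" + text[-tail_chars:]
        chunk = generate_chunk(context)
        # Rejected chunks are dropped and the step is retried.
        if chunk and quality_score(chunk) >= threshold:
            text += chunk
    return text
```

In this sketch a rejected chunk costs one extra generation call, and max_attempts bounds the total work if the scorer keeps rejecting.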

We built an interface for the end user, where they could select a theme for the topic and enter their keywords.

A user could start a batch of ten texts at once and monitor the generation status on the queue page.
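The batch-and-queue behaviour described above can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the class and field names, the status values, and the single-process design. The real system presumably distributed jobs across GPU machines.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Deque, Dict, List


class Status(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Job:
    job_id: int
    theme: str
    keywords: List[str]
    status: Status = Status.QUEUED
    result: str = ""


class GenerationQueue:
    MAX_BATCH = 10  # one batch holds at most ten texts

    def __init__(self, generate: Callable[[str, List[str]], str]):
        self._generate = generate  # hypothetical text generator
        self._jobs: Dict[int, Job] = {}
        self._pending: Deque[int] = deque()
        self._next_id = 1

    def submit_batch(self, theme: str, keywords: List[str], count: int) -> List[int]:
        """Queue `count` jobs for one theme and return their ids."""
        if not 1 <= count <= self.MAX_BATCH:
            raise ValueError(f"batch size must be 1..{self.MAX_BATCH}")
        ids = []
        for _ in range(count):
            job = Job(self._next_id, theme, list(keywords))
            self._jobs[job.job_id] = job
            self._pending.append(job.job_id)
            ids.append(job.job_id)
            self._next_id += 1
        return ids

    def run_next(self) -> bool:
        """Process the oldest queued job; return False if the queue is empty."""
        if not self._pending:
            return False
        job = self._jobs[self._pending.popleft()]
        job.status = Status.RUNNING
        try:
            job.result = self._generate(job.theme, job.keywords)
            job.status = Status.DONE
        except Exception:
            job.status = Status.FAILED
        return True

    def statuses(self) -> Dict[int, str]:
        """Snapshot for a queue page: job id mapped to its status string."""
        return {job_id: job.status.value for job_id, job in self._jobs.items()}
```

A queue page would poll statuses() while one or more workers call run_next().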

We used several mining farms with NVIDIA GeForce GTX 1070 and 1080 graphics cards for this task.

Below is a text that was generated during one of our experiments.

Generation result from one of our experiments

What the client got (results)

After 7 months of R&D and 3 months of MVP development, the client got a system that could generate thousands of SEO texts for marketplaces in a very short period of time.
Each text was unique: we did not use any kind of spinning technology, such as the one Article Forge uses. The SEO department got a unique and fully customizable tool for its needs.
One of the models was made available as a public demo that anyone could try.
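How uniqueness was measured in the project is not stated. As an illustration only, one common way to compare two generated texts is the Jaccard overlap of their word shingles, sketched below. The shingle size of three words is an assumed default, not a project setting.

```python
from typing import Set, Tuple


def shingles(text: str, size: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of overlapping word n-grams (shingles) in a text."""
    words = text.lower().split()
    if len(words) < size:
        # Texts shorter than one shingle are treated as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Shingle overlap in 0.0..1.0; higher means less unique."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 0.0  # two empty texts: nothing to compare
    return len(sa & sb) / len(sa | sb)
```

A spun article usually shares many shingles with its source, while two independently generated texts on the same topic should score close to zero.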

The technologies we have used: