Text Generator Researches via Crawling

Text Generator now includes a large neural network that crawls any links in the input and uses what it finds on the web to generate better text.

Any linked HTML pages or images are downloaded and analysed to improve text generation.

What does that mean?

Previously, some inputs could not be handled well because they lacked the context a machine needs to make an informed decision about what text to generate.

Example:

What is this about https://t3.ftcdn.net/jpg/02/44/02/20/360_F_244022052_Ev9kjmGn1VWl58TMoNAdVerpR6iFPJrU.jpg ?
++A: This image has been used for many years by the Fictional Fairies. It was created to be included on their book "The Magic Circle" (a collection containing all sorts of magical items from various fandoms) which they published with Artifact Press.

Why Web Crawling?

To generate text, a model needs enough context to understand the intent and the patterns to follow. Without enough context, generated text can go off topic, regurgitate training data or produce other surprising outputs.

Many links provide rich context for how to respond to questions but need deeper analysis. Images and GIFs posted during discussions are important to understand deeply before replying.

How does this work?

If a prompt doesn't contain enough information, the content of any linked images or web pages is analysed with neural networks to determine how each link is relevant to the text or code to be generated.
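
As a rough illustration of the first step only, the sketch below finds the links in a prompt and guesses from the file extension whether each one is an image or a web page. The names `find_links` and `guess_kind`, the regular expression and the extension list are all assumptions made for this example. They are not Text Generator's actual implementation, which this post does not describe, and the neural network analysis itself is not shown.

```python
import re
from urllib.parse import urlparse

# Hypothetical extension list, for illustration only.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")

# Matches http(s) links up to the next whitespace character.
LINK_PATTERN = re.compile(r"https?://\S+")


def find_links(prompt: str) -> list[str]:
    """Return every http(s) link in the prompt, in order of appearance."""
    return LINK_PATTERN.findall(prompt)


def guess_kind(url: str) -> str:
    """Guess 'image' or 'page' from the extension of the URL path."""
    path = urlparse(url).path.lower()
    return "image" if path.endswith(IMAGE_EXTENSIONS) else "page"


prompt = (
    "feeling https://static.text-generator.io/static/img/h.png because "
    "Bill: Checkout http://meta.stackoverflow.com/q/332386/497418 WDYT?"
)
for link in find_links(prompt):
    print(guess_kind(link), link)
```

A real crawler would more likely check the response's content type than trust the extension. The extension check is used here only to keep the sketch free of network calls.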

New Use Cases

Multimedia conversation/chat understanding

Links posted in web chat can be followed, and Text Generator responds appropriately based on its understanding of any emojis posted.

The emoji posted was a happy emoji:

feeling https://static.text-generator.io/static/img/h.png because
++they feel like it was made by someone who is very nice.

The emoji posted was a sad emoji:

feeling https://static.text-generator.io/static/img/s.png because
++you have to click on the button for each frame and then select the image which appears in this place, but I don't know how many frames there are or where it is located.

Code Autocomplete based on image content

Code autocomplete takes into account the content of the images being worked on.

<img src="https://static.text-generator.io/static/img/pic.jpg" title="
++a man and woman sitting on a tree branch">

Link understanding

Bill: Checkout http://meta.stackoverflow.com/q/332386/497418 WDYT?
Martin:
++I think it would be better to have a tag for each language, and we could then use a language tag on the question, and a language tag on the answers. This would avoid some of the confusion about the language used in a question or answer, and allow us to have the same tags in the questions and answers.
++Martin: I would like to see an extension to the question/answer system that would allow you to have multiple languages in your question or answer, and then link to the language tag

Image captioning

https://static.text-generator.io/static/img/science-background-dark.png captions for this image are:
++1. A network with spheres and dots on a black background

Multiple image understanding

Which fairy do you think is most feminine?
1: https://static.text-generator.io/static/img/fairy1.jpeg
2: https://static.text-generator.io/static/img/fairy2.jpeg
3: https://static.text-generator.io/static/img/fairy3.jpeg
++A: A pink fairy in a pink dress with blue wings.

Autocomplete for technical and artistic discussions


Do you you think this should be a good or bad character in a kids movie and why?
https://static.text-generator.io/static/img/fairy1.jpeg
++A: It should be a good character because in the movie, the fairy helps the little girl to get away from the bad people.
++B: I don't think it's a good character because it can make the girl fall in love with the fairy.
++C: I think it's a good character because it is a good character to have in a movie.
Do you you think this should be a good or bad character in a kids movie and why?
https://static.text-generator.io/static/img/fairy3.jpeg
++A: It can be a good or bad character. It depends on the movie.
++B: I think the gargoyle figurine in a blue outfit with a gold crown on its head is a good character.

And much more!

Future work

In future, other content types such as video and audio will be researched, as well as entities other than links, e.g. people, places, organisations and topics.

Link research is not yet enabled for embeddings, but it could be enabled so that multiple links, images, multilingual text and code are embedded in the same space, then searched and analysed.
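
To make "embedded in the same space, then searched" concrete, here is a minimal sketch of a nearest-neighbour search by cosine similarity. The three-dimensional vectors are made up and stand in for real embeddings, and `cosine_similarity` and `search` are hypothetical helpers written for this example. Nothing here reflects how Text Generator's embeddings are computed or served.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query: list[float], index: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Rank every indexed item by similarity to the query, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors: an image link and a page link sharing one space.
index = {
    "https://static.text-generator.io/static/img/fairy1.jpeg": [0.9, 0.1, 0.0],
    "http://meta.stackoverflow.com/q/332386/497418": [0.0, 0.2, 0.9],
}

# Made-up query vector, e.g. the embedding of a sentence about fairies.
query = [0.8, 0.2, 0.1]
for key, score in search(query, index):
    print(f"{score:.2f} {key}")
```

Because images, pages, text and code would all map to vectors of the same size, one ranking function can compare any of them with any other.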

Our crawler will become more advanced, executing JavaScript to better understand web pages and logging in to understand gated communities.

Distant future

In general, much of the prompt engineering of inputs and the interpretation of outputs will in future be automated and improved to make the text generation process less error-prone.

General AI systems will do whatever it takes to answer questions, including researching, crawling, training new networks or inventing new ideas.

There's a long way to go, but for now combining image and link understanding with text generation is an important step towards AGI.

Caveats

Crawling and image analysis add processing time to your request. If you wish to turn this off, remove the http(s) protocol from links in the input so that we cannot detect any links or images in the text.
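
One way to do that on the client side, before sending a prompt, is sketched below. `strip_protocols` is a hypothetical helper written for this example and is not part of any Text Generator client library.

```python
import re

# Matches the "http://" or "https://" prefix of a link.
PROTOCOL_PATTERN = re.compile(r"https?://")


def strip_protocols(text: str) -> str:
    """Remove http(s):// prefixes so links read as plain text."""
    return PROTOCOL_PATTERN.sub("", text)


prompt = "feeling https://static.text-generator.io/static/img/h.png because"
print(strip_protocols(prompt))
# prints: feeling static.text-generator.io/static/img/h.png because
```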

Plug

Text Generator offers an API for text and code generation. It is secure (no PII is stored on our servers), affordable, flexible and accurate.

Note that with Text Generator you aren't charged for tokens and can generate up to 100 examples in a single request, which adds to the diversity of results.

Try examples yourself at: Text Generator Playground