The responsibility of dialogue
Every time I post something, I wonder if it’s worth the time and the energy needed to structure my ideas and look for the optimal wording. I think it is worth it, though, not for the likes (I average 15), but because we all owe it to public discourse in these modern versions of the central square.
As with voting, we must express ourselves eloquently, knowledgeably and empathetically. We must counterbalance the noise of radical ideologies. We should debate and form opinions even without being experts, admitting our inevitable ignorance but, most importantly, using reasoning and reason.
The archives of these speeches, compiled into terabytes* of text, will be the raw material for the next generation of humanity’s tools, which we are just beginning to glimpse.
AI agents can currently create stories, answer questions, proofread and generate code in the main programming languages, and translate or restyle paragraphs. Mathematically, the objective they optimize is (simply) the prediction of the next word given the previous ones**.
None of the above features has been directly implemented. All of them emerge from this basic criterion. They are the result of “reading” an enormous amount of text: books, forums, blogs. This is what gives them power: the source data. And all this source data is made up of billions of mini-speeches, of mini-discourses in the public square of the internet.
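To make the “predict the next word” objective concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the tiny corpus is invented, the function names `train_bigram` and `predict_next` are hypothetical, and real language models use neural networks over tokens rather than simple word counts. What it does show is that the only ingredient is the source text itself.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    follow = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follow[prev][nxt] += 1
    return follow


def predict_next(follow, word):
    """Return the most frequent follower of `word` and its estimated probability."""
    counts = follow.get(word.lower())
    if not counts:
        # The word never appeared (or was never followed by anything).
        return None, 0.0
    best, n = counts.most_common(1)[0]
    return best, n / sum(counts.values())


# Invented toy corpus (an assumption for illustration only).
corpus = (
    "we debate in the square . "
    "we debate in public . "
    "we listen in the square ."
)

model = train_bigram(corpus)
print(predict_next(model, "we"))
print(predict_next(model, "the"))
```

In this corpus “we” is followed by “debate” twice and by “listen” once, so the sketch predicts “debate”. Change what is written in the corpus and the prediction changes with it.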
The main problem with this collective result is its so-called “alignment” with the principles we would want it to follow. Broadly speaking, we want it to be:
- helpful, to offer practical solutions
- honest, to not fabricate false information
- harmless, to not encourage or cause harm***
This is not trivial. Surprise, surprise, we’ve built ourselves an interactive mirror of modern humanity, but we don’t like what we see in it. The solution is not to eliminate opinions that do not suit us. It is natural for them to exist. A radical opinion is more likely to be put into words than a non-radical one, because that is how we react to stimuli.
The solution is a democratic one. We have to remember how to practice dialogue. We have to take care, pay attention and be considerate about how we express ourselves. And we should do this not for an audience, and not in pursuit of “likes”. At the end of the day, it’s worth it, even just for yourself, to put your thoughts into words.
* ROOTS Corpus (1.6 TB of text spanning 46 natural languages and 13 programming languages)
** or the surrounding ones, depending on whether the model is causal or masked. Also, the units are not actually words but parts of words, called tokens
*** Criteria from the InstructGPT paper, “Training language models to follow instructions with human feedback”