Antibabel multilingual translation software

In summary (courtesy of a free, self-hosted LLM)

  • The Antibabel logbook describes an initiative to develop high-performance multilingual translation software.
  • The project aims to leverage technological advances to handle large amounts of linguistic data.
  • The goal is to launch a movement, not just a technical tool, by exploring non-linear structures of language.

Multilingual Translation Software Antibabel, Logbook

October 3, 2004

I glanced at the page counter: 7,400 visits in less than a week. Maybe the title "Outburst of Anger" intrigued readers. But the response has been weak. Had we expected a 1% reply rate, I should have received 74 emails. We're far from that. But that's not the issue. We've made our decision. Something is going to happen. One of us will launch a forum, which he will manage. It must be a working forum, not one for idle chatter. ANTIBABEL is a grand idea, which we are only beginning to explore. But we're going to give it substance. Does this mean that, starting from a small team, we're going to create multilingual automatic translation software that is finally capable of real performance, succeeding where so many others have failed?

Yes and no. It's certain that such a tool represents an enormous undertaking. The goal is to spark a movement. The solution was never Esperanto. I received a message from a reader who simply said:

- But where's the problem? People are speaking more and more English. It's already the standard at airports.

Of course, the world is made up of interconnected airfields, as everyone knows.

I received other comments along these lines:

- But what will you do about Korean?

Good heavens, if we could achieve something truly effective for the twenty or so "usual" languages used in the West, that would already be a solid start.

Among ourselves, we have exchanged emails that amount to a fairly exciting brainstorming session. We don't expect passive reactions. The logical response from readers of this section is hesitation. Everyone must be thinking, "But what on earth are they trying to say with this thing?"

It's true that this project doesn't seem to be well understood overall, because it departs decisively from the classical approach of linguists. As I write this, I am lining up characters from left to right on a single line. My message is linear. But in this ANTIBABEL file, now ten years old, I had simply taken up a model (the idea isn't mine) in which a sentence resembles a molecule, with a structure in 2D or even 3D, rather than a linear object. That's the first point.
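To make the contrast between a linear sentence and a "molecule" concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the logbook does not specify any data structure, and the names SentenceGraph, bond, neighbors, and the relation labels "agent" and "object" are hypothetical. The sentence is stored as a set of labeled bonds between concepts, so the order in which the bonds are written down carries no meaning.

```python
from dataclasses import dataclass, field


@dataclass
class SentenceGraph:
    """A sentence held as labeled bonds between concepts, not as a word sequence.

    Hypothetical illustration only; not the ANTIBABEL model itself.
    """
    # Each bond is a tuple: (head concept, relation label, dependent concept).
    bonds: set = field(default_factory=set)

    def bond(self, head: str, relation: str, dependent: str) -> None:
        """Attach `dependent` to `head` with a labeled relation."""
        self.bonds.add((head, relation, dependent))

    def neighbors(self, concept: str) -> set:
        """Return the concepts directly bonded to `concept`, in either direction."""
        found = set()
        for head, _relation, dependent in self.bonds:
            if head == concept:
                found.add(dependent)
            elif dependent == concept:
                found.add(head)
        return found


def build_example() -> SentenceGraph:
    """Encode 'the reader sends a message' with assumed relation labels."""
    graph = SentenceGraph()
    graph.bond("send", "agent", "reader")
    graph.bond("send", "object", "message")
    return graph
```

Because the bonds live in a set, two graphs built in a different order compare as equal, which is one way to express that the structure, not the left-to-right sequence, carries the message.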

What readers probably haven't grasped is that we're planning to harness all the possibilities offered by contemporary computing. Think about it. Soon, the main memory of machines will reach sizes beyond imagination. The same goes for hard drives, thanks to advances in how densely data-storage elements are packed onto the disk. In an incredibly short time, these external memories will reach capacities measured in... terabytes, that is, in... millions of megabytes. So we can afford to do anything, to consider anything.

But what exactly are we going to afford? Images, animations, sound. Language will become a fantastic data bank, something no one today can even imagine. I'm looking ahead. If one day an ANTIBABEL-type program works, the text portion managing numerous languages will occupy a negligible share of memory, perhaps just one percent. The rest will be taken up by images, animations, and sound sequences.

We're going to search for signals that are not necessarily universal, but are common to a large number of Earth's ethnic groups.