DEEP TEXT

IDEX | Technical demonstration powered by the hyperbase.unice.fr linguistics web tool

Using deep learning for identification tasks on textual data (authorship detection, sentiment analysis, ...) is a well-known application. Neural networks are efficient at these tasks but behave as black boxes. The goal of this project is to open the box and provide as many tools as possible for understanding the underlying mechanisms, such as linguistic pattern detection, occurrences, ...


We provide demonstrations in three languages (English, Latin, French), where you can observe how the AI makes its decisions.


Sentiment analysis



Rotten Tomatoes dataset (English)

Sentiment classification based on a collection of short review excerpts from Rotten Tomatoes, collected by Bo Pang and Lillian Lee (ACL 2005). Specifically: 5331 positive snippets and 5331 negative snippets.


How it works:

- Enter your text in the text box on the right.
- Click the "Positive or Negative" button.
- See which linguistic markers the AI bases its decision on.
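
To give an idea of what "seeing the linguistic markers behind a decision" means, here is a minimal Python sketch. It is not the network used by this demonstration: the word weights, the `explain` function and the `top_markers` function are all hypothetical, hand-written stand-ins for what a trained model would learn. The sketch only shows the general principle of attaching a score to each word and reporting the words that weigh most in the final label.

```python
# Toy illustration only: hand-picked weights, NOT the demo's trained network.
TOY_WEIGHTS = {
    "brilliant": 2.0,
    "moving": 1.5,
    "funny": 1.0,
    "dull": -1.5,
    "boring": -2.0,
    "mess": -1.0,
}


def explain(text):
    """Return (label, per-token contributions) for a whitespace-tokenized text."""
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    tokens = [tok.strip('.,!?;:"') for tok in text.lower().split()]
    # Unknown words contribute nothing to the decision.
    contributions = [(tok, TOY_WEIGHTS.get(tok, 0.0)) for tok in tokens]
    score = sum(weight for _, weight in contributions)
    label = "positive" if score >= 0 else "negative"
    return label, contributions


def top_markers(contributions, k=3):
    """Return up to k tokens with the largest absolute contribution."""
    ranked = sorted(contributions, key=lambda pair: abs(pair[1]), reverse=True)
    return [pair for pair in ranked[:k] if pair[1] != 0.0]


label, contributions = explain("A brilliant and moving film, never boring.")
print(label, top_markers(contributions))
```

Note that this toy model counts "boring" as a negative marker even after "never". Word-by-word weights miss this kind of pattern, which is one reason to inspect what a real network reacts to.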




Authorship detection





L.A.S.L.A dataset (Latin)

The Latin database includes a selection of classical Latin texts processed by LASLA (Laboratory for Statistical Analysis of Ancient Languages, University of Liège, Belgium). The corpus covers 22 authors, such as Caesar, Cicero, Livius, Seneca, ...


How it works:

- Enter your text in the text box on the right.
- Click the "Who is it ?" button.
- See which linguistic markers the AI bases its decision on.




Political detection



Left-Right political spectrum detection (French)

The terms "left" and "right" appeared during the French Revolution of 1789, when members of the National Assembly divided into supporters of the king, to the president's right, and supporters of the revolution, to his left. Today, the left–right political spectrum is a system for classifying political positions, ideologies and parties, from equality on the left to social hierarchy on the right.


How it works:

- Enter your text in the text box on the right.
- Click the "Left or Right" button.
- See which linguistic markers the AI bases its decision on.




Publications

  • "Text Deconvolution Saliency (TDS): a deep tool box for linguistic analysis" (L. Vanni, M. Ducoffe et al.), 56th Annual Meeting of the Association for Computational Linguistics (ACL), Jul 2018, Melbourne [hal-01804310]
  • "ADT et deep learning, regards croisés. Phrases-clefs, motifs et nouveaux observables" (L. Vanni, D. Mayaffre and D. Longrée), in D. Iezzi et al. (eds.) JADT 2018, UniverItalia, Rome, 2018, pp. 459-466. [hal-01823560]
  • "Les mots des candidats, de « allons » à « vertu »" (D. Mayaffre, C. Bouzereau, M. Ducoffe, M. Guaresi, F. Precioso and L. Vanni), in Pascal Perrineau (ed.) Le vote disruptif. Les élections présidentielle et législatives de 2017, Paris, Presses SciencesPo, 2017, pp. 129-152. [hal-01635941]
  • "Machine Learning under the light of Phraseology expertise: use case of presidential speeches, De Gaulle - Hollande (1958-2016)" (M. Ducoffe, D. Mayaffre, F. Precioso, F. Lavigne, L. Vanni, A. Tre-Hardy), in Damon Mayaffre et al. (eds.) JADT 2016, Université Nice Sophia-Antipolis, 2016, pp. 157-168. [hal-01343209]

Forthcoming publications

  • Collective work, L’intelligence artificielle des textes, Paris, Honoré Champion
  • D. Mayaffre, Macron. L’intelligence artificielle, Paris, Belin

Contact



Scientific committee: C. Bouzereau, M. Ducoffe, M. Guaresi, D. Longrée, D. Mayaffre, S. Mellet, F. Precioso, L. Vanni