18th century and the French Encyclopedia

In the 18th century the middle class started to grow. Members of this new class tended to be wealthy and spent their free time (most of their time) on leisure activities.

Reading was a common pastime, and literature quickly became important in the society of the time. Middle-class people spent their money on buying books or on supporting authors so that they could carry on with their work.

The French Encyclopedia was created at this time to make people aware of the concepts and ideas that the Enlightenment had brought. It was created by Denis Diderot and Jean Le Rond d’Alembert, and was made up of 17 volumes.

Apart from the encyclopedia and the books of the time, some periodicals were born in the 18th century: The Tatler and The Spectator were the most important.

________________________________________________________

In the 18th century the bourgeoisie began to grow, becoming the most influential class of its time. Members of the bourgeoisie were rich and powerful, and devoted their free time (most of their time) to leisure activities.

Reading was one of the most common leisure activities, and literature quickly became the most important cultural phenomenon of the century. The bourgeois used their wealth to acquire books and to support authors so that they could continue their work (patronage).

The French Encyclopedia was created in this century to let people absorb the concepts and ideas that the Enlightenment had brought. Its authors were Denis Diderot and Jean Le Rond d’Alembert, and it comprised 17 volumes.

Apart from the publications mentioned, the 18th century was also the setting for the first periodicals. ‘The Tatler’ and ‘The Spectator’ are the most important examples.

QUESTION ANSWERING (Q3)

        A Question Answering (QA) system is a form of computer-based information retrieval: the user poses a question and the system returns an answer. QA requires more complex natural language processing (NLP) techniques than other types of information retrieval, such as information extraction.


        The first two QA systems were developed in the 1960s: LUNAR, which answered questions about the geological analysis of rocks returned by the Apollo moon missions, and BASEBALL, which answered questions about the US baseball league.

        A question posed in natural language (and therefore easy to formulate) is answered by a machine, which retrieves the information in one of two ways:

- By looking it up in a pre-structured database or a collection of articles stored inside the system.

- By searching an open collection, such as the World Wide Web.

        Questions can be open-domain (about nearly anything) or closed-domain (about a specific subject, such as medicine or flowers). The system first looks for the answer by analysing the words of the question (keyword-based techniques). When that is not enough to find the correct answer, more complex methods are used, which might include named-entity recognition, relation detection, coreference resolution, syntactic alternations, word sense disambiguation, logic form transformation, logical inference (abduction), commonsense reasoning, and temporal or spatial reasoning.
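        The keyword-based approach can be illustrated with a minimal sketch. It is not the method of any system named in this post: the document collection, the stop-word list and the function names are all invented for illustration, and the scoring is a plain count of shared words.

```python
import re

# Hypothetical mini-collection standing in for the articles stored in a QA system.
DOCUMENTS = [
    "LUNAR answered questions about the geological analysis of rocks "
    "returned by the Apollo moon missions.",
    "BASEBALL answered questions about the US baseball league.",
    "Anna is a closed-domain assistant on the IKEA website.",
]

# Small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "about", "is", "was", "what",
              "which", "who", "did", "by", "on", "in"}


def keywords(text):
    """Lower-case the text, split it into words and drop stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS}


def answer(question, documents=DOCUMENTS):
    """Return the document sharing the most keywords with the question, or None."""
    question_words = keywords(question)
    best, best_score = None, 0
    for doc in documents:
        score = len(question_words & keywords(doc))
        if score > best_score:
            best, best_score = doc, score
    return best
```

        Calling answer("Which system answered questions about baseball?") picks the BASEBALL sentence because it shares the most keywords with the question. A question with no word in common with the collection gets no answer at all, which is the kind of gap the more complex techniques listed above are meant to close.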

        Ryan McCabe (University of Massachusetts, Amherst) and M. Chase Smith (Amherst College) have created a Question Answering web page that explains the techniques used, the history of the field, and much other interesting information about QA.

        Some interesting QA systems are the following:

  • Answers.com: a well-known open-domain QA system.
  • Ask.com: when a question is posed, it redirects the user to several pages that contain some of the terms in the question, so it works more as a search engine than as an advanced QA system.
  • Semote: it also redirects the user to pages found on the internet when a question is posed, but the results are very accurate.
  • Qualim: works the same way as Semote.
  • Anna (at IKEA): a closed-domain QA system that takes the user to the section of the site where what he or she needs can be found.

 

REFERENCES

- Question answering. (2009, June 13). In Wikipedia, The Free Encyclopedia. Retrieved 19:00, June 21, 2009, from http://en.wikipedia.org/w/index.php?title=Question_answering&oldid=296213100

- Question Answering. By Ryan McCabe (University of Massachusetts, Amherst) and M. Chase Smith (Amherst College). Retrieved 19:11, June 21, 2009, from http://ciir.cs.umass.edu/REU/2000/REUpres/QAFinal_files/frame.htm

- Learning surface text patterns for a Question Answering system (2001). Written by Deepak Ravichandran (University of Southern California, Marina del Rey, CA) and Eduard Hovy (University of Southern California, Marina del Rey, CA). Retrieved 20:22, June 21, 2009, from http://portal.acm.org/citation.cfm?id=1073092

- The TREC-8 question answering track report. By Ellen M. Voorhees (National Institute of Standards and Technology, Gaithersburg). Retrieved 20:36, June 21, 2009, from http://66.102.1.104/scholar?hl=es&lr=&q=cache:eGBdGvbpDh4J:citeseerx.ist.psu.edu/viewdoc/download%3Fdoi%3D10.1.1.38.6392%26rep%3Drep1%26type%3Dpdf+question+answering

MACHINE-AIDED HUMAN TRANSLATION AND MACHINE TRANSLATION (Q2 and Q3)

        The idea of machine translation goes back to the 17th century, when René Descartes thought about connecting equivalent words in different languages to a single symbol. It became a practical research field in the 20th century with the Georgetown experiment.

        Machine Translation (MT) is used to translate texts from one language into another. The translation itself is carried out by a machine; humans intervene only in building the system, which is ‘taught’ the connection between a universal symbol and its rendering in different languages. The task is complicated by the ambiguity of human languages.


        Some important examples of machine translation systems are Lucy, Systran, ProMT, OpenTrad, Google Translate and Yahoo! Babel Fish.

        Whereas Machine Translation is carried out by a machine, although with some human help in programming the translator and in pre-editing and post-editing, Machine-Aided Human Translation (MAHT) works the opposite way. In MAHT, also called Computer-Aided Translation (CAT), a human performs the translation with the support of computer tools.

        There are three types of MAHT, according to the type of user:

  • Specific Software Environments Designed for Professional Translators Working in Teams: competent translators work in teams and are connected through a local network. Their workstations offer tools (integrated in the text processor) to access a bilingual terminology and a translation memory, and to submit parts of the text to an MT server. The tools these professional translators use include Trados (MultiTerm), IBM (Translation Manager) and SITE-EuroLang (EuroLang Optimizer).
  • Environments for Independent Professional Translators: the translators are freelancers and are asked to deliver the translated text in the same format as the source documents. They use tools such as Mercury/Termex by LinguaTech, a resident program for PCs, and WinTool.
  • Tools for Occasional Translators: they are helped by dictionaries, conjugators and style checkers, but they do not work with a translation memory. Tools: SISKEP and Ambassador by Language Engineering.
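
        The translation memory mentioned in the first environment can be pictured with a minimal sketch. It does not reproduce how Trados or any other tool named above works: the stored segments, the threshold value and the function name are assumptions made for illustration, and the similarity measure is simply the one offered by Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical English-to-Spanish memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "The printer is out of paper.": "La impresora no tiene papel.",
    "Press the green button to start.": "Pulse el botón verde para empezar.",
}


def lookup(segment, memory=TRANSLATION_MEMORY, threshold=0.75):
    """Return (stored_source, translation, similarity) for the closest stored
    segment, or None when no stored segment reaches the threshold."""
    best = None
    for source, target in memory.items():
        # ratio() is a similarity score between 0.0 and 1.0.
        similarity = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if similarity >= threshold and (best is None or similarity > best[2]):
            best = (source, target, similarity)
    return best
```

        With this memory, lookup("The printer is out of ink.") returns the stored translation of the paper sentence as a partial ("fuzzy") match for the human translator to correct, while a segment unlike anything stored returns None and has to be translated from scratch.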

 

REFERENCES

- Machine translation. (2009, May 16). In Wikipedia, The Free Encyclopedia. Retrieved 17:20, May 19, 2009, from http://en.wikipedia.org/w/index.php?title=Machine_translation&oldid=295950237

- ‘Machine-aided Human Translation’ by Christian Boitet, Université Joseph Fourier, Grenoble, France. Retrieved 17:34, May 19, 2009, from http://cslu.cse.ogi.edu/HLTsurvey/ch8node6.html

- ‘Machine Translation: The Disappointing Past and Present’ by Martin Kay, Xerox Palo Alto Research Center, Palo Alto, California, USA. Retrieved 17:37, May 19, 2009, from http://cslu.cse.ogi.edu/HLTsurvey/ch8node4.html#SECTION82

- Computer-assisted translation. (2009, May 1). In Wikipedia, The Free Encyclopedia. Retrieved 17:39, May 19, 2009, from http://en.wikipedia.org/w/index.php?title=Computer-assisted_translation&oldid=294759999

PARSING (Q2)

        Parsing is syntactic analysis. A sequence of tokens is analysed to determine its grammatical structure, which is represented as a hierarchy such as a parse tree.
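
        As a minimal sketch of turning a token sequence into a parse tree, the following hand-written parser handles a toy grammar. The grammar, the lexicon and all function names are invented for illustration and cover only a handful of English words.

```python
# Toy grammar (an assumption for this sketch):
#   S  -> NP VP
#   NP -> Det N
#   VP -> V NP | V
LEXICON = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N",
    "sees": "V", "sleeps": "V",
}


def parse_sentence(tokens):
    """Return a nested-tuple parse tree for the tokens, or raise ValueError."""
    tree, pos = parse_s(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token at position {pos}: {tokens[pos]!r}")
    return tree


def expect(tokens, pos, category):
    """Consume one token of the given category and return its leaf node."""
    if pos < len(tokens) and LEXICON.get(tokens[pos]) == category:
        return (category, tokens[pos]), pos + 1
    raise ValueError(f"expected {category} at position {pos}")


def parse_s(tokens, pos):
    np, pos = parse_np(tokens, pos)
    vp, pos = parse_vp(tokens, pos)
    return ("S", np, vp), pos


def parse_np(tokens, pos):
    det, pos = expect(tokens, pos, "Det")
    noun, pos = expect(tokens, pos, "N")
    return ("NP", det, noun), pos


def parse_vp(tokens, pos):
    verb, pos = expect(tokens, pos, "V")
    # VP -> V NP if a determiner follows, otherwise VP -> V.
    if pos < len(tokens) and LEXICON.get(tokens[pos]) == "Det":
        obj, pos = parse_np(tokens, pos)
        return ("VP", verb, obj), pos
    return ("VP", verb), pos
```

        Calling parse_sentence("the dog sees a cat".split()) returns a nested tuple whose root is S, with an NP branch for "the dog" and a VP branch containing the verb and the object NP. A token sequence the grammar does not cover raises a ValueError.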


        In some translation systems parsing is done by computer programs. Translating sentences is not easy because of the ambiguities of human language, so to build a parsing system experts have to take into account all the grammatical possibilities. As a machine cannot translate with total fidelity, the systems used in translators and parsers are based on statistics, which are compiled by language researchers working on specific contexts.
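
        One way to picture how statistics help with ambiguity is to score each candidate parse tree by multiplying the probabilities of the grammar rules it uses, and to keep the most probable one. The sketch below does this for the two readings of "saw the man with the telescope". The rule probabilities are made-up numbers, not figures from any real corpus, and word-level probabilities are left out to keep the example short.

```python
from math import prod

# Hypothetical rule probabilities; a statistical parser would estimate them
# from a hand-annotated corpus.
RULE_PROB = {
    ("VP", ("V", "NP")): 0.6,
    ("VP", ("V", "NP", "PP")): 0.4,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("Det", "N")): 0.8,
    ("PP", ("P", "NP")): 1.0,
}


def is_leaf(node):
    """A leaf is (category, word): its second element is a plain string."""
    return len(node) == 2 and isinstance(node[1], str)


def tree_probability(node):
    """Multiply the probabilities of every grammar rule used in the tree."""
    if is_leaf(node):
        return 1.0  # lexical probabilities are ignored in this sketch
    label, children = node[0], node[1:]
    rule = (label, tuple(child[0] for child in children))
    return RULE_PROB[rule] * prod(tree_probability(child) for child in children)


man = ("NP", ("Det", "the"), ("N", "man"))
telescope = ("PP", ("P", "with"), ("NP", ("Det", "the"), ("N", "telescope")))

# Reading 1: the telescope is the instrument of seeing (PP attached to the VP).
verb_attachment = ("VP", ("V", "saw"), man, telescope)
# Reading 2: the man has the telescope (PP attached to the NP).
noun_attachment = ("VP", ("V", "saw"), ("NP", man, telescope))

best = max([verb_attachment, noun_attachment], key=tree_probability)
```

        With these invented numbers the first reading scores higher, so best is verb_attachment. Different probabilities, estimated from different texts, could favour the other reading, which is why the context the statistics come from matters.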

        The most common use of parsers is as a component of a compiler or interpreter. They usually rely on a context-free grammar, in which each rule applies regardless of the surrounding symbols, which makes sentences easier to analyse. The book ‘Parsing Techniques: a practical guide’ describes the wider reach of parsing techniques: they “contribute to all existing software: they enable Web browsers to analyze HTML pages and PostScript printers to analyze PostScript, and some of the more advanced techniques are used in code generation in compilers and in data compression. Also their importance as general pattern recognizers is slowly being acknowledged”.

        Michael Collins, of the Department of Computer and Information Science at the University of Pennsylvania, has proposed three new parsing models:

  • MODEL 1: an improvement on earlier parsing models.
  • MODEL 2: the parser is extended to make the complement/adjunct distinction. This is done by adding probabilities over subcategorisation frames for head-words.
  • MODEL 3: a probabilistic treatment of wh-movement is given, derived from the analysis in ‘Generalized Phrase Structure Grammar’.

 

REFERENCES

- Parsing. (2009, April 7). In Wikipedia, The Free Encyclopedia. Retrieved 16:55, April 9, 2009, from http://en.wikipedia.org/w/index.php?title=Parsing&oldid=296074492

- ‘Three Generative, Lexicalised models for Statistical Parsing’ by Michael Collins, dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, USA. Retrieved 18:22, April 9, 2009, from http://www.aclweb.org/anthology-new/P/P97/P97-1003.pdf

- Diazdesantos.es on ‘Parsing techniques: a practical guide’. Retrieved 17:33, April 9, 2009, from http://www.diazdesantos.es/libros/grune-dick-parsing-techniques-a-practical-guide-L0490401119111.html

- ‘Parsing techniques: a practical guide’ (2008, February 1) by Dick Grune. Retrieved 17:40, April 9, 2009, from http://books.google.es/books?hl=es&lr=&id=05xA_d5dSwAC&oi=fnd&pg=PR5&dq=%27Parsing+techniques:+a+practical+guide%27&ots=3MuxaHl6L9&sig=t9UqCX42lglNF-KRIxY6Q_R_dug

RESEARCH TOPICS ON HUMAN LANGUAGE TECHNOLOGY (Q2)

INFORMATION EXTRACTION

- Named Entity Recognition (NER)

INFORMATION RETRIEVAL

- Clustering

LANGUAGE ANALYSIS

- Parsing

LANGUAGE UNDERSTANDING

- Pragmatics

KNOWLEDGE REPRESENTATION AND DISCOVERY

- Semantic web

SPOKEN LANGUAGE INPUT

- Speech recognition

MULTIMODALITY

- Modality integration: Facial movement, Representations of Space and Time, and Speech.

MULTILINGUALITY

- Machine Translation

- Machine-Aided Human Translation

 

REFERENCES

- Named Entity Recognition. (2009, March 25). In Wikipedia, The Free Encyclopedia. Retrieved 11:10, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Named_entity_recognition&oldid=279521544

- Cluster (informática). (2009). In Wikipedia, La enciclopedia libre. Retrieved March 18, 2009, from http://es.wikipedia.org/w/index.php?title=Cluster_(inform%C3%A1tica)&oldid=24900365

- Cluster (computing). (2009, March 20). In Wikipedia, The Free Encyclopedia. Retrieved 11:14, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Cluster_(computing)&oldid=278511256

- Parsing. (2009, March 11). In Wikipedia, The Free Encyclopedia. Retrieved 11:18, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Parsing&oldid=276620765

- Pragmatics. (2009, March 20). In Wikipedia, The Free Encyclopedia. Retrieved 11:25, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Pragmatics&oldid=278540321

- Semantic Web. (2009, March 20). In Wikipedia, The Free Encyclopedia. Retrieved 11:31, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Semantic_Web&oldid=278574575

- Speech recognition. (2009, March 19). In Wikipedia, The Free Encyclopedia. Retrieved 11:39, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Speech_recognition&oldid=278372461

- Machine translation. (2009, March 23). In Wikipedia, The Free Encyclopedia. Retrieved 11:43, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Machine_translation&oldid=295950237

- Computer-assisted translation. (2009, March 19). In Wikipedia, The Free Encyclopedia. Retrieved 11:54, March 25, 2009, from http://en.wikipedia.org/w/index.php?title=Computer-assisted_translation&oldid=294759999

HUMAN LANGUAGE TECHNOLOGY RESEARCHERS (Q1)

        One of the most important researchers in Human Language Technologies is Martin Kay. He was born in Great Britain and received his M.A. from Trinity College, Cambridge, in 1961. His work at Stanford University is very well known. He is currently Professor of Linguistics at Stanford University and Honorary Professor of Computational Linguistics at Saarland University. He has made important contributions to several areas of Human Language Technologies, such as chart parsing, functional unification grammar, phonology, morphology and machine translation.

        Hans Uszkoreit is another well-known authority in the field of Human Language Technology. He is a computational linguist from Germany and currently works as a Professor at Saarland University and as Scientific Director at the German Research Center for Artificial Intelligence (DFKI). He is also involved in several companies that work on Human Language Technologies, and he has written many books and articles, both alone and in collaboration with other researchers.

 

REFERENCES:

- Machine translation. (2009, March 14). In Wikipedia, The Free Encyclopedia. Retrieved 11:20, March 18, 2009, from http://en.wikipedia.org/w/index.php?title=Machine_translation&oldid=277284040

- Martin Kay. (2008, June 7). In Wikipedia, The Free Encyclopedia. Retrieved 11:23, March 18, 2009, from http://en.wikipedia.org/w/index.php?title=Martin_Kay&oldid=217746063

- Martin Kay (2009) in Computer Science, at Stanford University. Retrieved 12:26, March 18, 2009, from http://www.stanford.edu/~mjkay

- Hans Uszkoreit Curriculum Vitae (2009). Retrieved 13:04, March 23, 2009, from http://www.coli.uni-saarland.de/~hansu/hucv_eng.pdf

- Hans Uszkoreit Personal Homepage (2009). Retrieved 12:52, March 23, 2009, from http://hans.uszkoreit.net/

- German Research Center for Artificial Intelligence (2009, January 27). Retrieved 13:13, March 23, 2009, from http://www.dfki.de/web

WHAT HUMAN LANGUAGE TECHNOLOGY IS AND SOME RESEARCH CENTRES (Q1)

        Language Technology is often called Human Language Technology (HLT) or Natural Language Processing (NLP). It has computational linguistics (CL) and speech technology at its core, but also includes many of their application-oriented aspects. Language technology is closely connected to computer science and general linguistics.

        According to the Meraka Institute (an African Advanced Institute for Information and Communication Technology), Human Language Technology makes it easier for people to interact with machines, which can benefit a wide range of people who work with computers.

        An important application of Human Language Technology is digital libraries. Instead of using up paper and space by keeping books in traditional libraries such as the ones we have now, web pages that contain different types of books and reading material in general have started to grow. Another important advantage is convenience: searching for information on the internet is always easier than searching for it on paper.

        Apart from the Meraka Institute, another important research centre is HLTC, a multidisciplinary centre at the Hong Kong University of Science and Technology (HKUST) whose mission is to lead state-of-the-art research directions that drive the development of new applications in both text and spoken language technology. HLTC is led by six faculty members from the EEE and CS departments, and the systems built there include automated language translation for the Internet, speech-based web browsing, and speech recognition for the telephone.

        The definition given by the ‘Language Technology World’ group is the following: “Language technology (LT), also Human Language Technology (HLT), is the cover term for all information technologies specialized for dealing with text and speech in human language. It is also the field of engineering in which LT methods and applications are developed.” As it is seen as a field of engineering, it is now very widely researched, in order to deepen our knowledge of all these new technologies we are learning to deal with.

 

REFERENCES:

- Human Language Technology. (2007). In Meraka Institute. Retrieved 12:09, February 25, 2009, from http://www.meraka.org.za/humanLanguage.htm

- Language technology. (2008, April 1). In Wikipedia, The Free Encyclopedia. Retrieved 11:17, February 25, 2009, from http://en.wikipedia.org/w/index.php?title=Language_technology&oldid=202607020

- Human Language Technology (2002) in “Using Human Language Technology for Automatic Annotation and Indexing of Digital Library Content” (written by Kalina Bontcheva, Diana Maynard, Hamish Cunningham, and Horacio Saggion). Retrieved 13:13, March 09, 2009, from http://www.springerlink.com/content/drtjr092ejyk1eyk/

- Natural language processing. (2009, March 5). In Wikipedia, The Free Encyclopedia. Retrieved 11:24, March 16, 2009, from http://en.wikipedia.org/w/index.php?title=Natural_language_processing&oldid=275161330

- Human Language Technology on ‘Language Technology World’ (2009). Retrieved 12:47, March 16, 2009, from http://www.lt-world.org