Other Projects

Elliphant: A Machine Learning Method for Identifying Subject Ellipsis and Impersonal Constructions in Spanish. Master's thesis.

University of Wolverhampton and Universitat Autònoma de Barcelona. 

Advisors: Prof. Ruslan Mitkov and Prof. Xavier Blanco.

Awarded second prize at IV MAVIR 2010 for master's theses on “Language Technologies applied to Intelligent Systems for Multilingual and Multimedia Information Access”, funded by Corex, Daedalus and iSOCO, in collaboration with Fundación de la Universidad Europea de Madrid.


Abstract: This thesis presents Elliphant, a machine learning system for classifying Spanish subject ellipsis as either referential or non-referential. Linguistically motivated features are incorporated in a system which performs a ternary classification: verbs with explicit subjects, verbs with omitted but referential subjects (zero pronouns), and verbs with no subject (impersonal constructions). To the best of our knowledge, this is the first attempt to automatically identify non-referential ellipsis in Spanish.


Download: Elliphant’s Comparable Corpora (ESZIC_ES and ESZIC_PT)

— Annotated for Explicit Subjects, Zero-pronouns and Impersonal Constructions (ESZIC).

— Both corpora contain texts from two genres: legal (laws) and health (psychiatric papers).

ESZIC_ES Corpus (zip)

    — Spanish.
    — Parsed by Connexor’s Parser.

ESZIC_PT Corpus (zip)

    — Brazilian Portuguese.
    — Parsed by PALAVRAS Parser.


La relación entre fonética y fonología

(Relationship between Phonetics and Phonology)

B.A. Linguistics course: Phonetics and Phonology.

Abstract: An exhaustive presentation of the points of contact between the two disciplines, from the origins of phonology in the Kazan School to the revisions of The Sound Pattern of English: markedness theory, natural generative phonology, autosegmental phonology, lexical phonology and optimality theory.

Generativismo, de lo transformacional a lo computacional

(On Generative Grammar: from transformational to computational)

B.A. Linguistics course: Syntax.

Abstract: This work is a reference notebook for placing the concepts of generative grammar in time. Each term is defined, and each revision of the theory is illustrated with a diagram that summarizes it. It covers the period from the origins of generativism in the ideas of the Port-Royal grammar to Lexical-Functional Grammar and Generalized Phrase Structure Grammar.


CORAL: Construcción de Objetos Representacional Aplicada al Lenguaje

(CORAL: Representational Object Construction Applied to Language)
Undergraduate Research Project and B.A. essay.

Universidad Complutense de Madrid. Advisor: Eduardo Basterrechea.

Received the Special Award “GESFOR GROUP. Information Technologies” and an Honorable Mention in the VI Arquimedes National Contest of Introduction to University Research, Spanish Ministry of Education and Science, 2007.

Abstract: CORAL is a theoretical model designed to extract semantic information from frequent syntactic patterns in Spanish. This semantic information is built from the combination of morphology and syntax. The model combines ideas from different linguistic paradigms.


Términos de color en español: semántica, morfología y análisis lexicográfico
(Spanish Color Terms: Semantic, Morphological and Lexicographic Analysis)
Undergraduate Research Project

Universidad Complutense de Madrid. Advisor: Prof. Fernando Lázaro Mora.

Collaboration Fellowship for Undergraduate Students, Department of Linguistics, Romance and Slavic Philology, Universidad Complutense de Madrid, 2007-2008.


Abstract: In this study we present the patterns of semantic variation through a categorization of the affixes of Spanish color terms. For the categorization we take into account (i) the grammatical class of the derived color term, (ii) the grammatical class of the derivational base, (iii) the semantic features of the derivational base, ± color and ± primary color, (iv) the affix type and (v) the variation of meanings in the affix. We present an analysis of the lexical and morphological relations of these terms, as well as definitions for the color terms' affixes. We compiled a corpus of all the Spanish color terms (563) and their definitions, taken from the dictionaries DRAE, DUE, DEA and CLAVE.

Download: Spanish Color Names Corpora

All Spanish Color Names Corpus (pdf)

    — Manually extracted from the Spanish Royal Academy Dictionary.
        (Yes, I read the whole dictionary.)

    — The list is completed with terms from four other dictionaries.

Lists of specific color terms used:

    — for horses and livestock (pdf)

    — for painting (pdf)

    — for persons (pdf)

    — in heraldry (pdf)

    — in poetry (pdf)

    — for specific domains (pdf)


In the last three years the main projects I’ve been involved in are DysWebxia and Dyseggxia.

Below are other finished projects: