1. Title: DEtection of TOXicity in comments In Spanish (DETOXIS)

2. Organizers:
   Mariona Taulé      mtaule[at]ub.edu
   Montserrat Nofre   montsenofre[at]ub.edu
   Alejandro Ariza    alejandro.ariza14[at]ub.edu
   Enrique Amigó      enrique[at]lsi.uned.es
   Paolo Rosso        prosso[at]dsic.upv.es

3. Corpus Description: https://detoxisiberlef.wixsite.com/website/corpus

4. Number of training instances: 3463

5. Number of attributes (including target labels): 21

6. Attribute information:
   a) topic (categorical attribute with three possible values: CR = crime, MI = migration, SO = social)
   b) thread_id (identifier of each conversation thread)
   c) comment_id
   d) reply_to (identifier of the previous comment in the thread; defaults to the comment's own identifier when it is the first comment in the thread)
   e) comment_level (categorical attribute with two possible values: 1 = direct comment to an article, 2 = reply to a comment)
   f) comment
   g) argumentation
   h) constructiveness
   i) positive_stance
   j) negative_stance
   k) target_person
   l) target_group
   m) stereotype
   n) sarcasm
   o) mockery
   p) insult
   q) improper_language
   r) aggressiveness
   s) intolerance
   t) toxicity (TARGET attribute with two possible values: 0 = not toxic, 1 = toxic)
   u) toxicity_level (TARGET attribute with four possible values: 0 = not toxic, 1 = mildly toxic, 2 = toxic, 3 = very toxic)

   Attributes g)-s) are binary features: the value is 1 if the feature is present in the respective comment and 0 otherwise. These features, together with the topic attribute, are only available during training. The test dataset contains only the following attributes: [thread_id, comment_id, reply_to, comment_level, comment].

   There are no missing values in this dataset.

7. File format: both the train and test files are provided in CSV format, with comma-separated fields.

8. Dataset requirements:
   a) Please fill in the following registration form to receive the dataset password: https://forms.gle/3bMHrPDEuPRQ7t7R7
   b) By participating in this competition, you agree to the following Terms and Conditions: https://tuit.cat/nZ1eq
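
The attribute layout in item 6 and the file format in item 7 can be checked with a small script once the files have been obtained. The following is a minimal sketch using only the Python standard library, not an official loader. It assumes that each CSV file has a header row whose column names match the attribute names listed above, and that categorical and binary values are stored as plain strings such as "0" and "1"; neither assumption is confirmed by this description. The names validate_rows, TRAIN_COLUMNS, TEST_COLUMNS and ALLOWED are hypothetical helpers introduced here for illustration.

```python
import csv

# Binary features g)-s) from the attribute list (assumed to be the header names).
BINARY_FEATURES = [
    "argumentation", "constructiveness", "positive_stance", "negative_stance",
    "target_person", "target_group", "stereotype", "sarcasm", "mockery",
    "insult", "improper_language", "aggressiveness", "intolerance",
]

# Attributes present in the test file, and the full set of 21 for training.
TEST_COLUMNS = ["thread_id", "comment_id", "reply_to", "comment_level", "comment"]
TRAIN_COLUMNS = (
    ["topic"] + TEST_COLUMNS + BINARY_FEATURES + ["toxicity", "toxicity_level"]
)

# Allowed values, compared as strings (an assumption about how they are stored).
ALLOWED = {
    "topic": {"CR", "MI", "SO"},
    "comment_level": {"1", "2"},
    "toxicity": {"0", "1"},
    "toxicity_level": {"0", "1", "2", "3"},
}
ALLOWED.update({name: {"0", "1"} for name in BINARY_FEATURES})


def validate_rows(handle, expected_columns):
    """Read comma-separated rows from an open file and return a list of problems."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in expected_columns if name not in header]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    for row_no, row in enumerate(reader, start=1):  # counts data rows, not the header
        for name in expected_columns:
            value = row[name]
            if value is None or value == "":
                # The description states that there are no missing values.
                problems.append(f"row {row_no}: empty value in {name}")
            elif name in ALLOWED and value.strip() not in ALLOWED[name]:
                problems.append(f"row {row_no}: unexpected {name}={value!r}")
    return problems
```

To use it, open the file with newline="" as the csv module recommends, for example open("train.csv", newline="", encoding="utf-8") where the file name and encoding are placeholders, and pass the handle together with TRAIN_COLUMNS or TEST_COLUMNS. An empty list means that every expected column is present and every categorical, binary and target value falls within the ranges described in item 6.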