Differential Language Analysis ToolKit

DLATK is an end-to-end human text analysis package, specifically suited for social media and social scientific applications. It is written in Python 3 and developed by the World Well-Being Project at the University of Pennsylvania and Stony Brook University. It contains:

  • feature extraction
  • part-of-speech tagging
  • correlation
  • prediction and classification
  • mediation
  • dimensionality reduction and clustering
  • wordcloud visualization
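
To give a rough sense of what the first and third items mean in practice, here is a minimal sketch of extracting 1-gram features and correlating them with an outcome. It uses only the Python standard library and does not use DLATK's own interface; the function names (`extract_unigrams`, `pearson`, `correlate_features`) and the toy data are assumptions made up for illustration.

```python
import math
import re
from collections import Counter


def extract_unigrams(text):
    """Return the relative frequency of each lowercase word token in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(tokens).items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def correlate_features(texts, outcomes):
    """Correlate each word's relative frequency with the outcome, one text per row."""
    features = [extract_unigrams(t) for t in texts]
    vocabulary = set().union(*features)
    return {
        word: pearson([f.get(word, 0.0) for f in features], outcomes)
        for word in vocabulary
    }


# Toy data, invented for illustration only.
texts = ["happy happy day", "sad rainy day", "happy sunny day", "sad sad night"]
outcomes = [5.0, 2.0, 4.0, 1.0]
correlations = correlate_features(texts, outcomes)

# List words from most positively to most negatively correlated.
for word, r in sorted(correlations.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{r:+.2f}")
```

In this toy example "happy" comes out positively correlated with the outcome and "sad" negatively. A real analysis would also need significance testing and multiple-comparison correction, which this sketch leaves out.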

DLATK can utilize:

DLATK is licensed under the GNU General Public License v3 (GPLv3).

Citations

If you use DLATK in your work, please cite the following paper:

@InProceedings{DLATKemnlp2017,
  author =  "Schwartz, H. Andrew
      and Giorgi, Salvatore
      and Sap, Maarten
      and Crutchley, Patrick
      and Eichstaedt, Johannes
      and Ungar, Lyle",
  title =   "DLATK: Differential Language Analysis ToolKit",
  booktitle =  "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
  year =    "2017",
  publisher =  "Association for Computational Linguistics",
  pages =   "55--60",
  location =   "Copenhagen, Denmark",
  url =  "http://aclweb.org/anthology/D17-2010"
}