.. _fwflag_lex_interface:
===============
--lex_interface
===============
Switch
======

--lex_interface

Description
===========

Override the argparser in dlatkInterface and send all arguments to lexInterface. lexInterface is often used to upload csv's to MySQL during the LDA process. See the :doc:`../tutorials/tut_lda` tutorial for more details. 

Details
=======

The full list of available flags in lexInterface:

.. code-block:: bash
	
	python lexInterface.py -h

	usage: lexInterface.py [-h] [-f FILENAME] [-g GFILE] [--sparsefile SPARSEFILE]
	                       [--weightedsparsefile WEIGHTEDSPARSEFILE]
	                       [--dicfile DICFILE] [--topicfile TOPICFILE]
	                       [--topic_csv] [--filter] [-n NAME] [-c CREATE] [-p]
	                       [--print_weighted] [--pprint] [-w WHERE] [-u UNION]
	                       [-i INTERSECT] [--super_topic SUPERTOPIC] [-r]
	                       [--depol] [--ungroup] [--compare COMPARE]
	                       [--annotate_senses SENSE_ANNOTATED_LEX]
	                       [--topic_threshold TOPICTHRESHOLD] [-a] [-l]
	                       [--corpus_examples] [--corpus_samples] [-e] [-d DB]
	                       [-t TABLE] [--lexicondb DB] [--corpus_term_field FIELD]
	                       [--corpus_message_field FIELD]
	                       [--corpus_messageid_field FIELD] [--min_word_freq NUM]
	                       [--lexicon_category CATEGORY] [--num_rand_messages NUM]

	On Features Class.

	optional arguments:
	  -h, --help            show this help message and exit

	:

	  -f FILENAME, --file FILENAME
	                        Lexicon Filename (default: None)
	  -g GFILE, --gfile GFILE   
	                        Lexicon Filename in google format (default: None)
	  --sparsefile SPARSEFILE   
	                        Lexicon Filename in sparse format (default: None)
	  --weightedsparsefile WEIGHTEDSPARSEFILE
	                        Lexicon Filename in weighted sparse format (default:
	                        None)
	  --dicfile DICFILE     Lexicon Filename in dic (LIWC) format (default: None)
	  --topicfile TOPICFILE
	                        Lexicon Filename in topic format (default: None)
	  --topic_csv, --weighted_file
	                        tells interface to use the topic csv format to make a
	                        weighted lexicon (default: False)
	  --filter              Allows lexicon filtering if True (default: False)
	  -n NAME, --name NAME  Existing Lexicon Table Name (will load) (default:
	                        None)
	  -c CREATE, --create CREATE
	                        Create a new lexicon table (must supply new lexicon
	                        name, and either -f, -g or -n) (default: None)
	  -p, --print           print lexicon to stdout (default csv format) (default:
	                        False)
	  --print_weighted      print lexicon to stdout (weighted csv format)
	                        (default: False)
	  --pprint              print lexicon to stdout as pprint output (default:
	                        False)
	  -w WHERE, --where WHERE   
	                        where phrase to add to sql query (default: None)
	  -u UNION, --union UNION   
	                        Unions two tables and uses the result as myLexicon
	                        (default: None)
	  -i INTERSECT, --intersect INTERSECT
	                        Intersects two tables and uses the result as myLexicon
	                        (default: None)
	  --super_topic SUPERTOPIC  
	                        Maps the current lexicon with a super topic mapping
	                        lexicon to make a super_topic (default: None)
	  -r, --randomize       Randomizes the categories of terms (default: False)
	  --depol               Depolarize the categories (removes +/-) (default:
	                        False)
	  --ungroup             places each word in its own category (default: False)
	  --compare COMPARE     Unions two tables and uses the result as myLexicon
	                        (default: None)
	  --annotate_senses SENSE_ANNOTATED_LEX
	                        Asks the user to annotate senses of words and creates
	                        a new lexicon with senses (new lexicon name is the
	                        parameter) (default: None)
	  --topic_threshold TOPICTHRESHOLD
	                        sets the threshold to use for a csv topicfile
	                        (default: None)
	  -a, --add_terms       Adds terms from the loaded lexicon to a given corpus
	                        (options below) (default: False)
	  -l, --corpus_lexicon  Load a lexicon based on finding words in a given
	                        corpus (BETA) (options below) (default: False)
	  --corpus_examples     Find example instances of words in the given corpus
	                        (using rlike; equal number for all words) (default:
	                        False)
	  --corpus_samples      Find sample of matches for lexicon. (default: False)
	  -e, --expand_lexicon  Expands the lexicon to more terms. (default: False)

	Terms OR Corpus Lexicon Options:

	  -d DB, --corpus_db DB
	                        Corpus database to use [default: dla_tutorial]
	  -t TABLE, --corpus_table TABLE
	                        Corpus table to use [default: msgs]
	  --lexicondb DB        The database which stores all lexicons. (default:
	                        dlatk_lexica)
	  --corpus_term_field FIELD 
	                        field of the corpus table that contains terms (lexicon
	                        table always uses 'term') [default: term]
	  --corpus_message_field FIELD
	                        field of the corpus table that contains the actual
	                        message [default: message]
	  --corpus_messageid_field FIELD
	                        field of the table that contains message ids (set to
	                        '' to not use group by [default: message_id]
	  --min_word_freq NUM   minimum number of instances to include in lexicon (-l
	                        option) [default: 1000]
	  --lexicon_category CATEGORY
	                        category in lexicon to get random samples from
	                        (default: None)
	  --num_rand_messages NUM   
	                        number of random messages to select when getting
	                        samples from lexicon category (default: 100)


Example Commands
================

Upload the topic given word probability distributions generated during LDA. This creates a table in `dlatk_lexica` called `msgs_lda_cp`.

.. code-block:: bash

	dlatkInterface.py --lex_interface --topic_csv  \ 
	--topicfile=/home/user/lda_tutorial/msgs_lda_tok_lda.lda_topics.topicGivenWord.csv  \ 
	-c msgs_lda_cp