KeMontielOleaNesbit2024
Documentation for KeMontielOleaNesbit2024.
KeMontielOleaNesbit2024.RawDocs
KeMontielOleaNesbit2024.algo1_only_store_draws
KeMontielOleaNesbit2024.algo2_range
KeMontielOleaNesbit2024.compute_functional_from_nmf_draws
KeMontielOleaNesbit2024.credible_set90_range
KeMontielOleaNesbit2024.do_plots
KeMontielOleaNesbit2024.find_NMF_given_solution
KeMontielOleaNesbit2024.gen_NMF
KeMontielOleaNesbit2024.generate_tf_only_matrix
KeMontielOleaNesbit2024.plot_word_cloud
KeMontielOleaNesbit2024.preprocess
KeMontielOleaNesbit2024.run
KeMontielOleaNesbit2024.run_tests
KeMontielOleaNesbit2024.simulation_plots
KeMontielOleaNesbit2024.vb_estimate
KeMontielOleaNesbit2024.RawDocs — Type
RawDocs
Structure to hold document data, including tokens, stems, and metadata.
Fields:
- docs: raw document strings
- tokens: tokenized words
- stems: stemmed tokens
- sw_set: set of stopwords
- ...
KeMontielOleaNesbit2024.algo1_only_store_draws — Method
algo1_only_store_draws(gamma1, lam1, gamma2, lam2, eps, T, save_folder; post_draw_num, beta, random_seed)
Draws posterior samples for B and Θ from two gamma/lambda priors, solves NMF, and stores the outputs as .jld2 files.
KeMontielOleaNesbit2024.algo2_range — Method
algo2_range(N, I_sim)
Simulates and plots the approximation quality of Algorithm 2 over a range of possible n11 values.
KeMontielOleaNesbit2024.compute_functional_from_nmf_draws — Method
compute_functional_from_nmf_draws(FOMC_sec, func, prior_post_draw_name, NMF_draw_folder_name)
Computes a statistic over the posterior NMF draws for a specific FOMC section.
Arguments
- FOMC_sec: 1 or 2
- func: a function (e.g. HHIpercentdiff) to apply to Herfindahl indices
- prior_post_draw_name: path to the posterior draw .jld2 file
- NMF_draw_folder_name: path to the folder with NMF iteration draws
Returns
- H_diff_percent: average percent change for each draw
- lambda_lower_percent, lambda_upper_percent: lower/upper bounds from NMF path draws
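As a rough illustration of the kind of function that could be passed as func, the sketch below computes a percent difference between the average Herfindahl indices of two groups of documents. The name hhi_percent_diff, its two-argument form, and the formula itself are assumptions made for this example; the package's own HHIpercentdiff may be defined differently.

```julia
using Statistics

# Hypothetical functional (not taken from the package source):
# percent change in the mean Herfindahl index from a baseline group
# of documents to a comparison group.
function hhi_percent_diff(h_base::AbstractVector{<:Real}, h_comp::AbstractVector{<:Real})
    base = mean(h_base)
    return 100 * (mean(h_comp) - base) / base
end

# Made-up Herfindahl indices, for illustration only
h_before = [0.20, 0.30, 0.25]
h_after  = [0.30, 0.40, 0.35]

pct = hhi_percent_diff(h_before, h_after)
println(pct)
```

Applying such a function once per posterior draw would give one value per draw, which matches the shape of the documented H_diff_percent return value.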
KeMontielOleaNesbit2024.credible_set90_range — Method
credible_set90_range(N, I_sim)
Generates 90% robust credible intervals for the function lambda_example using simulation draws.
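One simple way to form an interval of this kind from simulation output is sketched below, assuming each draw supplies a lower and an upper bound for the quantity of interest. Taking the 5% quantile of the lower bounds and the 95% quantile of the upper bounds gives a conservative interval that contains the whole [lower, upper] range in at least 90% of draws. The function name robust_interval90 and the input data are hypothetical, and the package may construct its intervals differently.

```julia
using Statistics

# Conservative 90% interval for a set-valued quantity:
# 5% quantile of the per-draw lower bounds, 95% quantile of the upper bounds.
# (Illustrative sketch; not the package's implementation.)
function robust_interval90(lower::AbstractVector{<:Real}, upper::AbstractVector{<:Real})
    length(lower) == length(upper) ||
        throw(ArgumentError("lower and upper must have the same length"))
    return (quantile(lower, 0.05), quantile(upper, 0.95))
end

# Made-up bounds: 101 draws with lower bounds spread evenly over [0, 1]
lower = collect(range(0.0, 1.0; length = 101))
upper = lower .+ 0.5

lo, hi = robust_interval90(lower, upper)
println((lo, hi))
```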
KeMontielOleaNesbit2024.do_plots — Method
do_plots()
Generates all plots related to NMF posterior means and functional measures. Assumes that NMF output has already been generated and saved.
KeMontielOleaNesbit2024.find_NMF_given_solution — Method
find_NMF_given_solution(B_init, Theta_init, beta, T, eps; maxit, verbose, random_seed)
Solves a posterior NMF decomposition by iteratively applying Algorithm 1.
Returns
- A tuple of lists (B_list, Theta_list) with matrices from each iteration.
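For context, a nonnegative factorization of a given matrix is generally not unique, which is why a single product B * Theta can correspond to many (B, Theta) pairs. The sketch below illustrates this general fact in plain Julia; it is not the package's Algorithm 1, and the matrices and the mixing matrix Q are made-up values chosen for the example.

```julia
using LinearAlgebra

# Topic-word matrix B (V x K) and topic-document matrix Theta (K x D);
# every column sums to one.
B = [0.5 0.1;
     0.3 0.2;
     0.2 0.7]
Theta = [0.6 0.4 0.5;
         0.4 0.6 0.5]

# Hypothetical invertible mixing matrix whose columns sum to one.
Q = [0.9 0.2;
     0.1 0.8]

B2 = B * Q           # a different nonnegative topic-word matrix
Theta2 = Q \ Theta   # equals inv(Q) * Theta; stays nonnegative for this Q

println(all(B2 .>= 0) && all(Theta2 .>= 0))
println(isapprox(B * Theta, B2 * Theta2))
```

Both pairs reproduce the same product, so any statistic that depends on B or Theta individually can differ across equally valid factorizations.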
KeMontielOleaNesbit2024.gen_NMF — Method
gen_NMF()
Runs the full NMF draw generation pipeline. It:
- Estimates OnlineLDA for FOMC1 and FOMC2
- Draws posterior B and Θ samples
- Applies NMF
- Saves all outputs into the NMF_draws_folder
Includes both posterior and prior-based NMF draw scenarios.
KeMontielOleaNesbit2024.generate_tf_only_matrix — Function
generate_tf_only_matrix(tf_idf_threshold::Vector{Int}, additional_stop_words::Vector{String}, option)
Generates term-frequency-only matrices for each FOMC section and saves them as Excel and JSON files.
Arguments
- tf_idf_threshold: max number of words to retain per section
- additional_stop_words: extra stopwords to exclude
- option: return "matrix", "text", or nothing
Returns
- Depending on option, returns term-document matrices or tokenized meeting texts
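To make the idea of a term-frequency matrix concrete, here is a minimal, self-contained sketch that counts terms per document after dropping stopwords. The function term_frequency_matrix and the toy documents are hypothetical; the sketch does not apply any vocabulary-size threshold and does not write Excel or JSON files, so it only approximates what the package function is documented to do.

```julia
# Build a term-frequency matrix (terms x documents) from tokenized documents,
# skipping stopwords. Illustrative only; not the package's implementation.
function term_frequency_matrix(docs::Vector{Vector{String}}, stop_words::Set{String})
    # Sorted vocabulary of all non-stopword tokens
    vocab = sort(unique(t for d in docs for t in d if !(t in stop_words)))
    index = Dict(term => i for (i, term) in enumerate(vocab))
    tf = zeros(Int, length(vocab), length(docs))
    for (j, d) in enumerate(docs), t in d
        i = get(index, t, 0)          # 0 means the token is a stopword
        i == 0 || (tf[i, j] += 1)
    end
    return vocab, tf
end

docs = [["inflation", "the", "rate", "inflation"], ["rate", "growth", "the"]]
vocab, tf = term_frequency_matrix(docs, Set(["the"]))
println(vocab)
println(tf)
```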
KeMontielOleaNesbit2024.plot_word_cloud — Method
plot_word_cloud(text::Vector{Vector{String}}, filename::String)
Generates and saves a word cloud plot using all tokens from the provided text.
Arguments
- text: nested vector of tokenized words (one subvector per document)
- filename: file name to save the output PNG plot under PLOT_PATH
KeMontielOleaNesbit2024.preprocess — Method
preprocess()
Runs the main preprocessing pipeline for FOMC data:
- Loads raw data
- Applies speaker-based separation
- Tokenizes and stems the content
- Finds collocations
- Outputs cleaned data to Excel
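Two of the steps above, tokenization and collocation finding, can be sketched in a few lines of plain Julia. The helpers simple_tokens and bigram_counts are hypothetical names for this example; the sketch omits stemming and speaker separation, uses raw adjacent-pair counts as a stand-in for collocation detection, and is not the package's own pipeline.

```julia
# Lowercase the text, split on anything that is not a letter, drop stopwords.
function simple_tokens(text::AbstractString, stop_words::Set{String})
    words = String.(split(lowercase(text), r"[^a-z]+"; keepempty = false))
    return [w for w in words if !(w in stop_words)]
end

# Count adjacent token pairs as candidate collocations.
function bigram_counts(tokens::Vector{String})
    counts = Dict{Tuple{String,String},Int}()
    for i in 1:length(tokens)-1
        pair = (tokens[i], tokens[i+1])
        counts[pair] = get(counts, pair, 0) + 1
    end
    return counts
end

tokens = simple_tokens("The federal funds rate and the federal funds target.",
                       Set(["the", "and"]))
counts = bigram_counts(tokens)
println(tokens)
println(counts[("federal", "funds")])
```

Pairs that recur often, such as ("federal", "funds") here, are natural candidates to merge into a single token before building a vocabulary.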
KeMontielOleaNesbit2024.run — Method
run()
Runs the full project pipeline:
- Extracts and preprocesses FOMC meeting transcripts,
- Generates TF-only matrices,
- Creates word clouds,
- Performs variational Bayes topic modeling using OnlineLDA,
- Generates and saves NMF posterior draws,
- Plots results and simulation outputs.
Outputs are saved to the configured CACHE_PATH, MATRIX_PATH, and PLOT_PATH.
KeMontielOleaNesbit2024.run_tests — Method
run_tests()
Executes the complete test suite from the /test directory using included unit and integration tests.
KeMontielOleaNesbit2024.simulation_plots — Method
simulation_plots()
Runs all simulation visualizations:
- Algorithm 2 range plots
- Robust credible set plots
- Monte Carlo illustrations
Each plot is saved as a PNG in PLOT_PATH.
KeMontielOleaNesbit2024.vb_estimate — Method
vb_estimate(section::String; onlyTF, K, alpha, eta, tau, kappa, docs_idx_list, random_seed)
Performs OnlineLDA variational Bayes estimation on preprocessed text.
Arguments
- section: "FOMC1" or "FOMC2"
- onlyTF: whether to use the TF-only dictionary
- K: number of topics
- alpha, eta: Dirichlet prior parameters
- tau, kappa: learning rate parameters
- docs_idx_list: optional subset of documents
- random_seed: seed for reproducibility
Returns
- herfindahl: vector of Herfindahl indices per document
- posterior_mean: normalized gamma matrix
- gamma: document-topic matrix
- lambda: topic-word matrix
- model: fitted OnlineLDA instance
- text1: input corpus as a vector of strings
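The relationship between gamma, posterior_mean, and herfindahl can be illustrated with a short sketch. It assumes that rows of gamma are documents and columns are topics, that the normalization is row-wise, and that the Herfindahl index is the sum of squared topic shares within a document; these are assumptions for illustration, and herfindahl_per_document is a hypothetical helper rather than a package function.

```julia
# Normalize each row of a document-topic matrix to shares, then compute
# the Herfindahl index (sum of squared shares) for each document.
# Assumes rows are documents and columns are topics.
function herfindahl_per_document(gamma::AbstractMatrix{<:Real})
    shares = gamma ./ sum(gamma; dims = 2)
    return vec(sum(shares .^ 2; dims = 2)), shares
end

# Made-up document-topic weights
gamma = [2.0 2.0;    # evenly split across two topics
         9.0 1.0]    # concentrated on the first topic

hhi, shares = herfindahl_per_document(gamma)
println(hhi)
```

With K topics the index ranges from 1/K (weight spread evenly across topics) to 1 (all weight on one topic), so higher values indicate a more concentrated document.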