About MEH

The Meaning Extraction Helper (MEH) was created by Ryan L. Boyd to make the meaning extraction method fast, accurate, and clean to carry out. The method was first introduced in 2008 by Dr. Cindy K. Chung and Dr. James W. Pennebaker. Since then, the software has been expanded considerably to include term frequency-inverse document frequency (TF-IDF) data, output that can be used for other topic models (like LDA), and even semantic network graphs and other forms of distributed semantic representations.

Comments, questions, and feedback are always welcome!

The proper citation for the Meaning Extraction Helper is:
Boyd, R. L. (2017). MEH: Meaning Extraction Helper (Version 1.4.20) [Software]. Available from https://meh.ryanb.cc

Dedication

This software is dedicated to the memory of Jennifer J. Malin. If you find the Meaning Extraction Helper of use in your work, please consider donating to the Jennifer Malin Endowment.

Contributors

If you would like to contribute to the Meaning Extraction Project, please send an e-mail to Ryan L. Boyd. The following individuals have helped to contribute to this project in various capacities, and are due proper thanks:

Kate Blackburn
Cindy K. Chung
Elif Ikizer
Paola Pasca
James W. Pennebaker
Nairán Ramírez-Esparza

Changelog:

2017-12-11 – Uploaded version 1.4.20 and portable version. Fixed a bug with the “conversions” feature that would, depending on setup, cause the software to incorrectly perform conversions within words as well as on whole words. This version should fix the problem. Special thanks to Caroline Hamilton for the heads-up!
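
To illustrate the difference between the two behaviors, here is a minimal Python sketch. It is not MEH's code (MEH is a .NET application); the apply_conversions function and the sample conversion pair are hypothetical and only show how a whole-word match differs from a match inside words.

```python
import re

def apply_conversions(text, conversions, whole_words_only=True):
    """Replace each key in `conversions` with its value.

    Hypothetical helper for illustration; not MEH's actual engine.
    """
    for old, new in conversions.items():
        pattern = re.escape(old)
        if whole_words_only:
            # \b anchors the match at word boundaries, so the "art"
            # inside "party" is left alone.
            pattern = r"\b" + pattern + r"\b"
        # A function replacement keeps backslashes in `new` literal.
        text = re.sub(pattern, lambda match, new=new: new, text)
    return text

sample = "the art at the party"
conversions = {"art": "painting"}

whole = apply_conversions(sample, conversions)
within = apply_conversions(sample, conversions, whole_words_only=False)

print(whole)   # the painting at the party
print(within)  # the painting at the ppaintingy
```

The second result shows the kind of damage an unintended within-word conversion does to the text.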

2017-05-04 – Uploaded version 1.4.15 and portable version. This latest version is built under .NET framework 4.6.1. In preliminary tests, this new version is slightly more efficient, but there shouldn’t be any obvious differences as a function of the framework. This was done primarily to keep everything relatively up to date with newer systems. Additionally, it is now possible to analyze texts for up to 10-grams. Keep in mind that this is only feasible on systems with a lot of RAM. If analyzing texts for a large “n” in your n-grams, you’ll want to make sure that you take full advantage of the “Prune Low Baserate N-Grams” feature of the software.

2016-07-19 – Uploaded version 1.4.14 and portable version. Removed the “e-mail when completed” feature, as nobody ever used it. Added the ability to choose how many processors to use for frequency list creation.

2016-04-27 – Uploaded 1.4.13 and portable version. Added the “.dic” extension to dictionary files instead of “.txt”. This makes the dictionary files compatible with LIWC2015 “out of the box”. Added the word “kinnen” to the default German stoplist (thanks to Thomas Zapf-Schramm). Removed an old debug conditional that was accidentally retained from a previous version. The bug did not affect output, but would raise “True” message boxes when the word “grandma” was included in the dictionary word list and certain other conditions were met. Random, I know — it’s been long enough to where I don’t even remember why that was the word used for debugging. Also, updated the splash screen to reflect the correct copyright dates (2016 is now included).

2016-04-26 – Uploaded 1.4.12 and portable version. Implemented a new system to output the final datasets for MEH (e.g., binary, verbose, etc.). This method keeps file streams open while rescanning for frequencies. This method implements a far more efficient use of the hard drive and resources. This stage of the analysis should now be much, much faster, particularly for massive datasets.

2016-04-25 – Uploaded 1.4.10 and portable version. Testing a new system for the creation of term frequency files that is parallelized. Preliminary tests for this version show that this step can run up to 75% faster than the previous versions. I’m not sure how stable this implementation will be, so please keep me notified of any bugs that you encounter.

2016-04-14 – Uploaded 1.4.08 and portable version. Fixed a bug where DTM and TF-IDF output would be incorrect when using the “European Format” output option. Special thanks to Thomas Zapf-Schramm for bringing this bug to my attention and figuring out the cause.
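
For readers unfamiliar with the distinction, the sketch below shows one common meaning of a "European" CSV: semicolons as field delimiters and commas as decimal marks. This is an assumption about the format, not a description of MEH's implementation; the write_rows function and the sample values are hypothetical.

```python
import csv
import io

def write_rows(rows, european=False):
    """Serialize rows to CSV text in standard or 'European' style.

    Assumes European style means ';' delimiters and ',' decimal marks.
    """
    delimiter = ";" if european else ","
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        cells = []
        for value in row:
            if isinstance(value, float):
                text = f"{value:.4f}"
                if european:
                    # Swap the decimal point for a decimal comma.
                    text = text.replace(".", ",")
                cells.append(text)
            else:
                cells.append(str(value))
        writer.writerow(cells)
    return buffer.getvalue()

rows = [["file", "tfidf"], ["a.txt", 0.125]]
standard = write_rows(rows)
european = write_rows(rows, european=True)
```

If numeric cells were written with decimal commas while the delimiter stayed a comma (or the reverse), the columns would no longer line up when the file is read back, which is the general class of problem a format mismatch produces.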

2016-03-29 – Uploaded 1.4.07 and portable version. Fixed a bug in which the MEH Wizard would not scan subdirectories even when the user had explicitly requested it.

2016-03-01 – Uploaded 1.4.06 and portable version. Fixed a bug in the MEH Wizard that would select “Scan Subfolders” regardless of what the user chose.

2015-12-13 – Uploaded 1.4.05 and portable version. Fixed a bug that prevented the standard frequency outputs (e.g., binary, verbose, dtm) from being properly created under certain conditions. It is not yet certain whether this bug will persist under other conditions, but testing will happen soon to ensure that everything works as expected. Fixed many more small UI bugs. Fixed a bug that caused frequency outputs to run extremely slowly when “skip-gram” network edge detection was being run.

2015-12-12 – Uploaded 1.4.04 and portable version. Fixed a minor UI bug. In the previous version, MEH would not correctly recognize the “On the Fly folder indexing” option and the “Skip files that cannot be read” option. This has now been fixed.

2015-12-08 – Uploaded 1.4.03 and portable version. Fixed some bugs with the interface that would prevent some options from disabling/re-enabling. Updated the Turkish stop list and conversions list, thanks to Elif Ikizer. Completely reworked the semantic network data engine to include only raw word counts, as well as an optional “skip-gram” style method.

2015-11-23 – Uploaded 1.4.01 and portable version. Added some extra information to the MEH Wizard to help users understand what the software does “out of the box”. Added an R script to run Semantic Network Analyses to the Understanding Output page.

2015-11-20 – Uploaded 1.4.00 and portable version. Added the new “MEH Wizard” to make setup and navigation of the software easier for new users. Somewhere in the past couple of months, the website developed an error that disallowed administration, so I had to scrap the website and reinstall.

2015-09-23 – Uploaded 1.3.08 and portable version. Fixed a couple of mistakes in the default conversions for English. Added a blinking indicator to the Analyze button. Rewrote the conversions engine — the system is now up to 5x as fast. Fixed the “Scan Subfolders” option. Fixed an issue where MEH could still attempt to lemmatize text after selecting an unsupported language.

2015-09-21 – Uploaded 1.3.07 and portable version. Fixed an issue where file headers were being detached from the rest of the data.

2015-09-09 – Uploaded 1.3.06 and portable version. Finally solved the 32-bit vs 64-bit problem… hopefully. Special thanks to Kate Blackburn, Sanaz Talaifar, Anastasia Rigney, and Skylar Brannon for their help in getting to the bottom of this.

2015-09-02 – Rebuilt x64 versions of MEH 1.3.05. Hopefully, this will fix the issue that prevents the x64 version from successfully running on x64 PCs.

2015-09-01 – Uploaded version 1.3.05 and variants. Fixed a number of small interface bugs (e.g., inability to disable lemmatization, certain options becoming inaccessible after processing files). Added a feature that allows the user to process their stop list prior to conversions. Added a number of additional catches and fixes for when common memory issues arise.

2015-08-28 – Uploaded version 1.3.04 and portable version. Fixed a bug with the conversion system that would miss multi-word phrases.

2015-08-27 – Uploaded version 1.3.03 and portable version. Fixed an issue with the new frequency list generation code. The previous version would merge the first two lines of output together. Previous versions could also run into memory issues when processing datasets with extremely large numbers of n-grams. On 64-bit machines, this is now limited by how much data MEH can hold in your system’s memory, rather than a hard ~1.5 GB limit. This ~1.5 GB RAM limit is still in place on 32-bit machines, however. Also, added the ability to rely exclusively on the Dictionary List, ignoring any n-grams that aren’t explicitly in this list.

2015-08-26 – Uploaded version 1.3.02 and portable version. Changed how the Frequency List is written to a file. The new way should minimize memory issues, particularly for large datasets and datasets with a lot of unique words. If you’ve experienced errors for no obvious reason while MEH is trying to write your Frequency List, this was likely the cause. Updating to the latest version is recommended.

2015-08-24 – Uploaded version 1.3.01 and portable version. Fixed a bug where MEH was not actually changing the file encoding to what users selected from the dropdown menu. This is now resolved.

2015-08-13 – Uploaded version 1.3.0 and portable version. Complete interface overhaul — MEH is now considerably less cluttered than the previous version. Menus are now used for options, meaning that option settings can be reviewed during processing. Minor bug fixes. Added the “Dictionary Words” box and scrapped the older “Dictionary Mode”. The new interface has been tested pretty thoroughly. However, if you find any bugs or mistakes, please send me an e-mail.

2015-08-12 – Uploaded version 1.2.724 and portable version. Added all possible system encodings — MEH can now process almost any text file with any encoding. Added a splash screen. Thinking about ways to redesign the program layout so that it’s less cluttered and clunky looking.

2015-08-03 – Uploaded version 1.2.723 and portable version. Heavily modified the “create dictionary” engine. This new process is about 25% faster than previous versions. Note that this is still the slowest part of the entire application.

2015-07-21 – Uploaded version 1.2.722 and portable version. Changed data types used by MEH when compiling a combined frequency list. This process is now around 8x faster than previous versions (give or take), particularly when dealing with very large numbers of files in big datasets.

2015-07-20 – Uploaded version 1.2.721 and portable version. Changed word handling from uppercase to lowercase. This provides a small additional gain in performance, particularly when lemmatization is used. Output is now also in lowercase.

2015-07-19 – Uploaded version 1.2.72 and portable version. Lots of new changes and additions. The primary processing engine has had some small changes made to it in how it handles and searches for words as they are recognized. This process runs at essentially the same speed as before for smaller files, but exhibits speed gains of anywhere between about 3% and 15%+ for larger files. Added multiple segmentation options, including the use of Regular Expressions to segment incoming files. This is particularly useful for creating semantic network data, or splitting files by linebreaks. Changed how different pieces of data are handled throughout the process, which should lead to some more efficient file reading and string processing across the board.

2015-06-08 – Uploaded version 1.2.718 and portable version. Added the “Prune Low Baserate N-Grams” feature to the options menu.

2015-06-03 – Uploaded version 1.2.717 & portable version. Multiple users on Windows 8.1 have reported that MEH stopped working on their newer systems. This version is an update that should fix this issue. Please let me know if you continue to have issues with this update.

2015-05-21 – Uploaded version 1.2.716 & portable version. Fixed a bug that prevented aggregated network data from being output if any one of the four output types (binary, verbose, etc.) was not selected.

2015-05-08 – Uploaded version 1.2.715 & portable version. Reworked some of the network data generation algorithms for the file-by-file code. This new approach should use considerably less memory and run much more quickly. Added the word “thing” to the default English stop list. Changed a few option defaults.

2015-05-08 – Uploaded version 1.2.714 & portable version. The only change is that this version is built on .NET 4.5.2. Some users have been having issues with previous builds running on Windows 8 — this update should help this issue. Windows 7 users should ensure that they have this latest .NET version installed.

2015-05-06 – Uploaded version 1.2.713 & portable version. Fixed some minor bugs / typos in the code, but nothing that should result in different output than previous versions. Added the ability to generate individual network data for each file, including when segmentation is used. As before, this is all experimental and needs to be thoroughly checked.

2015-05-02 – Uploaded version 1.2.711 & portable version. Fixed a couple of minor bugs with the edge/node system. Cancellation should now work prior to matrix construction (not during yet — this will be added in the near future).

2015-05-02 – Uploaded version 1.2.710 & portable version. Changed the layout of the options forms. Changed the names of output files to be a bit more brief. Added a whole mess of output options in the form of weighted co-occurrence matrices derived from the standard output. These new output files are intended for use with semantic network analyses and software such as Gephi. These output files and format are experimental right now.
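
As a rough illustration of what a weighted co-occurrence matrix derived from the standard output can look like, the Python sketch below counts, for each pair of terms, the number of documents in which both appear. The weighting MEH actually uses is not documented in this entry, so the document-count weighting, the cooccurrence function, and the sample data are all assumptions.

```python
def cooccurrence(binary_rows):
    """Count, for every pair of terms, the documents containing both.

    `binary_rows` is a list of equal-length 0/1 lists, one per document,
    in the spirit of a binary document-by-term output.
    """
    n_terms = len(binary_rows[0]) if binary_rows else 0
    matrix = [[0] * n_terms for _ in range(n_terms)]
    for row in binary_rows:
        present = [i for i, flag in enumerate(row) if flag]
        for i in present:
            for j in present:
                if i != j:  # leave the diagonal at zero
                    matrix[i][j] += 1
    return matrix

terms = ["family", "home", "work"]
docs = [
    [1, 1, 0],  # document 1 mentions family and home
    [1, 1, 1],  # document 2 mentions all three
    [0, 0, 1],  # document 3 mentions only work
]
weights = cooccurrence(docs)

# Flatten the upper triangle into (source, target, weight) edges.
edges = [
    (terms[i], terms[j], weights[i][j])
    for i in range(len(terms))
    for j in range(i + 1, len(terms))
    if weights[i][j] > 0
]
```

Each tuple in edges is one weighted, undirected link between two terms, which is the general shape of data that network software expects as an edge list.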

2015-04-30 – Uploaded version 1.2.705 & portable version. Fixed a ton of little bugs that have been pestering me for a while. Fixed a bug where the options form would reset if users made changes and then opted to work from a pre-existing TF folder / frequency list. Fixed a potential bug in which the progress bar could cause an error message. Fixed a problem where filenames that began or ended with whitespace would cause an error. Additionally, I have rewritten the frequency table building process. It no longer relies on a binary serializer, which was running into problems with larger files. This new system should also help minimize read/write errors during the term frequency combination process, and it should be more RAM-friendly, as it no longer requires two entire subsets to be loaded into RAM at the same time in order to combine them after the first pass.

2015-04-28 – Uploaded version 1.2.703 & portable version. Added an option that allows the user to specify the number of decimal places they would like for verbose and tf-idf outputs. Smaller values will lose a small amount of precision (e.g., .00054 versus .0005432186); however, the data files will take up considerably less hard drive space.

2015-04-27 – Uploaded version 1.2.702 and the portable variant. First, removed an annoying message that would pop up while building a frequency list for every word — a holdover from the debugging process that was improperly removed. Second, added the option to get tf-idf output directly from MEH rather than having to manually calculate this information.

2015-04-26 – Uploaded version 1.2.7 and the portable variant. Numerous changes have been made. An option has been included to skip over read errors for datasets that might have multiple corrupt files. Several parts of the frequency list generation have been improved to reduce errors. Inverse document frequency (IDF), total number of included observations, and a raw count of observations in which each n-gram occurs are now included in the frequency list output. This new frequency list system will not read older frequency lists unless modified such that the header looks like that generated by version 1.2.7. Please e-mail me if you have questions about reusing an old frequency list with this version of MEH.
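
For reference, the three quantities named above can be sketched in a few lines of Python. The example uses the common definition IDF = log10(N / df), where N is the number of included observations and df is the number of observations containing the n-gram. Which logarithm base or smoothing MEH applies is not stated here, so treat the formula, the document_stats function, and the toy data as assumptions.

```python
import math

def document_stats(documents):
    """Return per-term document counts and IDF, plus the total
    number of documents.

    `documents` is a list of token lists, one per observation.
    """
    n_docs = len(documents)
    doc_counts = {}
    for tokens in documents:
        # set() ensures a term is counted once per document.
        for term in set(tokens):
            doc_counts[term] = doc_counts.get(term, 0) + 1
    stats = {
        term: {"documents": df, "idf": math.log10(n_docs / df)}
        for term, df in doc_counts.items()
    }
    return stats, n_docs

docs = [
    ["good", "day"],
    ["good", "night"],
    ["bad", "day"],
    ["good", "good"],
]
stats, total = document_stats(docs)
```

Here "good" appears in 3 of 4 observations and so receives a lower IDF than "night", which appears in only 1.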

2015-04-22 – Uploaded version 1.2.67 and the portable variant. Made a slight change that will correctly classify the alphabetical portion of numeric ranks as numbers. If you have been having words like “st”, “th”, and “nd” show up in your frequency list, these are likely the result of n-grams in your data like “1st”, “8th”, and “2nd”: the numeric component was cut off and correctly classified, but the remaining part of the word was not. This has been fixed in the current version. Any numeric characters that are followed by letters are now treated entirely as a number.
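
The rule can be pictured with a short Python sketch. This is not MEH's tokenizer: the classify function and the regular expression are hypothetical, and using qqNUMBERqq as the placeholder is an assumption based on the token name that appears in the 1.0.1 entry.

```python
import re

# Assumed placeholder for anything classified as a number.
NUMBER_TOKEN = "qqNUMBERqq"

# One or more digits, optionally followed by letters ("1st", "8th", "42").
NUMERIC_WITH_SUFFIX = re.compile(r"\d+[a-z]*")

def classify(token):
    """Return the number placeholder for numeric tokens, including
    ranks such as "1st"; otherwise return the lowercased token."""
    lowered = token.lower()
    if NUMERIC_WITH_SUFFIX.fullmatch(lowered):
        return NUMBER_TOKEN
    return lowered

tokens = ["1st", "8th", "2nd", "42", "north"]
result = [classify(t) for t in tokens]
```

With this rule, "1st" becomes a single number token instead of a number followed by a stray "st".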

2015-04-08 – Uploaded version 1.2.66 and the portable variant. Special thanks to Elif Ikizer for providing a Turkish conversion list and stop list. I have also decided to stop creating .NET 4.0 builds for Windows XP, as this takes precious additional time and I am not aware of any users who currently run Windows XP.

2015-02-10 – Uploaded version 1.2.65 and its variants. Added a “dictionary mode” that can be used to specify the specific N-grams that you would like MEH to search for in your text files. Moved the location of the “use pre-existing TDF and Frequency List” option to a more appropriate location. Made minor changes to the internal formatting of some of the CSV output; the output should look identical.

2015-02-08 – Uploaded version 1.2.603 and its variants. Rewrote a substantial part of the frequency recombination engine. MEH now uses serialized binaries of objects created from each text file rather than writing to, then reading from and parsing, .txt files during this process. This should help to make this system faster and less processor intensive. Added some more words to the default English conversion list. Added an extra column in the frequency list that shows the frequency of n-grams for included observations in addition to the original overall frequency regardless of inclusion. Fixed the dictionary building system — a bug was introduced in a previous update that prevented the dictionary building system from working. This bug was not discovered until recently.

2015-01-31 – Uploaded version 1.2.602 and its variants. Fixed a bug with the “dynamically adjust values” recombination feature that could cause analyses to halt early.

2015-01-30 – Uploaded version 1.2.601 and its variants. Minor tweaks to the term frequency combination engine to help with overall completion speed.

2015-01-26 – Uploaded version 1.2.6 and its variants. A lot of changes, bugfixes, and new features. Updated the default Conversion list and Stop list for English. Updated default lists for Italian (special thanks to Paola Pasca). Fixed a bug where unchecking the [N-X]-gram feature would not actually disable this feature. Fixed a bug where searching for [N-X]-grams would incorrectly overemphasize values for words appearing at the beginning of files, which could result in erroneous frequency lists. Fixed a couple of small UI bugs that did not impact performance. Rewrote the entire engine for the “Building Combined Frequency List” stage of text analysis. This process should now be much, much faster than before. I also added several controls and features that allow users to tweak how this process works. Updated the citation year (this was also updated for a later upload of version 1.2.501). Other minor changes made to UI layout. Fixed a bug where cancelling during early stages would still result in MEH trying to sort a frequency list, resulting in a delayed cancellation.

2015-01-10 – Uploaded version 1.2.501 and its variants. Added the “e-mail notification” feature to the options available to users. If selected, this option will send you an e-mail when processing is complete, or when an error is encountered. Particularly useful if you want to know when text analysis has finished for a large dataset.

2014-12-30 – Uploaded version 1.2.5 and its variants. Major additions to this version include “big data” options such as subfolder scanning and on-the-fly folder indexing, as well as other options such as [n-x]-grams, the ability to start from previous term-frequency folders to build a new frequency list, and so on. This is a major update.

2014-11-24 – Uploaded version 1.2.42 and its variants. Minor fixes made to UI code, no change in functionality.

2014-11-18 – Uploaded versions 1.2.41, 1.2.41 XP, and 1.2.41 portable. Fixed a bug that would cause MEH to rescan files for frequencies when these options were not explicitly selected, but the “build dictionary” option was selected.

2014-11-16 – Uploaded versions 1.2.4, 1.2.4 XP, and 1.2.4 portable. Fixed a bug that was preventing the “Use Existing Frequency List” feature from working correctly with the updated engine.

2014-11-13 – Uploaded versions 1.2.3, 1.2.3 XP, and 1.2.3 portable. More core changes to the engine. The new engine creates specific “NGram” objects that are handled more efficiently than before. Another extremely important point of speed increase occurs during the “sorting frequency list” and “observing observation percentages” phases of the procedure. Extremely large frequency lists from large datasets could often take many hours, and sometimes more than a day, to completely sort. New sorting procedures were implemented that are thousands of times faster than before. For example, a dataset that took MEH ~36 hours to sort the frequency list before now takes MEH less than a minute to properly sort.

2014-11-07 – Uploaded versions 1.2.0, 1.2.0 XP, and 1.2.0 portable. This release constitutes a major update. A complete list of changes includes:

  • Minor changes made to fix a couple of issues that could lead to imprecision in rare cases
  • A total engine rewrite for 2/3 of the program. Benchmarking suggests that the time to complete a frequency analysis plus a rescan to create datasets is around 4x faster as a result of the engine rewrite. For larger datasets, processing may be approximately 4x to 40+x faster than in the previous version, depending on the size and nature of the dataset.
  • MEH now generates term frequencies for each document and stores them in your output folder — it reuses this information for considerable speed gains. This also allows users to take this data and apply it in any way they see fit.
  • MEH now allows you to choose the text encoding (for both input and output) that you would like to use. Currently, you can select from your system’s default (recommended in most cases), ASCII, and UTF-8. Other common encodings can be added by request.

2014-10-08 – Uploaded versions 1.1.2, 1.1.2 XP, and 1.1.2 portable. Changed the encoding used when reading in files from system default to UTF-8. This will allow users to analyze text files that have a Unicode encoding without problems, even when using an ANSI/ASCII default computer.

2014-09-18 – Uploaded version 1.1.11, 1.1.11 XP, and 1.1.11 portable. Two minor changes. I updated the “tab indexes” so that the “Tab” key navigates the interface more smoothly. I also added a very small number of default conversions that convert standard British spellings to United States spellings (e.g., “behaviour” to “behavior”). Americentric, I know — I am happy to remove these defaults if people disagree with their inclusion. Just let me know.

2014-09-13 – Uploaded version 1.1.1, 1.1.1 XP, and 1.1.1 portable. Made some significant changes in order to continue pushing MEH further in its “big data” capabilities. The “Files to Investigate” box is no longer present, as this feature was generally not useful when scanning huge folders full of .txt files. Small engine changes were made to allow better acquisition of file information as well as the capability to hold much more file information in memory. Before, MEH would choke on large numbers of files ( > 500,000, although exact numbers aren’t known). MEH 1.1.1 has been tested and confirmed working with a folder of 2.5 million files, although the current upper limit has not been tested.

2014-08-10 – Uploaded version 1.1.0, 1.1.0 XP, and 1.1.0 portable. Made changes to the engine that improve speed when rescanning for frequencies. Speed gains may be anywhere from trivial to around 20-25% during the rescanning phase depending on multiple factors such as word counts and the “n” selected for n-grams.

2014-07-05 – Uploaded version 1.0.91, 1.0.91 XP, and 1.0.91 portable. Fixed two small bugs that shouldn’t affect functionality for most users. The first bug caused text analysis to halt early when only one word was extracted from an entire corpus. The second bug involved non-ASCII apostrophes and would allow partial words such as don(‘t) and wasn(‘t) to persist through analyses.

2014-06-17 – Uploaded version 1.0.9, 1.0.9 XP, and 1.0.9 portable. This new version is much lighter on its feet, and no longer loads all text into memory prior to analysis. Rather, 1.0.9 reads text in on an “as needed” basis. This not only makes MEH far less greedy with system resources, but also allows for the analysis of extremely large corpora. Essentially, the size of the corpus that can be processed by MEH is no longer limited by whether you can fit the entire thing into RAM all at once.

2014-06-16 – Added a link to tutorial slides on this page.

2014-05-28 – Uploaded version 1.0.81, 1.0.81 XP, and 1.0.81 portable. Fixed a minor bug introduced in the “Output Options” button. This would prevent full output from being generated if you had previously decided to only generate the frequency list.

2014-05-28 – Uploaded version 1.0.8, 1.0.8 XP, and 1.0.8 portable. Added “Output Options” button. This allows you to specify which types of output you would like; this helps to avoid unnecessary clutter and save some disk space when you don’t want all three primary types of output (Binary, Verbose, and DTM).

2014-05-27 – Uploaded version 1.0.7, 1.0.7 XP, and 1.0.7 portable. Added the “document term matrix” output to the default MEH output. This expands MEH’s usefulness to analyses such as LDA.

2014-03-31 – Uploaded version 1.0.6 and 1.0.6 XP (not a portable version yet, but that will come soon). Increased the amount of text that can be placed in the “Conversion” and “Stop List” boxes. This allows for much, much larger lists that may come in useful if you have a massive list of conversions that you want to use (for example, if you want to perform lemmatization manually for an unsupported language).

2014-02-26 – Uploaded version 1.0.55 (and XP and portable versions). Updated the default stop list for Spanish words. Special thanks to Nairán Ramírez-Esparza for this stop list!

2014-02-25 – Uploaded version 1.0.54. Minor update to the “European Output” option. This fix should now generate output that is in full compliance with this format. Please let me know if this is not the case.

2014-01-12 – Uploaded version 1.0.53. Minor updates to default conversion list. Fixed a small bug that could cause 2+-grams to halt while rescanning for frequencies. Added a Windows XP version.

2013-11-17 – Uploaded version 1.0.52. Minor updates to English stop list. No change in functionality.

2013-11-14 – Uploaded version 1.0.51. Added a word to the default English Conversions list. Changed tab index order for better navigation.

2013-11-13 – Uploaded version 1.0.5. Frequency list now contains metadata for convenience, reuse, and reporting. Added the “Use Existing Frequency List” option, allowing users to reuse a previously-generated frequency list, thus bypassing this stage of analysis if desired. Greatly improved sorting algorithms for the “Sort Freq / Dictionary Output” option — this saves a very large amount of time and processing power, especially for larger projects. Improved algorithm for dictionary construction, which shaves some extra processing time off of this procedure.

2013-10-24 – Uploaded version 1.0.2. A lot of major changes. Output is now printed in an iterative fashion to conserve system resources. Minimum observation percentage is now used rather than “Scan for X number of N-Grams” for better precision and less work for the user. Other small changes implemented for resource conservation and simplicity, some text changed for better usability. Removed some words from the default English stop words list that would usually be of interest to researchers.

2013-10-16 – Uploaded version 1.0.1. Fixed an issue where qqNUMBERqq was being ignored as a stop list entry when lemmatization was disabled.

2013-10-15 – Uploaded version 1.0.0. Updated MEH to .NET version 4.5. MEH has access to a greater amount of RAM, allowing for the processing of much larger quantities of text.

2013-10-12 – Uploaded version 0.9.987. Fixed a bug that would overload the building of custom dictionary files during reconstrual. Updating to this latest version is recommended.

2013-10-12 – Uploaded version 0.9.986. Additional default English conversions added, most of which are common misspellings.

2013-10-11 – Uploaded version 0.9.985. Custom dictionary file is now reconstrued to alphabetical order for easier reading / editing. Added more default conversions for the English language.

2013-10-11 – Uploaded version 0.9.98. Built in a safeguard that prevents users from “overscanning” text when building a custom dictionary file, that is, trying to build a dictionary file from more N-grams than actually exist in the text being processed.

2013-10-07 – Uploaded version 0.9.97. This version is able to build a custom dictionary file (compatible with RIOT Scan), derived from the text being processed. This feature is useful for combining word categories and applying themes to new samples of text.

2013-10-01 – Uploaded version 0.9.96. Slight change in interface updating and a small efficiency tweak. No changes in functionality whatsoever.

2013-09-27 – Uploaded version 0.9.95. Major overhaul of the conversion system. It is now much more flexible and should work for multiple languages. Implemented a feature to create output in either standard or European .csv format.

2013-09-27 – Uploaded version 0.9.9. Added a few more stop lists for other languages.

2013-09-26 – Uploaded version 0.9.8. Implemented support for multiple language lemmatization. Added default stop word lists for all corresponding languages (lists derived from ranks.nl and other places). Languages that require Unicode encoding (such as Russian and Bulgarian) need to be thoroughly tested by someone who is fluent in these languages. I am hoping to eventually add better stop word lists for all supported languages.

2013-09-18 – Uploaded version 0.9.7. Implemented N-gram features across the board. Moderate interface redesign for better tracking / usability.

2013-09-18 – Uploaded version 0.9.6. Fixed a small bug that could, in rare circumstances, cause an overly conservative word search.

2013-09-17 – Uploaded version 0.9.5. Files on the desktop are now much more easily accessible.

2013-09-13 – Uploaded the first version of this page, uploaded MEH 0.9.4. Special thanks to Dr. Cindy Chung for her guidance in performing the Meaning Extraction Method.