Megam-Teaching
Category
Other
Program On
Teaching
Version
0.92
Author / Distributor
Please see http://users.umiacs.umd.edu/~hal/megam/
Description
From http://users.umiacs.umd.edu/~hal/megam/: "MEGA Model Optimization Package: Maximum entropy models are very popular, especially in natural language processing. The software here is an implementation of maximum likelihood and maximum a posterior optimization of the parameters of these models."
Running Program
Also refer to Running Jobs on the teaching cluster
- Version 0.92, compiled with OCaml 4.07, installed in /usr/local/apps/gb/megam/0.92-foss-2018a
To use this version of megam, please first load the module with
ml megam/0.92-foss-2018a
This module will automatically load the foss/2018a toolchain.
Sample job submission script (sub.sh) to run megam:
#!/bin/bash
#SBATCH --job-name=megamjob
#SBATCH --partition=batch
#SBATCH --mail-type=ALL
#SBATCH --mail-user=username@uga.edu
#SBATCH --ntasks=1
#SBATCH --mem=1gb
#SBATCH --time=08:00:00
#SBATCH --output=megamjob.%j.out
#SBATCH --error=megamjob.%j.err
cd $SLURM_SUBMIT_DIR
ml megam/0.92-foss-2018a
megam [options] <model-type> <input-file>
where [options], <model-type> and <input-file> need to be replaced by the options, the model type, and the input file that you want to use.
In a real submission script, all the values shown above, such as the job name, email address, memory, walltime, and the megam command line, need to be reviewed and, where necessary, replaced with appropriate values.
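To make the placeholders concrete, the sketch below builds a tiny training file in what the MEGAM help above calls the default bernoulli format (each line is a class label followed by the names of the active features) and shows where the megam command would go. The file name, feature names, and labels here are illustrative examples, not part of the official documentation; check the MEGAM page linked above for the authoritative input format.

```shell
# Toy training file in MEGAM's default bernoulli binary format:
# one example per line, class label (1 or 0) first, then the names
# of the features that are active for that example.
# All file and feature names below are made up for illustration.
cat > toy.input <<'EOF'
1 contains_offer contains_click
1 contains_offer all_caps
0 contains_meeting contains_agenda
0 contains_report
EOF

wc -l toy.input

# With the module loaded, this file could then be trained on with
# a command along the lines of:
#   megam binary toy.input > toy.weights
```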
Sample job submission command:
sbatch sub.sh
Documentation
Please see http://users.umiacs.umd.edu/~hal/megam/
ml megam/0.92-foss-2018a
megam -h
usage: megam [options] <model-type> <input-file>
[options] are any of:
-filespec treat <input-file> as a filespec, not a normal
input file
-fvals          data is not in bernoulli format (i.e. feature
                values appear next to their features in the file)
-explicit       this specifies that each class gets its own feature
                vector explicitly, rather than using the same one
independently of class
(only valid for multiclass problems)
-quiet don't generate per-iteration output
-maxi <int> specify the maximum number of iterations
(default: 100 for maxent, 20 for perceptron)
-dpp <float> specify the minimum change in perplexity
(default: -99999)
-memory <int> specify the memory size for LM-BFGS (multiclass only)
(default: 5)
-lambda <float> specify the precision of the Gaussian prior for maxent;
or the value for C for passive-aggressive algorithms
(default: 1)
-tune tune lambda using repeated optimizations (starts with
specified -lambda value and drops by half each time
until optimal dev error rate is achieved)
-sprog <prog> for density estimation problems, specify the program
that will generate samples for us (see also -sfile)
-sfile <files>  for density estimation problems, instead of -sprog, just read from a
(set of) file(s); specify as file1:file2:...:fileN
-sargs <string> set the arguments for -sprog; default ""
-sspec <i,i,i> set the <burn-in time, number of samples, sample space>
parameters; default: 1000,500,50
-sforcef <file> include features listed in <file> in feature vectors
(even if they don't exist in the training data)
-predict <file> load parameters from <file> and do prediction
(will not optimize a model)
-mean <file> the Gaussian prior typically assumes mu=0 for all features;
you can instead list means in <file> in the same format
as is output by this program (baseline adaptation)
-init <file> initialized weights as in <file>
-minfc <int> remove all features with frequency <= <int>
-bnsfs <int> keep only <int> features by the BNS selection criteria
-abffs <int> use approximate Bayes factor feature selection; add features
in batches of (at most) <int> size
-curve <spec> produce a learning curve, where spec = "min,step"
and we start with min examples and increase (multiply!)
by step each time; eg: -curve 2,2
-nobias do not use the bias features
-repeat <int> repeat optimization <int> times (sometimes useful because
bfgs thinks it converges before it actually does)
-lastweight if there is DEV data, we will by default output the best
weight vector; use -lastweight to get the last one
-multilabel for multiclass problems, optimize a weighted multiclass
problem; labels should be of the form "c1:c2:c3:...:cN"
where there are N classes and ci is the cost for
predicting class i... if ci is 'x' then it forbids this
class from being correct (even during test)
-bitvec optimize bit vectors; implies multilabel input format,
but costs must be 0, 1 or 'x'
-nc use named classes (rather than numbered) for multi{class,tron};
incompatible with -multilabel
-pa perform passive-aggressive updates (should be used only
with perceptron or multitron inference; also see -lambda)
-kernel <spec> perform kernel mapping; <spec> should be one of:
'#:linear', '#:poly:#', or '#:rbf:#'; the first # should
be the desired dimensionality; the # for poly
is the degree and the # for rbf is the width of the
Gaussian. Any of these options can be followed by
':dist' (to select by distance) or ':class' (to select
by class)
-norm[1|2] l1 (or l2) normalization on instances
<model-type> is one of:
binary this is a binary classification problem; classes
are determined at a threshold of 0.5 (anything
less is negative class, anything greater is positive)
perceptron binary classification with averaged perceptron
multitron multiclass classification with averaged perceptron
binomial this is a binomial problem; all values should be
in the range [0,1]
multiclass this is a multiclass problem; classes should be
numbered [0, 1, 2, ...]; anything < 0 is mapped
to class 0
density this is a density estimation problem and thus the
partition function must be calculated through samples
(must use -sprog or -sfile arguments, above)
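As a sketch of the multiclass input format described above, the example below writes a small file with classes numbered 0, 1 and 2, including a DEV section; per the -lastweight description above, MEGAM uses DEV data to select the best weight vector. The use of a bare DEV line as a section separator, and all file and feature names here, are assumptions for illustration; verify against the MEGAM documentation linked above.

```shell
# Toy multiclass input: classes are numbered 0, 1, 2 and each line
# lists the active features of one example.  A line containing only
# DEV is assumed to separate training data from development data.
# All file and feature names below are made up for illustration.
cat > toy-multi.input <<'EOF'
0 short lowercase
1 long uppercase
2 medium mixedcase
DEV
0 short mixedcase
1 long lowercase
EOF

# With the module loaded, training and prediction could then look
# roughly like:
#   megam multiclass toy-multi.input > toy-multi.weights
#   megam -predict toy-multi.weights multiclass toy-multi.input
```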
Installation
Version 0.92, downloaded from http://hal3.name/megam/megam_src.tgz
System
64-bit Linux